Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Advice for multilingual site 3

Status
Not open for further replies.

Sleidia

Technical User
May 4, 2001
1,284
FR

Hi guys,

I'm planning to create a website in 15 different languages.
But I'm hesitating when it comes to the database structure for such a website.

Most of the content should be taken from the database.

But the question is : should I separate the languages by column, by table or by database?

Until now, for each table, I had every column duplicated for every language such like : title_en, title_fr, title_ja, text_en, text_fr, text_ja, etc ...

With this method, I would end up with tables made of more than 30 columns and sometimes 50 columns.

What option would you choose if you were me?

Thanks for the help :)


 
I wouldn't hard-wire the languages into the database structure; it makes it difficult to add new languages, makes queries complicated, and violates relational principles.

In a situation like specifying translations for strings I'd use a translations table, with columns for (1) language code, (2) string code, (3) translation. For example:
[tt]
EN HELLO hello
FR HELLO bonjour
EN GOODBYE goodbye
FR GOODBYE au revoir
EN PAGE_TITLE welcome to our website
FR PAGE_TITLE bienvenu à notre site web
[/tt]
 

Thanks Tony,

I agree that it's not good if you want to preserve the relational principles so much but I don't get why you view your method as simpler.

With your method, queries become more complicate because I'd have to use two tables even for the most basic feature on the website. With my method, I use a single query like this :

Code:
<?php
$sql_query[0] = "
SELECT 
page_ID, page_ID AS current_page_ID, page_parent_ID, page_home_" . pLANG . " AS page_home 
FROM def_pages 
WHERE 
page_home_" . pLANG . " LIKE 'yes'
";
?>

You can see that the variable pLANG does all the work.

Also, in this particular example, the table serves as a way to let the website administrator add pages in the site. There is one page per row. This makes it possible to add a page for all the languages by just adding one row and make sure the page ID remains the same for all the languages with the least amount of programming work.

As for adding new languages, I'm the only one person who is supposed to add new languages. So, I would only have to add new columns in all the tables that need multilingual support. It doesn't sound like a big deal to me.

Seriously, the only thing that bothers me with my method is the amount of columns.

But if you think I've overlooked something or that something would backfire with my method, please let me know.

Thanks again for your time :)





 
Thanks guelphdad.

Taken at :

A standard piece of database design guidance is that the designer should begin by fully normalizing the design, and selectively denormalize only in places where doing so is absolutely necessary to address performance issues.[7] However, some modeling disciplines, such as the dimensional modeling approach to data warehouse design, explicitly recommend non-normalized designs, i.e. designs that in large part do not adhere to 3NF.[8]

Like for every discipline in webdesign (ie: HTML/CSS design), I find it troubling when some people blindly follow a principle regardless of the particularities of a certain project. In my opinion, the best method is the method that is the best for a certain project.

That's why I think that my question was stupid in the first place.

I'm the only one who's supposed to know about the existing code and what I want to achieve.

So, sorry for bothering guys ;)
 
Sleidia, it wasn't a stupid question and you weren't bothering anybody. You asked for our advice and we gave it, which you're free to accept or reject. Any design that works acceptably well for you is perfectly valid, whether it conforms to supposed "best practice" or not. Best of luck with your project.
 
Sleidia

you are misunderstanding what they are suggesting, they aren't saying don't normalize your data at all, they are suggesting that sometimes it isn't necessary to go to 3rd normal form or beyond. they aren't suggesting rejecting database normalization outright which is what you seem to be doing because you have your mind set on using your preconceived structure.

Tony, Sleidia has asked this question on at least two forums that I'm aware of. The suggestion is to understand normalization.

As you know many people get caught up in doing their project without normalization because they don't know what it is or after they learn what it is they'd have to change their table structure or front end application code.

My thought is too many people rush ahead without thought to the consequences going forward.
 
Let me give you a few thoughts to ponder about.

Seeing you use PHP, you can use the gettext library. As far as I know, it does not use a database at all, but just translation files. See for a nice explanation.


If you do want to use the database, think of what your entities are. I had the same problem once, and it led me to creating an "abstract"(*) table called "Phrase" with just an autonumber primary key. Together with a "Language" table, you could then look up the translation from the "Translation" table, which had foreign keys to both Phrase and Language. This way, any table could have a PhraseId for a column that would need translation (such as most lookup tables). Be prepared that this structure will have quite an impact on your database design and makes most records less recognizable by lack of text.

(*) A table with only one autonumber field may look funny, but there is nothing wrong about it. You can Issue queries like:
Code:
INSERT INTO Phrase() VALUES();
to insert a new phrase and use the last_insert_id mechanism to see what the (unique) ID is.

If you start your MySQL connection with a SET command, you can set the language for the whole connection as well:
Code:
SET @LanguageId=5;-- Just an example ID that should exist as a record in the Language Table. This sets the language.
SELECT RoleId, TranslatedText FROM Role INNER JOIN Translation ON Translation.LanguageId=@LanguageId AND Translation.PhraseId=Role.PhraseId;

As you see, The second query now uses the correct language ID but can be the same for all your page visits. As the Phrase entity is abstract, it guarantees that the PhraseIds are unique, but it does not show up in "concrete" queries.


+++ Despite being wrong in every important aspect, that is a very good analogy +++
Hex (in Darwin's Watch)
 
guelphdad,

Isn't it the real goal to have a website that works properly with the least amount of work to be done?

I've been using the current structure for many websites that use up to 3 languages (en,fr,ja).

As I said, the only drawback I see for the new project is the skyrocketting number of columns.

I would have changed the database structure if it worthed the hassle to do so.

Apparently, this isn't the case. There is more pros than cons to leave it the way it is.

Mostly everybody says I should normalize just because "I should' ... which isn't a valid reason for me.

I do things because I have no other choices, not because someone said "I should" without giving proper reasons.




 
Thanks DonQuichote,

Well, I have to use a DB because it makes it easier for clients to edit their content from my custom made CMS.

As for your alternative solution, this woud be difficult to implement because laguage related rows aren't only for used for content. Some are used for configuration settings (my CMS allows different settings per language).

Anyway, at this stage, I don't need alternative solutions, actually.

What I need to know is what's so bad with my DB structure and what are the problems I'm supposed to encounter if I leave it that way.

If the only one problem is to have too many columns per table, then, I can deal with it.
 
These are fairly fundamental database concepts. If you really don't 'get it' you may need to go back to basics. There is a good (very basic) article on database design at which should get you started on the best areas to research.

If you want the best response to a question, please check out FAQ222-2244 first.
'If we're supposed to work in Hex, why have we only got A fingers?'
Drive a Steam Roller
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top