Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

How to store this kind of data 3

Status
Not open for further replies.

lilboi

Programmer
Dec 22, 2003
146
0
0
CA
Hey guys!

I have a question. I want to make a website whereby the each users can choose from 1 or more different kinds of interests.

Now, I'm gonna have a PROFILE table and a INTERESTS tables.

Should I have a field in PROFILE that says "interests" and store 1 or more INTERESTS ids at delimited by a comma?

Or should I have a table MYINTERESTS and have
PROFILE-ID and INTEREST-ID in it? One row for each interest the user selected.

I'm thinking, if I have 1 million users(wishful thinking) with each having 4 interests, then MYINTERESTS will have 4 million records. That's gonna be so big! Whereas if I store the INTEREST-IDs in PROFILE under "interests" field (example: 3, 4, 8, 7) then that'll save space.

What do you guys think?

Thanks a million x 4!
 
saving space is an admirable objective, but not at the cost of performance

storing a comma-delimited string means that searches for specific interests require a table scan

go with the many-to-many table, it will also simplify the sql used for many queries

r937.com | rudy.ca
 
Holy! That's crazy. haha I'm gonna have to stare at that for an hour to sink it in.

Anyways, i'm in a ditch on this one.

I need to design a database that will hold all the different magazines, local publications, in each city of every country.

Seeing how that's gonna be that big, it's probably not advisable to have it all in 1 table.
The thing is, these magazines or local publications will have categories and subcategories.

There are 2 different aspects as well. Some magazine will be 'Area Specific' while others will not be.

I wanna separate these two.

I don't know how to design area specific database that will be organized in sub categories and be searchable by area.
(example: Magazine > Year 2007 > Furniture > New York City > Ikea) (okay fine, ikea shouldn't be there, but u get what i mean. hahaha

I'm thinking i will atleast need table for: Continent, Country, Province/State, City, Magazine Year, Magazine Types. Then i'm lost.

Should I have a table for each country and hold every magazine in all it's cities in there?
 
Actually, the thing is, it doesn't just stop at magazines. It will also be for other products. Like videos. It's also area specific because of languages and stuff....
 
Denis, I'm sorry to say this, but using a bitmap field is still use of multivalued columns, even if the operations to split those columns out are easier than text. For some people such as yourself, implementing this in the database is reasonable. But I don't think it's the best recommendation to make to people who aren't seasoned SQL masters.

lilboi,

Don't worry about the 4 million records--don't think of number of rows as directly correlating to final size or performance. Do keep the row size as small as possible, and with a proper primary key and indexes your performance should be fine, even with millions and millions of rows. Do NOT use multivalued columns. You will regret it later. It is abominably hard to query against multivalued columns. And slow. The only time a multivalued column is in order is when it is in fact a single value and the database never needs to know any of the pieces apart from the whole or query them individually. And that is almost never.

Seeing how that's gonna be that big, it's probably not advisable to have it all in 1 table.
Where are you getting this idea that it's not advisable? Can you give one good reason why it is NOT advisable to have it all in one table? Ignore the physical implementation for now. Design your database structure RIGHT.

Then, if performance is not good enough, you can row-partition your table into as many chunks as you need to. You can use multiple files or filegroups if you want. But do not create a separate table for each country unless you want to create a separate application for each country.


[COLOR=black #d0d0d0]My alarm clock causes time travel. When I hit snooze, no apparent time passes before the alarm sounds again, but the universe has in fact moved forward through time by ten minutes![/color]
 
thanks r937 and Esquared.

The reason why i'm considering of placing them in separate countries is to keep the size smaller so searches will be faster and that I will be indexing it by city or state.

If i design it like this:

t_MagazineDetail
--------------
MagazineID
Name
Publisher
NumOfPages
YearID
TypeID
CityID
ProvinceID
(and 10 more fields for magazine details)

And considering it's under this category "Magazine > Year 2007 > Furniture > New York City > Ikea" Where there's year, type of magazine, publisher in city(NYC)

which means i have, t_Year, t_Type, t_City, t_Province

So when I do a list search search, I show option of
Magazine > Choose Year
Then
Magazine > 2007 > Choose Type
Then
Magazine > 2007 > Furniture > Choose City or Choose Province
Then
Magazine > Year 2007 > Furniture > New York City

From there, I do a select * FROM t_MagazineDetail WHERE YeardID = 2007 AND TypeID = 123 AND CityID = 456
to get the magazines.

Considering what you said, to have less fields per row can I split it to this

t_MagazineDetail
--------------
MagazineID
Name
Publisher
NumOfPages
(and 10 or more fields for magazine details)

t_MagazineSearch
----------------
MagazineID (primary key)
YearID (indexed)
TypeID (indexed)
CityID (indexed)
ProvinceID (indexed)

Then do select MagazineID, name,... FROM t_MagazineSearch s, t_MagazineDetail d WHERE s.YeardID = 2007 AND s.TypeID = 123 AND s.CityID = 456 AND s.MagazineID = d.MagazineID

Will this be better?

My main concern is, this will be global. I'm only using magazine as an example but this number can be really big table if i try and fit all magazines in the world into 1 table.


Thanks!!!
 
Actually, Denis, I appreciate the Bitwise recommendation as well. Even tho i'm not that good at SQL, atleast I'm now aware of it and may employ it once I become better at it.
 
Yes that's better for performance (more rows per page) and is more normalized also, a huge improvement.

The reason why i'm considering of placing them in separate countries is to keep the size smaller so searches will be faster and that I will be indexing it by city or state.
You can get all the benefits of putting things in different tables by using partitioned views. Although I am not sure they're necessary if you have properly designed the indexes.

Any benefit you might gain from using separate tables, you can get from a partitioned view without the drawbacks of your application being designed around having to select from different tables. And you might be surprised at the performance even of one very large table. Keep your datatypes as small as possible for the largest table.

[COLOR=black #e0e0e0]For SQL and technical ideas, visit my blog, Squared Thoughts.[/color]
 
Hi ESquared,

So just to confirm, the second way of doing it, which is adding a Search table is a better design? It is OK to Index each ID in it as well?

can you elaborate what partitioned view is?

Thanks!
 
Yes, use a Search table. You can index each ID, but the optimizer may have trouble selecting the right one(s) to use. There is an index-tuning wizard which could help.

If Publisher is the same for a MagazineID no matter where it is distributed, and MagazineID can be in the Search table more than once, then Publisher should NOT be in the Magazine Search table--that is denormalized.

[COLOR=black #e0e0e0]For SQL and technical ideas, visit my blog, Squared Thoughts.[/color]
 
r937 , thanks so much for that link! it really gave me a better idea on how i should model my db

esquared, i don't think i listed publications in the search. But thanks for pointing out normalization to me. Totally forgot about the importance of that.


MUCHOS GRACIAS!!
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top