Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

One table with many columns or many tables with a few?

Status
Not open for further replies.

galorin

MIS
Nov 22, 2007
154
GB
I have a table with 48 columns in it. Most of the data is specific to each primary key, but there are several binary fields that could be pulled out and put into their own table, containing just the PK from the table where a specific binary field is true.

With a small dataset (3000 records, 1.5MB data length) with expected growth of ~1000 entries per year, does it matter?
 
if i said yes, it matters, would you change it?

why are you asking this question, anyway? if everything is working fine, what's the problem?

r937.com | rudy.ca
 
Partly for my own knowledge. Also I'd rather do the optimisation now, before going into full production, rather than later and disrupt workflow if not having the optimisation proves a real detriment to productivity.

What sort of impact would it have, considering that there are 5 of these binary-style fields, each independent of the other 4?
 
binary fields are often used for attributes such as isFeatured or isClosed or isMarried

by splitting the "1s" off into their own table, and leaving the "0s" out, you'd have to have one table for each such binary field

then in order to "reconstruct" the main row with all of its attributes, you need that many LEFT OUTER JOINs

excessive overhead, in my view

:)

r937.com | rudy.ca
 
I used to have the same questions. Then one day I was called upon to create a small app in access. It was a text-book type app that kept track of donations (orders), with forms for entering and reports to notify it's time to ask for more donations.

My point is, after 4 re-writes I had a database where I had just the right number of tables and relationships so that I could create forms and reports in no time, without a single tweak. Sounds minor, but it was the best learning experience for understanding normal forms. That's what you're looking for is a normalized database. Once you have that, any code you write around it will be much easier, no matter how small your app is.

The advantage to using access for the exercise is it forces you to have a solid table design, otherwise it makes your life miserable.

My 2 cents.
 
For a small number of binary fields use an int or bigint datatype and assign 1 bit per binary field. Use the bitwise operators to read/write the appropriate bit for each value. See
___________________________________________________________
If you want the best response to a question, please check out FAQ222-2244 first.
'If we're supposed to work in Hex, why have we only got A fingers?'
Drive a Steam Roller
Steam Engine Prints
 
yes, but doing bitwise selection perforce requires table scan, yes? because the nth bit cannot be indexed

r937.com | rudy.ca
 
On the basis of the data sizes given I didn't see that as a particular problem, but I could be wrong.

___________________________________________________________
If you want the best response to a question, please check out FAQ222-2244 first.
'If we're supposed to work in Hex, why have we only got A fingers?'
Drive a Steam Roller
Steam Engine Prints
 
I rekon I'll tweak things slightly, but leave things as-is. Thanks for all yall's input
 
With a small dataset (3000 records, 1.5MB data length) with expected growth of ~1000 entries per year, does it matter?

Let's see. That is 0.5kB or 512 Bytes per record. That is not that much. Or do you mean that each row has 1.5MB?

And I assume that by "binary fields" you mean BLOBs (Binary Large OBjects). Big fields.

Anyhow, increasing complexity out of fear will bring you nowhere. If there are problems, there are a lot of things you can do. Most of them do not involve the table structure. Why? because if anything is slow, it is probably searching through these fields or the network traffic that is generated by getting that data to your application.

Searching though the fields can be targeted with proper indexing or even extra checksum fields if that is necessary. You can imagine that it does not matter in how many tables you must search, except that more tables add more complexity.
The network load can be reduced by caching or compression (if the database is far away, not very helpful on a local network). If your database is running on the same machine as the application, a socket is probably used anyway (lookup the difference between 127.0.0.1 and localhost in the MySQL manual).

In short, premature optimization will usually bite you, especially if you don't know what you are optimizing for.


+++ Despite being wrong in every important aspect, that is a very good analogy +++
Hex (in Darwin's Watch)
 
I will say, I've started many projects as relational with multiple tables. By version two of the apps, they have been moved to single table (where applicable).

For example...An emergency room tracking system.
Main table contains patients and their info. They key is account number.
A secondary table stores timestamps of what actions occur. Registration, moved from the Waiting room, the doc saw the patient, they're admitted, they're discharged, etc.

I found it much easier to move the second table data to the main table. It eliminates the join and makes reporting so much simpler. Now instead of querying the second table twice for the length of stay (discharge minus registration), I only have to query the main table once (pat_stat_8 - pat_stat_1).

On the other hand. I started a Helpdesk application several years ago with a single table which only allowed a finite number of people you could assign a ticket to (6). I thought that was enough. When I wrote version two, I created an assignment table which allowed for as many assignments as you wanted.

Just try to think ahead. What will make sense in the long run. Trust me, it sucks changing that much code for a small database change.

As for the size of the tables you're using, speed should not be a factor either way. Just try to reduce the headaches on the reporting end.

Mark
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top