Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Database Normalization 8

Status
Not open for further replies.

TimBiesiek

Programmer
Nov 3, 2004
151
AU
Hi All,

Am currently creating a new DB. At the mo there are about 27 odd tables, not including the archive tables, and log tables we plan to have, so it is quite a large DB.

I want to normalize the tables in this DB, but a workmate is saying we should have a certain field that relates to the owner of the data in every table, and sub table. I don't see this as neccessary, as this would de-normalize the tables, and the owner can be gained by going back through the table joins anyway.

E.g.

Companies Table - This shows info about the companies
Clients table - Clients are 'owned' by companies
Properties table - Properties are supplied by clients
Meters table - Meters belong to properties

Workmate thinks that the company ID should exist in the Clients table, Properties table and Meters table. However, the property the meter belongs to can be found, as can the client the property is supplied by, as can the company the client is 'owned' by...

Thoughts on this? Which would be the best way to go?
 
which way to go? depends on whether you always need to know which company a property belongs to, but not necessarily which client, or which client a meter belongs to, but not necessarily which property

you do not normalize in a vacuum, you normalize/denormalize to do both of the following simultaneously --

1) minimize redundancy
2) maximize performance

r937.com | rudy.ca
 
TimBiesiek,

Theorecicaly you do not NEED the company link in the Property or the Meter table. Because as you mentioned yourself you can run an SQL to link from Meter table back to a Company.

However, as r937 already mentioned, if for example the Meter-Company reference is frequently accessed DIRECTLY, without additional needed client or property data, the redundancy for the repeated companyId cost might be lower than the performance cost.

Additionaly, redundancy requires a higher maintenance cost, but as long as the foreign key references a primary key, you should not worry about it too much.

General conclution:
You shuld only put the such a non-normalized reference in lower level tables if you are sure it will enhance the performance.

Regards,
Johpje
 
Normalizing is all well and good but sometimes you have to purposely brake the rules to speed things up or retrieve data faster without joining 10 tables together. However, if you are planning on building views so the data can be retrieved easily having extra fields may not be necessary. Sometimes you have to give in to certain people's requests even if they dont make sense from a Normalized point of view. Often some departments want some redundancy to separate departments data they perceive a security need or a privacy need like in the case of accounting, or payroll.

If you do not like my post feel free to point out your opinion or my errors.
 
Tim,

Stick with your view (for as long as you can) - it is correct.

The rest of the responses are not necessarily incorrect (in certain circumstances), but why on earth post them when issues that they refer to were not quoted as a problem in your post?

It seems that most people want to 'break' away from correct methodology just for the sake of not conforming.

Why on earth duplicate information, that can be reached via a single correctly normalized table? What about the added 'cost' of ensuring synchronisation and of future maintenance / development? It is not 'normal' to duplicate data, therefore future development may well miss these duplicate fields (if you don't spend extra time on clearly documenting this break from good methodology).

Of course it's 'easier' to put all fields into one table, this dispenses with the need to think about table relationships, queries etc. This is of course pathetic, as is duplicating such a field.

I do not agree with ceh with the reference to 'breaking the rules' to speed up performance. If speed is an issue, then extra resources are the solution - not breaking basic normalization rules.
(That's like saying if a car isn't fast enough - get rid of the weight factor of brakes and seat-belts).

What you 'feel' to be correct IS correct. Don't get too 'stick-in-the-mud', but neither should you dispense with correct 'practise' just to cut corners that cost more in the long-run (even though your superiors can't see that due to ignorance / lack of knowledge).

ATB

Darrylle


Never argue with an idiot, he'll bring you down to his level - then beat you with experience. darrylles@yahoo.co.uk
 
Why on earth duplicate information, that can be reached via a single correctly normalized table?

Because it's easier for me to denormalize and write the necessary protective code than it is for me to teach the muppets in marketing how to join eight tables together when they want an ad-hoc query.

Don't forget that normalizing was invented 30 years ago when disk space was phenomenally expensive and nobody went anywhere near a database without equally expensive training.

In the 70's I'd have normalized an address to the extent of putting "County" and "Town" names in seperate relations and storing TownID and CountyID in the address record. That was worth doing to save 20 bytes per record back then but nobody normalizes to that extent these days.

Geoff Franklin
 
Well guys, all very valid points! Thanks a ton! It's given me a bit of ammunition for normalizing, and also de-normalizing!

For the time being, as we are only in the design stage at the mo, I will propose my method of normalized tables and see what kind of reaction I get. Then if I need to, will back it up with the advantages...

Again, thanks!
 
Al,

We didn't have 'taut-liners' back then either, but it doesn't stop companies from designing product container shapes to be packed more efficiently in these bigger lorries - nowadays.

You talk only about 'storage-space'.
Along with your 'protective code', do you fully document this different path to default methodology?
Is it really worth it?

So, you might have 4 tables with 'Town & County' in there?
Don't you have to 'code' around the fact that those 4 tables need to be updated every time 'Town and County' changes? Don't you have to document this very clearly in 5 different places (more duplication), for any future DB admin who replaces you?
When the application changes, in relation to 'Town & County', you have to REMEMBER where else that change takes place, unless of course you have documented identical information in EVERY place that it will be affected (many-to-many reference for documentation also).
You have so much duplication (and this is ONLY for Town and County info), what about the rest of your database fields?

We ARE talking 3rd normal form here - not 5th. Would you have us have a single table in a database, because that IS where you're taking it.

If 'muppets' have access to information that they can't understand, then they don't get access - simple. They need training - from you perhaps.
If they are not capable of learning where their key data is, then someone in particular should have responsibility for producing those ad-hoc queries/reports (one person per department maybe?).

De-skill your skill like this, one 'normalization form' at a time if you wish. With your logic however; I could argue that you should be creating one table databases - like a lot of managers do - and then they say "database development is easy, what's the big deal?"

If ad-hoc users can't get their heads around the fact that clients are contained within the tblClients table, then they should not be creating ad-hoc reports - they haven't got anywhere near enough knowledge to be given this kind of task/responsibility.
Either take responsibility of ALL report creation, or organise a training structure for key employees per department.

Al, I'm not arguing with you over this, but you are almost sounding like a self-taught 'db admin'. It just does not make sense - business or otherwise.

Regards,

Darrylle
















Never argue with an idiot, he'll bring you down to his level - then beat you with experience. darrylles@yahoo.co.uk
 
Hmmm, I tend to agree with Darrylle.... If people don't know what the hell they are doing, they shouldn't be anywhere near the database!

Having one person per department responsable for ad-hoc queries seems like the way to go. We have a similar situation. We've got one database (MS Access, with Access front-end too - the one I'm currently redesigning in MS SQL and VB.NET!!) that one guy is responsible for all updates, changes, major additions etc, as well as ad-hoc queries.... he's a very busy person...

We then have another DB (MS SQL with various front-end applications!) that was created by my team and I in the last year, and our team is responsable for any changes, modifications and ad-hoc queries. No one else in the organisation has access to this database, except through what we allow them with the applications.

This may be harder to implement in a large organization, but we seem to handle it well, and there are about 40 people that have access to these databases, all needing ad-hoc reports or queries every so often...
 
Tim,

I believe that the larger the organisation, the easier it is to implement.
Centralising control of in-house applications is much more easy to justify to the board, and hence down through the management structure.

Reporting can give whatever 'result' depending on HOW you display it and in which context.
If reporting is produced centrally, then you get a 'standard' reporting style - with 'standard' results which can be measured against each other.

I think that Al was talking more about departments knocking reports together for departmental reference rather than management reporting however, and in this case, it takes very little I.T. resource to produce these simple reports centrally also.

If it's a small company, well then - there's the rub.
You HAVE to convince the owner NOT to de-skill your skill. Normalization IS correct - no matter WHAT technology is available.
If you have light-speed cpu's, light-speed network connections, light-speed memory and light-speed hard-disks - you are still creating inefficient database structures if normalisation is ignored.
Future maintainability then becomes THE killer argument.

WE can't think or type at light-speed, and WE maintain/develop the apps.

ATB

Darrylle







Never argue with an idiot, he'll bring you down to his level - then beat you with experience. darrylles@yahoo.co.uk
 
Future maintainability then becomes THE killer argument.

Search for 'Platypus Database' in the Overcoming Obstacles forum if you want to see how bad things can get when normalization rules are ignored.

The database I'm working with ANYONE who needed a table in the past 25 years has created the one they need. Doesn't matter that the information they are using is 99% already there, they have to create a new table with 99% duplication so the one new piece of information can be used.

IT'S UGLY!! Stop de-normalization now!!

leslie
 
Well, I've just learnt a new word today, Chimeric.... Thanks Lespaul! That definately sounds like one heck of a database!!!!

You don't want to send the database to me, I'll normalize it and normalize it some more, then send it back to you? For a small fee of course... [rofl2]
 
Lespaul said:
Stop de-normalization now!!

I don't think that you have a de-normalized database.... it sounds more like a database that has never actually been designed... I was under the understanding (Shoot me if I'm wrong) that a de-normalized database is a database that has been normalized, then proved to be too slow and has hence been de-normalized to improve performance.

I'm certainly not advocating de-normalizing at all!

PS. writing normalization is quite difficult for me, being a New Zealander, we spell it with an s... normalisation..... Have to get used to putting the z in there! lol!
 
There's actually a follow up thread ("The Platypus Hunter's Revenge" I think) from when I got assigned the project of designing a new table structure.

The lack of normalization in my case is referring to duplication of data. It's really frustrating to get 2/3 done only to be told that the data you are getting isn't from the correct table. Well, hell, if it's in three places how do you know which one is right?! And even if it is in three places shouldn't they all have the same information?

Unfortunately, the "meat" of the court (hearings and trial events) tables and systems are just JACKED!! Nobody actually went to see what happens in the courtroom, they just designed a system! Before I could start determining the best way to structure the tables that are going to hold the information, I kinda wanted to see what was going on. The whole current process wastes time, energy and money. Now that I have a better idea of what's going on, we're going to start on the table structure for that portion of the court.

Since I'm the only person working full time on this project (analyzing and design the tables for every part of the court: hearings, sentencing, compliance, school registration and tracking, probation, bonding, etc. - there's one other person who assists when he's not working on other stuff) I expect it to take a REALLY long time to make significant progress. Yes, I did say that I am the ONLY person working full time on this re-engineering analysis of a Criminal & Civil courthouse with 25 years worth of band-aids applied to it!

However, a bit of sunshine, my boss is leaving for another job!!! Maybe our new manager will actually manage!



Leslie

Anything worth doing is a lot more difficult than it's worth - Unknown Induhvidual
 
Lol!

At least your pretty much guarenteed (Sp?) a job for a while anyway! Job security at it's best!
 
PS. Just hope that your current boss isn't a member of Tek-Tips, and especially this database forum! [wiggle]
 
I'm a state employee, that's job security right there!!

And no, he doesn't participate in the forum, and I have no doubts that he knows exactly what I think!!

Leslie

Anything worth doing is a lot more difficult than it's worth - Unknown Induhvidual
 
alvechurchdata said:
Because it's easier for me to denormalize and write the necessary protective code than it is for me to teach the muppets in marketing how to join eight tables together when they want an ad-hoc query.

Is a key portion to Al's position. Normalized databases arenot optimized for reporting they are optimized for transactional processing. OLAP Databases are designed for reporting the methodology between an OLTP and OLAP designs are opposites of eachother. IN OLAP Denormilization is a common and almost required practice to achieve an efficient star schema that is optimized to report against. Even the index strategies are different.

And when it comes to having a system that is optimized for reporting the size of the enterprise is a large factor. If the resources do not exist to build a reporting system along with the transactional system then I see nothing wrong with building a hybrid that gives better optimizations to the system than a 3NF system would. I have seen some of my clients laugh at the idea that they need to invest in a DW to only turn around and complain that the reporting against the current system is unacceptable.

If you in an orginization were the prospects of implementing a seperate Reporting based solution then you need to design a system that is Optimized for the users and if that means a little denormilization than so be it. So it requires the developers to document a bit more and code in the pieces that make sure all records are updated when one is changed. To what degree this is done needs to be based upon user requirements and system availability.

Shoot Me! Shoot Me NOW!!!
- Daffy Duck
 
Happening across this one late...

Generally speaking, I'd build it fully normalised first time. If, later, I find that reports run unacceptably slowly, I'd consider some de-normalisation. Build it "right" first, then compromise if you have to. If you corrupt the structure from the start, to avoid imagined performance hits, you're making life difficult for yourself in othr ways.

If you are considering this route, consider this - how likely is it for a client to move between companies? That's a big cascade of updates every time you make a change.

If you're worried about muppets' ability to report on a normalised structure, create a view (or a "query" in Access) that hides the table joins and just gives end users what appears to be the giant de-normalised table that they want.

-- Chris Hunt
Webmaster & Tragedian
Extra Connections Ltd
 
Thanks Chris!

However, I fear that these 'Muppets', or Seagull Managers (Fly in, make a lot of noise, crap on everything, then bugger off!) are over-ruling me, and want to denormalise, even after I have given them plenty of articles to read on the subject....

There's just no way I could have got them to listen to me. I think they had already made up their minds before they got me to design the database...
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top