Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Database Design Option 3

Status
Not open for further replies.

MasterRacker

New member
Oct 13, 1999
3,343
US
My formal education in database design is limited so I'm looking for insight to how a pro would approach this. Say I have a table of "items" where I'm storing a qty and a unit of measure for what the quantity represents (ie. lb, oz, quart, etc.). I want the units to come from a list of allowable values so I have them in another table - I've shown the architecture below. My question is: Assuming the codes are going to be unique values, what are the tradeoffs between simply using the code as the foreign key (choice A) and using an ID code (choice B)?

In commercial products I support I've seen this handled both ways. One thing I can think of is if the Items table is large and the unit codes lengthy, storing an ID might save a little space. I certainly don't know the ins and outs of how MS SQL Server does indexing - is there a possibility that one gives better performance than the other?

As a side question, both structures would be considered 3NF would they not?

Code:
Items
-----------
ItemID    PK
ItemDesc
ItemQty
ItemUnits FK

Units(A)         Units(B)
-----------      -------------
UnitCode  PK     UnitID   PK
                 UnitCode

_____
Jeff
[small][purple]It's never too early to begin preparing for [/purple]International Talk Like a Pirate Day
"The software I buy sucks, The software I write sucks. It's time to give up and have a beer..." - Me[/small]
 
aaarrr, matey, ye be askin' several questions there

the biggest benefit to using the unitcode itself as the FK is that it's immediately obvious what the unitcode is

you gave these examples yourself -- lb, oz, quart

when you retrieve data from the item table, you don't have to do a JOIN to find out what the unitcode is, it's right there

avast, a bleedin' no-brainer, and aye, ye be wastin' a few bytes per row, but i hear disk space is cheap nowadays (capn's first PC had a humungous 1-gig hard drive)

regarding the performance, only a benchmark can say for sure, but my money's on the query without the JOIN

yes, both are 3NF

r937.com | rudy.ca
 
You've hit upon one of the perennial arguments between DB pros - should the primary key be something that means something to the application (that could possibly change) or should it be an arbitrary value that means that the value the users see could change quite easily.

Both arguments have their pros and cons, and you will find details on both approaches on TT and across the rest of the web.

Both of your potential solutions for units are normalised to 3NF however, in that neither duplicate data.

Whether its relevant to your app - I've also seen the app have a "Base Unit" defined in the system configuration, and in the units table two extra columns - one for the rate to multiply or divide by to convert to the base unit, and another to indicate whether to multiply or divide (of course, the base unit itself it doesn't matter, and just has a value of 1). So, you define a metric ton as 1000 * kilograms or a gram as kilo / 1000 etc.

This gives you an easy way of creating an SP to provide a base conversion for descriptive reporting purposes.

John
 
Cool, someone actually paid attention to my silly signature! :)

I thought about the fact I could query the Items table on unit directly with no join just after I posted.

You're right about space, even in table with millions of rows, it only adds up to a few megabytes here and there, which is trivial by today's standards. I still think that way because I'm a dinosaur - I go back to the days before hard disks... Arrrrgh!

[pirate]

_____
Jeff
[small][purple]It's never too early to begin preparing for [/purple]International Talk Like a Pirate Day
"The software I buy sucks, The software I write sucks. It's time to give up and have a beer..." - Me[/small]
 
John, That base unit idea is interesting - I may be have a use for that concept.

_____
Jeff
[small][purple]It's never too early to begin preparing for [/purple]International Talk Like a Pirate Day
"The software I buy sucks, The software I write sucks. It's time to give up and have a beer..." - Me[/small]
 
... or should it be an arbitrary value that means that the value the users see could change quite easily.
if you're speaking in the abstract, then the answer to this apparent conundrum is ON UPDATE CASCADE, end of story


if you're talking about specifics, then sure, let's be practical and not use a natural key that is going to have a lot of churn

however, that is not the case in this situation

pounds are always going to remain pounds, ounces are always going to remain ounces, quarts are always going to remain quarts, and none of them is suddenly going to change to floozles

oh, sure, an individual item might take on a new unit of measure -- butter may now come in grams instead of pounds, but that's a change to the item, not the unit of measure, and the actual unit of measure, pounds, hasn't changed and will never change

reluctance to use natural keys should never override common sense

:)



r937.com | rudy.ca
 
My pleasure Jeff - it saved a previous employer in writing a lot of very complex report code.

Have a system parameters table (or wherever you store the defaults for the app as a whole) and store the code for the base unit (as in your units table), or else you need complex validation code on the units admin screen to ensure that only one row at a time can have a default unit flag set.

As I recall (its a while since I worked there) the logic is something like (I know this doesn't work properly, written off the top of my head without proper testing):

Code:
CREATE FUNCTION CalculateRate
  @Amount DECIMAL,
  @InUnit CHAR (4),
  @BaseUnit CHAR (4) RETURNS DECIMAL
 AS
Declare @Result Decimal
Declare @operand char (1)
declare @factor decimal

 IF @InUnit = @BaseUnit Then
   set @Result = @Amount -- Supplied base so don't bother doing the rest for speed
 Else
 Begin
   -- retrieve multiply (M) or divide (D) and conversion factor
   select @operand = operand, @factor = factor from units where
   unitcode = @InUnit
   -- based on multiply/divide and conversion factor, calc result
   if @operand = 'M' 
     set @result = @amount * @factor
   else if @operand = 'D'
     set @result = @amount / @factor
   else
     set @result = NULL -- error trap for invalid data entry
 end
 return @result -- and return it.

In a system where you can define a factor as part of a third factor, you need another layer to check the main layer in case the second layer is not the base unit.

John
 
In a system where you can define a factor as part of a third factor, you need another layer to check the main layer in case the second layer is not the base unit.

Can I have that as my sig? ;-)

Christiaan Baes
Belgium

"My old site" - Me
 
Interesting cooincidence - my original question comes from my side job, but in my main job just ran across something else to watch out for if using a natural key:

In an HR product we have a table of statuses: "Active", "On Leave", "Terminated". The actual code is stored in the employment records. At some point during implementation our HR dept changed the middle code from "OnLeave" to "On Leave". The app performs cascade updates so there is no orphan data out there however I just discovered a different problem: there are stored procedures and canned reports that query against that code looking for the value "OnLeave". Oops!

_____
Jeff
[small][purple]It's never too early to begin preparing for [/purple]International Talk Like a Pirate Day
"The software I buy sucks, The software I write sucks. It's time to give up and have a beer..." - Me[/small]
 
And on the 8th day, Bob created input parameters. And, lo, Bob saw that they were good.

;-)

< M!ke >
Your right to an opinion does not obligate me to take you seriously.
- Winston Churchill
 
on the first day, god create foreign keys, so that checking the text of a code would not be necessary

and she saw that it was good

and she forgives those unenlightened who hardcode table values

although the DBA might not

:)

r937.com | rudy.ca
 
Another consideration in all this is if you need to see what the value was historically. For instance people will often create a table with the pertinent data about a part and then let the part_number be the link to the other tables. But if you need to see how much you paid for that part two years ago, this is not the best solution.

Sometimes you need the lookup table to use for limiting the data entry and the data stored in the record so the values at the time the records were entered show the correct information for reports.

In the HR example above, suppose they decided they needed Terminated (for cause) and Terminated (Layoff) instead of terminated. Then if they changed the Terminated to Terminated (Layoff) and added the second, everyone who was terminated in the past would be marked as having been terminated for cause thanks to that cascading update.

There are situations where it is best to cascade update and those where it is not and situations where the join to the lookup table is what you need and those where it will give exactly the wrong information.

So really it is not just a performance or a normalization issue. It really has alot to do with how you need to access the data in the future and exactly what historical information you need to have when items in the lookup table change.

"NOTHING is more important in a database than integrity." ESquared
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top