I'm not talking about system keys, which ARE calculated.
What kind of calculation are you going to do on an employee number.
Yes, it stores smaller and sorts faster as a number, but there are often othre problems. These are greatly amplified with identifiers like ZIP and Part "Numbers", where there are both "numbers" and alphanumeric strings. Its a mess to import and can be a problem sorting. In a string sort...
[tt]
1
10
2
20
...
[/tt]
Say, for example, that you have order IDs like this:
20060804000001
This is the year, month and day of the order followed by a 6-digit number attached to them in chronological order (yes, I know limiting it to 6 digits isn't a good idea, but that's another discussion).
It just seems so counterintuitive to store that in a VARCHAR field. I'll definitely output orders sorted by the Order ID, and the benefits to storing it as a number still seem greater. I'm not understanding why this would be a mess to import or why there would be sorting problems.
I'll do some Googling on this, but if anyone knows of some articles on it I'd be happy to hear about them.
Please, take my advice. Use varchar(10) (or any type of string storing data type). 8 years ago, I built an application that assumed zipcode were integers. A couple months ago, I sold a copy of the software to a customer in Canada. Canadian postal codes and american zipcode are VERY SIMILAR in concept (in that they assist the postal system in locating an address). Unfortunately, canadian postal codes are alpha-numeric. I'm told that *most* of the word uses alpha-numeric postal codes. Unfortunately, I am now in a situation of having to correct this problem in a VERY SHORT amount of time.
If I knew then what I know now.....
-George
Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
Okay, I can follow the advice given here, but I still don't understand why. Sorry if I'm being slow about it, but why are characters more portable? Do you mean because the field can then take any value in the future? What if I absolutely know that the number will never be alphanumeric? Should that factor into the data type decision?
I have to make a recommendation to other people so I need to know what I'm talking about.
On a semi-related note, what is the best way to store boolean values? I try not to use anything proprietary, so I store these as number(1) or something equivalent.
I've been Googling this entire time and I haven't found any additional information. If anyone knows of some good guidelines or best practices for choosing data types, please let me know here. Thanks!
Assuming that you don't need to do calculations on the field, the only other reason for representing a field with a numeric data type is the ordering issue and that can usually be addressed with code of the form
Code:
ORDER BY Val([SomeField]), [SomeField]
Where "Val" is some function (varies by DBMS) that converts a character string to a Number. Many of the "license plate" (i.e. identifier) fields have other constraints such as "may contain digits or letters", must have leading zeros", "may contain spaces or special characters", etc. and none of those can be accommodated with a numeric data type.
Using a numeric data type can lead into discussions about attaching meaning to the value of the number but there is really no such meaning. The only relevance that the "value" has is that it is unique.
There are also overflow issues to consider with numerics.
If you have a field of the form "050605280069" (a typical UPCA 12-digit bar code) then it has some internal structure.
If it is numeric then the first digit (0) will be missing and you will have to convert it to a character string anyway to extract the sub-fields.
Incidently, building a substructure into such codes is usually a mistake and violates normalization rules which say that each field should be atomic and that means that no part of the field should have meaning (date fields excepted).
why is a string more portable than a number? because strings are almost always implemented as CHAR or VARCHAR datatype in whatever database system you might be porting to
however, if your column is declared as NUMBER, that will create a syntax error in some databases
for boolean fields, use CHAR(1) which is very portable or TINYINT if it's available
This site uses cookies to help personalise content, tailor your experience and to keep you logged in if you register.
By continuing to use this site, you are consenting to our use of cookies.