Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Proper Data Type for Zip Codes? 2

Status
Not open for further replies.

Glowball

Programmer
Oct 6, 2001
373
US
I was wondering what everyone thinks is the best data type to use for zip codes. I've always used this for 5-digit codes:

INT(5) UNSIGNED ZEROFILL

But then I got into a discussion with another coder who uses CHAR(5).

Thoughts?
 


Hi,

NUMBER types ought to be used for calculated values.

"Numbers" like ZIP Codes, Employeed "Numbers", Invoice "Numbers" etc, are for IDENTIFICATION, and as such, ought to be STRINGS.

ZIP Codes can be 5 characters or 10 characters (xxxxx-xxxx format)

Skip,

[glasses] [red][/red]
[tongue]
 
Really? What if I want to sort by employee number or invoice number? Wouldn't that slow things down?

If you know of a place where I can read about this further I'm all ears. Thanks!
 



I'm not talking about system keys, which ARE calculated.

What kind of calculation are you going to do on an employee number.

Yes, it stores smaller and sorts faster as a number, but there are often othre problems. These are greatly amplified with identifiers like ZIP and Part "Numbers", where there are both "numbers" and alphanumeric strings. Its a mess to import and can be a problem sorting. In a string sort...
[tt]
1
10
2
20
...
[/tt]


Skip,

[glasses] [red][/red]
[tongue]
 
Say, for example, that you have order IDs like this:

20060804000001

This is the year, month and day of the order followed by a 6-digit number attached to them in chronological order (yes, I know limiting it to 6 digits isn't a good idea, but that's another discussion).

It just seems so counterintuitive to store that in a VARCHAR field. I'll definitely output orders sorted by the Order ID, and the benefits to storing it as a number still seem greater. I'm not understanding why this would be a mess to import or why there would be sorting problems.

I'll do some Googling on this, but if anyone knows of some articles on it I'd be happy to hear about them.
 
Glowball,

Please, take my advice. Use varchar(10) (or any type of string storing data type). 8 years ago, I built an application that assumed zipcode were integers. A couple months ago, I sold a copy of the software to a customer in Canada. Canadian postal codes and american zipcode are VERY SIMILAR in concept (in that they assist the postal system in locating an address). Unfortunately, canadian postal codes are alpha-numeric. I'm told that *most* of the word uses alpha-numeric postal codes. Unfortunately, I am now in a situation of having to correct this problem in a VERY SHORT amount of time.

If I knew then what I know now.....

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
Ahh yes, good point! I'll store the Zip Codes as strings. I'm still not convinced about the other ID numbers but I'm still researching.
 
I think the consensus is:
unless you have to do math with, don't store as numeric.

Characters value are more portable than numeric.

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ181-2886
 
Okay, I can follow the advice given here, but I still don't understand why. Sorry if I'm being slow about it, but why are characters more portable? Do you mean because the field can then take any value in the future? What if I absolutely know that the number will never be alphanumeric? Should that factor into the data type decision?

I have to make a recommendation to other people so I need to know what I'm talking about.
 
I meant portable across different RDBMS.

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ181-2886
 
A string is more portable than a number if you're moving to a different RDBMS? I don't see why...
 
On a semi-related note, what is the best way to store boolean values? I try not to use anything proprietary, so I store these as number(1) or something equivalent.
 
I've been Googling this entire time and I haven't found any additional information. If anyone knows of some good guidelines or best practices for choosing data types, please let me know here. Thanks!
 
Assuming that you don't need to do calculations on the field, the only other reason for representing a field with a numeric data type is the ordering issue and that can usually be addressed with code of the form
Code:
ORDER BY Val([SomeField]), [SomeField]
Where "Val" is some function (varies by DBMS) that converts a character string to a Number. Many of the "license plate" (i.e. identifier) fields have other constraints such as "may contain digits or letters", must have leading zeros", "may contain spaces or special characters", etc. and none of those can be accommodated with a numeric data type.

Using a numeric data type can lead into discussions about attaching meaning to the value of the number but there is really no such meaning. The only relevance that the "value" has is that it is unique.

There are also overflow issues to consider with numerics.

If you have a field of the form "050605280069" (a typical UPCA 12-digit bar code) then it has some internal structure.

[li]Digit 1 - Industry Code[/li]
[li]Digits 2-6 - Manufacturer Code[/li]
[li]Digits 7-11 - Unique Number[/li]
[li]Digit 12 - Check Digit[/li]

If it is numeric then the first digit (0) will be missing and you will have to convert it to a character string anyway to extract the sub-fields.

Incidently, building a substructure into such codes is usually a mistake and violates normalization rules which say that each field should be atomic and that means that no part of the field should have meaning (date fields excepted).
 
why is a string more portable than a number? because strings are almost always implemented as CHAR or VARCHAR datatype in whatever database system you might be porting to

however, if your column is declared as NUMBER, that will create a syntax error in some databases

for boolean fields, use CHAR(1) which is very portable :) or TINYINT if it's available


r937.com | rudy.ca
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top