Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Datatype question

Status
Not open for further replies.

LarrySteele

Programmer
May 18, 2004
318
US
This is a very generic question about column data type selection.

Normally I define text fields as VARCHAR2(xx). One exception is for those fields that have fixed data lengths. A simplistic example would be US Social Security numbers, which are always 9 digits. I would define this as CHAR(9) rather than VARCHAR2(9). Any string less than 9 characters (other than 0) is invalid. Therefore, the only valid data entries would be 0 or 9.

In the example I gave, which is better:
VARCHAR(9)
or
CHAR(9)

Is one better than the other? If so, which and why?

TIA,
Larry
 
Larry,

I have not yet discovered a valid business reason to use CHAR datatype in Oracle, and your example is not an exception. Here is why:

VARCHAR(9) is preferrable since CHAR(9) offers no protection against entry of erroneous SSN values. If one incorrectly enters an SSN value, such as "2958" (into either column definition), additional checking needs to occur on the entry (e.g. "CHECK LENGTH(SSN) = 9 or LENGTH(SSN) is NULL").

Also, in either case, the storge consumption for any valid value is identical: length 9 or length 0, for both CHAR(9) and VARCHAR(9).

So there is no advantage to using CHAR(9) over VARCHAR(9).

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I provide low-cost, remote Database Administration services: www.dasages.com]
“Beware of those that seek to protect you from harm or risk. The cost will be your freedoms and your liberty.”
 
Thanks for the response. I recognize that CHAR(x) has nothing to do with data validation - that would be accomplished elsewhere (e.g. proc, etc). Just using it as a simplistic example.

In a case where the data is fixed, is there an advantage of using VARCHAR(x) or CHAR(x)?

How about this:
[tt]FIELD_1 VARCHAR(1) NOT NULL[/tt]

In this example, field_1 has a length of 1; no less, no more. Is there an advantage of using VARCHAR over CHAR for a column that cannot be variable length?

Thanks again,
Larry


 
TYPO's!!!

In my response, I meant VARCHAR2 not VARCHAR.
 
Here's one advantage I can think of for VARCHAR2 over CHAR.

In my SSN example, I said zero length strings are allowed. If I have a table of international people, the SSN field for the non-US people would be nine (wasted) spaces.

I suppose another advantage of using VARCHAR2 vs CHAR is that there's no *advantage* of using CHAR. In other words, if CHAR doesn't provide advantage, and VARCHAR2 works well in all occasions, then there's no need to get bent around the axle trying to figure out which data type to use.
 
Larry said:
the SSN field for the non-US people would be nine (wasted) spaces.
Actually, not...If a CHAR(x) column contains nothing (i.e., NULL), it behaves the same way as a VARCHAR2(x), taking up no data space. The only time there are wasted bytes is if there are data that exist in the column and the data are shorter than the maximum.

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I provide low-cost, remote Database Administration services: www.dasages.com]
“Beware of those that seek to protect you from harm or risk. The cost will be your freedoms and your liberty.”
 
I have seen a lot of people use CHAR rather than VARCHAR2 and cannot think of any situation where it has conferred an advantage. However, I can think of innumerable situations where it has caused problems because comparisons of an unnpadded varchar2 string with a padded CHAR string produces no matches.

In this case, I would have thought having VARCHAR2 rather than CHAR would give a positive advantage. That way you have an easy check for the SSN always being 9 characters. For VARCHAR2 if LENGTH(SSN) <> 9 you know it is invalid. For CHAR, even if only 8 characters are entered, the length will still be 9.
 
Dagon,

I agree that there are problems comparing VARCHAR2 to CHAR fields. Related fields should be defined the same. That's related to my statement "In other words, if CHAR doesn't provide advantage, and VARCHAR2 works well in all occasions, then there's no need to get bent around the axle trying to figure out which data type to use." If all character fields are defined as VARCHAR2, then you won't run into the problem of defining a field as VARCHAR2 in one table and a related field as CHAR in another table (been there, done that).

With regards to the ~fixed length~ fields like the SSN I used. Rules would be established to not allow entry of SSN's less than 9 characters. 8 would not exist, 7 would not exist, 6 would not exist, etc. So, if only 8 characters are available, then nothing would be entered. I stress this point because there are fixed length data elements and discussing data that violates business rules causes more problems than extraneous spaces. It also distracts from the base question: if a data element has a fixed length, what is the advantage of defining the column width as variable length.

Dave,

Thanks for the clarification of empty entries.

Interesting discussion, appreciate the responses.
 
Larry,

just to add my (or rather Tom Kyte's) three hap'orth, I consider the following to be (as near as makes no odds) definitive.


I therefore only ever use VARCHAR2, and in supported legacy systems, convert CHAR data types to VARCHAR2 whenever possible.

Regards

T
 
Thanks T. Saw that yesterday. The post reminded me of discussions I had with the DB2 DBA's. Their argument was to used VARCHAR for large fields; otherwise use CHAR. "Large fields" was very subjective, but generally around 20 chars or so. The reason I received was that CHAR's fixed width made it faster to read/write. However, for large fields, the space taken up by the padding outweighed the speed benefits and it was more important to use VARCHAR.

When I started working with Oracle, an Oracle DBA told me that when working with VARCHAR2 fields, Oracle has to read the field one character at a time to determine where the end of the field was. I've since learned that this was bum gouge and that Oracle stores the field length with the record/column for VARCHAR2.

Well, I wanted to know if there was any advantage using CHAR and it seems the consensus is that there is no advantage using CHAR. That makes things easier on me as a developer since I'm using only one data type for strings and it's good to know I'm not using a less efficient data type.

Thanks all who posted,
Larry
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top