Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

varchar and latin characters or nvarchar? 1

Status
Not open for further replies.

glgcag1991

Programmer
Oct 15, 2007
72
US
I have a need to store certain non-English characters from Spanish, French or German infrequently. They will either be in client names or client project names, but it's not like it's a multilingual database. I've read that it shouldn't be a problem with varchar since it's based on ASCII and those non-English characters exist, but if I want to include characters from languages like Chinese, then I would need to use Unicode, hence nvarchar.

Am I shooting myself in the foot by using varchar in order to save the space in not storing it in nvarchar which takes up twice the space?

Lastly, in this case, what collation is recommended?
 
If all of your special characters are in this list, then you can use varchar:

Code:
! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~  € ? ‚ ƒ „ … † ‡ ˆ ‰ Š ‹ Œ ? Ž ? ? ‘ ’ “ ” • – — ˜ ™ š › œ ? ž Ÿ   ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ­ ® ¯ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ

Collations control sorting and comparisons. If you only have infrequent use of special characters, then there is probably no need to change your collation.

For example:

[tt][blue]E e è é ê ë[/blue][/tt]

When sorting, which should come first? For comparison, should they be equal? These are the questions that collations can solve for you.





-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Thanks George! Yes, all the ASCII characters are sufficient, but the collation question was in regards to what the best way to search for characters is, in practice.

In the case of E e è é ê ë, since I'm using the default SQL Server 2005 collation of SQL_Latin1_General_CP1_CI_AS, it is case insensitive and accent sensitive, do you recommend changing to accent insensitive so it won't care about the "e" as it sees them all as the same? If I search with "nino", it will find a project named "niño" or "Ñino", right? Or, if the user enters "niño" in the search field, it will find "nino"?

This is what I want because the users that actually know how to correctly enter non-English characters will do it, but others won't, so I want it to return the title or name no matter the case or accent above the letter. (Assuming the users can actually spell it correctly to begin with!)

I've always been reluctant to change collation due to possibly causing other unwanted side effects.
 
I've always been reluctant to change collation due to possibly causing other unwanted side effects.

Me too!

What most developers don't realize is that you can use a Collate clause on a query.

Copy/paste this to a query window and you will see that the collate clause (on the comparison) allows you to match using a case insensitive collation.

Code:
-- Dummy up some data
Declare @Temp Table(Name VarChar(20))

Insert Into @Temp Values('nino')
Insert Into @Temp Values('niño')
Insert Into @Temp Values('Ñino')

-- Without a collate clause, 1 row returned.
Select * From @Temp Where name = 'nino'

-- with a case insensitive collation, 3 rows returned.
Select * From @Temp Where name Collate SQL_Latin1_General_CP1_CI_AI = 'nino'



-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
I didn't really answer your question.

I guess what I am suggesting is that you NOT change the collation on the column. If you change the collation, and you have any code that joins to this column, you will get collation errors. If your only concern is looking up the data, then I would use the collate clause like I showed you in my previous post.

Of course, it is possible that there's other code that compares the data in this column (like importing data) that you may not be thinking about. Overall, though.... I think the safest thing is to use the collate clause wherever it is needed. The trick is.... finding all the places in code where you should add the collate clause. That will likely be the hard part.



-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
OK, that is good advice because I wasn't sure if I was unduly careful. I haven't used COLLATE in a clause before because somewhere I read that you can't use it with indexes, but in this case, the names I'm searching for aren't indexed so it should be fine. Thank you!
 
somewhere I read that you can't use it with indexes

This is not true. Not exactly.


If you have an index on a column and you search within it, you will likely get an index seek, which is very good for performance. Without an index, you will get a scan, which can be bad for performance, especially for large tables.

Now.... if you have a column that is indexed and you use the collate clause, the index will be ignored and SQL will revert back to a scan (instead of a seek).

To be completely accurate, indexes can't be used with the collate clause (not the other way around).



-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top