How to support Unicode

Lorey · Sep 15, 2005

Hi Experts,

We are going to modify our system (built with VC++, SQL 8, and ODBC) to support Unicode characters.

Please help me identify what needs to change besides wrapping all string literals in _T().

I'm mainly concerned about our database.

Skute · Sep 16, 2005

Any functions you call in the CRT may need to be replaced with their TCHAR versions:

strcpy - _tcscpy
strcat - _tcscat
strchr - _tcschr

etc
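
To illustrate how such a mapping resolves, here is a minimal sketch. It deliberately uses hypothetical names (tchar_demo, TEXT_DEMO, tcscpy_demo and so on) as stand-ins for TCHAR, _T, _tcscpy, _tcscat and _tcschr, so that it builds on any compiler; on MSVC you would include <tchar.h> and use the real names instead. The build_pair helper is also made up for the example.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical stand-ins for what <tchar.h> provides on MSVC.
// One macro switch selects the narrow or the wide routine everywhere.
#ifdef _UNICODE
typedef wchar_t tchar_demo;
#define TEXT_DEMO(s) L##s
#define tcscpy_demo std::wcscpy
#define tcscat_demo std::wcscat
#define tcschr_demo std::wcschr
#define tcslen_demo std::wcslen
#else
typedef char tchar_demo;
#define TEXT_DEMO(s) s
#define tcscpy_demo std::strcpy
#define tcscat_demo std::strcat
#define tcschr_demo std::strchr
#define tcslen_demo std::strlen
#endif

// Builds "name=value" in out and returns the position of '=' in characters.
// The caller must supply a buffer of at least 11 characters.
std::size_t build_pair(tchar_demo* out)
{
    tcscpy_demo(out, TEXT_DEMO("name"));
    tcscat_demo(out, TEXT_DEMO("=value"));
    const tchar_demo* eq = tcschr_demo(out, TEXT_DEMO('='));
    return static_cast<std::size_t>(eq - out);
}
```

The same source compiles to the strcpy family or the wcscpy family depending only on whether _UNICODE is defined, which is the idea behind the real TCHAR routines.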

Skute

"There are 10 types of people in this World, those that understand binary, and those that don't!"

code · Sep 16, 2005

Hi,

Check all the Windows APIs you have used. You should call the versions with a W suffix; otherwise the ANSI APIs will produce wrong output for Unicode text.

cheers
C

Skute · Sep 16, 2005

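You don't need to do that.

Just use TCHARs everywhere. The Windows headers resolve each generic API name to the right function (e.g. MessageBox to MessageBoxA or MessageBoxW) at compile time, depending on whether UNICODE is defined. The CRT's tchar.h routines key off _UNICODE instead, so define both.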

You should never hard code for a specific character type unless you have a good reason to.

PerFnurt · Sep 21, 2005

>You should never hard code for a specific character type unless you have a good reason to.

I would say the opposite:

You should only have ambiguous character types if you have a good reason to.

/Per
www.perfnurt.se

Skute · Sep 21, 2005

That's not exactly what I meant by my comment, Per.

What I meant was, you should never actually write in your code:

MessageBoxA(hWnd, "Hello World", "ANSI MessageBox", MB_OK);

unless you've got a good reason to.

As for your comment, why would you intentionally support only ANSI and not Unicode? It is no more effort to support Unicode; it just involves typing TCHAR instead of char.
The only area you need to be careful of is pointer arithmetic.
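
As a rough sketch of the pointer arithmetic and sizeof pitfalls, the snippet below uses wchar_t directly so the difference between bytes and characters is visible. The helper names (element_count, copy_truncated, byte_offset) are invented for this example and are not part of any library.

```cpp
#include <cstddef>
#include <cwchar>

// Number of elements in a fixed-size array, whatever the character width.
// sizeof(buf) alone would give bytes, which is too large for wide buffers.
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N])
{
    return N;
}

// Copies src into a fixed-size wide buffer, truncating if needed and
// always null-terminating. The limit is counted in characters, not bytes.
template <std::size_t N>
void copy_truncated(wchar_t (&dst)[N], const wchar_t* src)
{
    std::wcsncpy(dst, src, N - 1);
    dst[N - 1] = L'\0';
}

// Pointer subtraction yields characters; multiply by the character size
// when an API expects a byte count.
std::size_t byte_offset(const wchar_t* first, const wchar_t* last)
{
    return static_cast<std::size_t>(last - first) * sizeof(wchar_t);
}
```

Code that passed sizeof(buf) as a length was correct by accident while characters were one byte wide; with wide characters it overstates the buffer capacity, so lengths need to be expressed in elements.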

PerFnurt · Sep 22, 2005

As I said: only have ambiguous character types if you have a good reason to. I would count switching a codebase from ANSI to Unicode as a good reason.

When producing new code, however (targeted at Unicode), I try to avoid the ambiguity of the T macros.

For example: just by looking at _T("Foo") I can't tell whether it is ANSI or Unicode. I have to go look at the compiler settings, whereas "Foo" and L"Foo" are crystal clear.

/Per
www.perfnurt.se

Lorey · Sep 27, 2005

Experts,

Just want your ideas....

We're still arguing about whether we need to change every varchar column in our SQL Server database to nvarchar (to support Unicode), or only those whose values change (such as user input).

Can we leave the system-generated fields as varchar?

I'd like to know the impact of changing everything to nvarchar on speed, memory use, and any other relevant issues.
If there's no difference, it would be much easier for us to change everything to nvarchar rather than be selective.

Please help.

xwb · Sep 28, 2005

I think it is better to be selective: just do the output strings. A global change, although easier, can really mess up silly things like dates. It can also double the size of your database. You will always end up with some bum routine that only takes varchar instead of nvarchar.

Quite often, instead of arguing about it, just try it; it is a lot quicker. Changing everything from varchar to nvarchar isn't difficult, and you'll know straight away whether or not it works.

We used a lot of unsigned chars because we were going from MBCS to wchar. Changing everything to TCHARs didn't really help, because all the strcpy calls cast their arguments from unsigned char* to char*, so the compiler didn't spot any problems with the code. It wasn't until we started running the code and got only half the string that we realized all these unsigned char* had been cast to char*.

It took 7 developers about 4 days to do about 500 files. It took about 3 months to weed out most of the casting and sizeof problems. If there is no casting, it is dead simple.
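
The casting trap described above can be reproduced in a few lines. This is only a sketch with made-up function names, not the code from that project: an explicit cast makes the compiler accept wide-character storage where a narrow string is expected, and the narrow routine then stops at the first zero byte.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// What the buggy pattern effectively does: treat wide-character storage
// as a narrow string. The cast silences the type error, and strlen stops
// at the first zero byte inside the first wide character.
std::size_t narrow_length_of_wide(const wchar_t* text)
{
    return std::strlen(reinterpret_cast<const char*>(text));
}

// The correct routine for wide text counts whole characters.
std::size_t wide_length(const wchar_t* text)
{
    return std::wcslen(text);
}
```

For ASCII text held in wide characters, the narrow length comes out shorter than the real length, and the exact value depends on the platform's wchar_t size and byte order. Removing the casts lets the compiler flag every such call.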

Beware of things like RTF import - they only take chars. RTF doesn't like TCHARs.

Lorey · Sep 28, 2005

Thanks, xwb, for such generous and substantial ideas.
