Any 'is double-byte character' function?

iamanson · May 16, 2003

Hi,
Is there any way to find double-byte characters in C#?
I tried IsLetter, but it matches both alphabetic and double-byte characters.

Thanks!

chiph · May 17, 2003

Strictly speaking, there is no such thing as a double-byte character in .NET -- strings are UTF-16 Unicode, in which *every* char is a two-byte code unit (characters outside the Basic Multilingual Plane take two of them).

However, if you're reading a file or stream that is coming in as a multi-byte character set (something like Shift-JIS or UTF-8), then you'll have to convert it to UTF-16 as you read it. The System.Text.Encoding classes do this: Encoding.UTF8 is built in, and other code pages such as Shift-JIS go through Encoding.GetEncoding where the runtime supports them.
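
A minimal sketch of that conversion, assuming the input is UTF-8 and already held in a byte array. The names NonAsciiFilter, DecodeUtf8 and ExtractNonAscii are made up for illustration. Once the bytes are decoded, every char is a UTF-16 code unit, so "not plain alphabet" becomes a test on the character value rather than on its byte count:

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the .NET class library.
public static class NonAsciiFilter
{
    // Decodes UTF-8 bytes into a .NET string (UTF-16 in memory).
    public static string DecodeUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    // Keeps only the chars outside the 7-bit ASCII range (above U+007F).
    // Both halves of a surrogate pair are above 0x7F, so pairs stay intact.
    public static string ExtractNonAscii(string text)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in text)
        {
            if (c > 0x7F)
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

Note that this drops digits and punctuation along with the ASCII letters, and keeps accented Latin letters; tighten the test if that is not what you want.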

Chip H.

iamanson · May 17, 2003

Thanks.
I think I didn't write clearly.
I mean, for example, I need to extract all the double-byte characters (from something like Shift-JIS or UTF-8), but not the alphabetic ones, from a file.

How can I do that?

chiph · May 17, 2003

You'll need to learn more about the particular multi-byte encoding that you'll be receiving -- they all do it differently.

I haven't done any work with Shift-JIS, but in UTF-8, the first byte in a multi-byte sequence is always "11vvvvvv", with subsequent bytes "10vvvvvv". Take a look at:

http://czyborra.com/utf/

Google will be your friend in this, I'm afraid.
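
For the UTF-8 case, the bit patterns above can be tested directly on the raw bytes. This is only a sketch: the Utf8Bytes class and its method names are hypothetical, and it assumes the input is valid UTF-8 (it does no error checking for stray or missing bytes).

```csharp
using System;

// Hypothetical helper, not part of the .NET class library.
public static class Utf8Bytes
{
    // Single-byte (ASCII) character: 0vvvvvvv.
    public static bool IsAscii(byte b)
    {
        return (b & 0x80) == 0x00;
    }

    // First byte of a multi-byte sequence: 11vvvvvv.
    public static bool IsLeadByte(byte b)
    {
        return (b & 0xC0) == 0xC0;
    }

    // Subsequent byte of a multi-byte sequence: 10vvvvvv.
    public static bool IsContinuationByte(byte b)
    {
        return (b & 0xC0) == 0x80;
    }

    // Counts multi-byte characters by counting lead bytes.
    // Assumes the buffer holds well-formed UTF-8.
    public static int CountMultiByteSequences(byte[] utf8)
    {
        int count = 0;
        foreach (byte b in utf8)
        {
            if (IsLeadByte(b))
            {
                count++;
            }
        }
        return count;
    }
}
```

For instance, the bytes 61 C3 A9 E6 97 A5 hold one ASCII character followed by a two-byte and a three-byte sequence, so the count is 2.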

Chip H.

chiph · May 17, 2003

A quick Google search found this:

http://www.xkp.or.jp/eGuide/xkpgb11.htm

for Shift-JIS.

Chip H.

iamanson · May 18, 2003

thanks very much!
