Good day ppl, I require your assistance.
I am reading an RTF document, parsing it, and putting the text into a database for further use. However, that data is in ansicpg 1251
(Windows 3.1 (Cyrillic)). This poses a problem, as I need to either store the data in Unicode or convert the user's query to the same ANSI code page so that I can search the database. However, I am clueless as to how that can be done. Could someone enlighten me?
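For illustration, here is a minimal Python sketch of both options, assuming the text has already been pulled out of the RTF as raw bytes. The helper names `to_unicode` and `query_to_ansi` are hypothetical, and `cp1251` is Python's codec name for Windows code page 1251.

```python
def to_unicode(raw: bytes, codepage: int = 1251) -> str:
    """Option 1: decode code-page bytes to a Unicode string before storing."""
    return raw.decode(f"cp{codepage}")


def query_to_ansi(query: str, codepage: int = 1251) -> bytes:
    """Option 2: encode the user's Unicode query into the stored code page."""
    # errors="replace" substitutes "?" for characters the code page lacks,
    # so the call does not raise on unexpected input.
    return query.encode(f"cp{codepage}", errors="replace")


# Sample bytes (an assumption for the demo): a Russian word in Windows-1251.
raw = b"\xcf\xf0\xe8\xe2\xe5\xf2"
text = to_unicode(raw)
assert query_to_ansi(text) == raw
```

Storing Unicode is usually the simpler route, since the database then no longer depends on knowing which code page each document used.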
Excerpts from the RTF reference:
Word 2002 is a Unicode-enabled application. Text is handled using the 16-bit Unicode character encoding scheme. Expressing this text in RTF requires a new mechanism, because until this release (version 1.6), RTF has only handled 7-bit characters directly and 8-bit characters encoded as hexadecimal. The Unicode mechanism described here can be applied to any RTF destination or body text.
\ansicpgN This keyword represents the ANSI code page used to perform the Unicode to ANSI conversion when writing RTF text. N represents the code page in decimal. This is typically set to the default ANSI code page of the run-time environment (for example, \ansicpg1252 for U.S. Windows). The reader can use the same ANSI code page to convert ANSI text back to Unicode. Possible values include the following:
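The conversion back to Unicode that the excerpt describes could be sketched in Python roughly as follows. This is not a full RTF parser: it only reads the `\ansicpgN` keyword and decodes `\'hh` hexadecimal escapes, ignoring groups, other control words, and the `\uN` Unicode keyword. The function names are hypothetical, and falling back to 1252 when no `\ansicpg` is present is an assumption of this sketch.

```python
import re

HEX_RUN = re.compile(r"(?:\\'[0-9a-fA-F]{2})+")   # one or more \'hh escapes
HEX_ONE = re.compile(r"\\'([0-9a-fA-F]{2})")      # a single \'hh escape


def detect_codepage(rtf: str, default: int = 1252) -> int:
    """Return N from the first \\ansicpgN keyword, or the assumed default."""
    match = re.search(r"\\ansicpg(\d+)", rtf)
    return int(match.group(1)) if match else default


def decode_hex_escapes(rtf_text: str, codepage: int) -> str:
    """Replace each run of \\'hh escapes with its decoded Unicode text."""
    def repl(match: re.Match) -> str:
        # Collect the whole run as bytes first, so multi-byte code pages
        # are decoded together rather than byte by byte.
        raw = bytes(int(h, 16) for h in HEX_ONE.findall(match.group(0)))
        return raw.decode(f"cp{codepage}", errors="replace")
    return HEX_RUN.sub(repl, rtf_text)


# Hypothetical sample document for the demo.
sample = r"{\rtf1\ansi\ansicpg1251 \'cf\'f0\'e8\'e2\'e5\'f2}"
codepage = detect_codepage(sample)
decoded = decode_hex_escapes(sample, codepage)
```

After this step the escaped runs in `decoded` are ordinary Unicode text, which can be stored and searched without any code-page handling on the query side.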