Problem with unicode characters - Mercator6.5

Status
Not open for further replies.

kashush

Programmer
May 2, 2003
38
US
I have a system in Mercator 6.5. It validates a file that may contain the characters á, é, í, ó, ú and other European characters. The individual text fields are defined as UTF-8. The delimiter between fields is a literal comma, defined as Western native.

In one field the map cannot handle the ó and misses the next delimiter:
Trace from Before:
(Level 2: Offset 3637, len 7, comp 6 of 25, #1, DI 00000158:)
Data at offset 3637 ('Morón",') was found to be of TYPE
X'000C' (CITY Column_1 Store_Profile Data).

(Level 2: Offset 3645, len 0, comp 7 of 25, #1, DI 00000159:)
Data at offset 3645 ('","Dir Ops - J.V...') does not match DELIMITER ','
of TYPE X'0004' (Row Store_Profile Data).
___________________________________________________________
Trace from After:
(Level 2: Offset 3637, len 5, comp 6 of 25, #1, DI 00000158:)
Data at offset 3637 ('Moron') was found to be of TYPE
X'000C' (CITY Column_1 Store_Profile Data).

(Level 2: Offset 3648, len 20, comp 8 of 25, #1, DI 0000015A:)
Data at offset 3648 ('Dir Ops - J.VALL') was found to be of TYPE
X'000E' (SPACER1 Column_1 Store_Profile Data).


In another field I changed Santa Fé to Santa Fó and it went through:

(Level 2: Offset 1260, len 17, comp 21 of 25, #1, DI 0000007D:)
Data at offset 1260 ('Av. Santa Fó 195') was found to be of TYPE
X'001B' (ADDRESS_LINE1 Column_1 Store_Profile Data).

(Level 2: Offset 1289, len 16, comp 25 of 25, #1, DI 00000081:)
Data at offset 1289 ('L.SFE@ar.mcd.com') was found to be of TYPE
X'001F' (EMAIL_ADDRESS Column_1 Store_Profile Data).


Why am I having problems with the CITY field?
 
Are you sure that your data is UTF-8?

What are the hex values for Morón?

Mercator thinks that Morón", is seven bytes long, so I don't think the ó is multibyte. So I would change the UTF-8 type tree fields back to native or Latin1.

Are there any size restrictions on these fields?
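
To illustrate the byte-count reasoning above, here is a minimal Python sketch (Python is my choice for illustration; it is not part of Mercator). It only compares how many bytes the string from the trace occupies in Latin-1 versus UTF-8; the variable names are made up for the example.

```python
# The string reported in the trace at offset 3637.
text = 'Morón",'

# Encode it both ways and compare the lengths.
latin1_bytes = text.encode("latin-1")  # ó is the single byte F3
utf8_bytes = text.encode("utf-8")      # ó is the two bytes C3 B3

print(len(latin1_bytes), latin1_bytes.hex(" "))
print(len(utf8_bytes), utf8_bytes.hex(" "))
```

If the data really were UTF-8, the trace would be expected to report a length of 8 for this string rather than 7.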

 
ó=F3

Morón",=4D 6F 72 F3 6E 22 2C

The city column ranges from 0-52 characters.
It has an initiator and terminator of " in western native.
The field's language is Western UTF-8.
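
As a quick check of the hex values above, the following Python sketch (an illustration only; `is_valid_utf8` is a hypothetical helper name) tests whether those seven bytes form valid UTF-8 and shows how they read as Latin-1.

```python
# The bytes reported for Morón", in the post above.
raw = bytes.fromhex("4D 6F 72 F3 6E 22 2C")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(raw))     # False: F3 is not followed by continuation bytes
print(raw.decode("latin-1"))  # the same bytes read as Latin-1
```

In UTF-8, F3 is the lead byte of a four-byte sequence, so it must be followed by three continuation bytes. I am only guessing at how Mercator handles this, but a parser that treats the field as UTF-8 and skips three bytes after F3 would swallow `n",` and with it the comma, which would be consistent with the missed delimiter in the trace.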
 
Hi there,
I had a similar problem while locally testing a map.

As I have now worked out, the cause of this problem was my text editor, UltraEdit (with the option to automatically recognise UTF-8 enabled). Opening the result file of Map A in UltraEdit corrupted the file, and the following Map B, which used this output file as its input, could no longer recognise the delimiter after a special character.

After I disabled this UltraEdit option, it worked as expected. Maybe this helps you...
 