Consultor Eletrônico

Status: Unverified

FACT(s) (Environment):

OpenEdge 10.x

SYMPTOM(s):

OUTPUT TO ... CONVERT does not convert from UTF-8 to Simplified Chinese GB2312.

Problems converting UTF-8 characters to Simplified Chinese GB2312

The statement output to "c:\tmp\test.txt" convert target "gb2312" source "utf-8". does not appear to convert characters correctly.

Chinese characters do not display correctly when viewed in Notepad.

The following code outputs the wrong value instead of the correct Chinese character:

DEFINE VARIABLE cChar AS CHARACTER NO-UNDO.

/* 15311758 = 0xE9A38E, i.e. the UTF-8 byte sequence E9 A3 8E */
cChar = CHR(15311758, "UTF-8").

OUTPUT TO "c:\tmp\euro.txt" CONVERT TARGET "gb2312" SOURCE "utf-8".
DISPLAY cChar WITH FRAME a1 NO-LABELS.
OUTPUT CLOSE.

CAUSE:

The output of the program is correct: the OUTPUT TO ... CONVERT statement is working as expected.

The value 15311758 (hex E9 A3 8E) is the three-byte UTF-8 encoding of a single Chinese character (U+98CE). In the GB2312 code page that same character is a two-byte value, B7 E7.
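The byte values above can be checked outside OpenEdge. The following is a minimal Python sketch (Python is used here only for illustration and is not part of the original example; it assumes Python 3.8 or later) that traces the integer from the ABL code through both encodings:

```python
# Sketch: trace the character from the ABL example through both encodings.
value = 15311758                        # the integer passed to CHR() in the example

utf8_bytes = value.to_bytes(3, "big")   # the three UTF-8 bytes
char = utf8_bytes.decode("utf-8")       # a single character
gb_bytes = char.encode("gb2312")        # the same character in GB2312

print(utf8_bytes.hex(" "))              # e9 a3 8e
print(f"U+{ord(char):04X}")             # U+98CE
print(gb_bytes.hex(" "))                # b7 e7
```

The GB2312 result matches the two bytes that OUTPUT TO ... CONVERT writes to the file.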

The problem occurs when the file is viewed on a Western European Windows client, which uses its own single-byte code page, 1252. The file is written in the GB2312 code page and contains a single double-byte Chinese character (B7 E7). Notepad interprets these two bytes as two separate characters and displays them accordingly: B7 (183, middle dot) and E7 (231, small letter c with cedilla) in code page 1252.
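The misreading can be reproduced with a short Python sketch (again for illustration only, not part of the original article). It decodes the same two bytes once as code page 1252 and once as GB2312:

```python
# Sketch: the same two bytes interpreted under two different code pages.
raw = bytes([0xB7, 0xE7])               # the bytes written to the file

as_cp1252 = raw.decode("cp1252")        # what a Western European client shows
as_gb2312 = raw.decode("gb2312")        # what a Simplified Chinese client shows

print(len(as_cp1252), [hex(ord(c)) for c in as_cp1252])   # 2 ['0xb7', '0xe7']
print(len(as_gb2312), hex(ord(as_gb2312)))                # 1 0x98ce
```

Under code page 1252 the bytes yield two characters (middle dot and c with cedilla); under GB2312 they yield the single intended Chinese character.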

FIX:

View the file with a Chinese font on a Windows client whose default code page is Simplified Chinese GB2312. The output then displays correctly.
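If no Chinese Windows client is available, the file contents can still be checked by decoding the raw bytes as GB2312 explicitly. The sketch below is a hypothetical helper (the name read_as_gb2312 is an assumption made for illustration, not an OpenEdge or Windows facility):

```python
from pathlib import Path


def read_as_gb2312(path):
    """Return the file's text, decoding its raw bytes as GB2312.

    Bytes that are not valid GB2312 are replaced rather than raising,
    so a wrongly encoded file is still readable for inspection.
    """
    return Path(path).read_bytes().decode("gb2312", errors="replace")


# Example usage with the path from the article (assumes the file exists):
# print(read_as_gb2312(r"c:\tmp\euro.txt"))
```

If the conversion worked, the returned text contains the expected Chinese character regardless of the machine's default code page.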