Consultor Eletrônico



Kbase P69661: Error 51 results after loading data in UTF-8 format
Author   Progress Software Corporation - Progress
Access   Public
Published   11/11/2008
Status: Verified

SYMPTOM(s):

Error 51 results after loading data in UTF-8 format

SYSTEM ERROR: sizditm -- invalid type (51)

FACT(s) (Environment):

All Supported Operating Systems
Progress 9.x
OpenEdge 10.x

CAUSE:

Data files (tablename.d) that contain malformed UTF-8 characters can be loaded into the database without error, but the load corrupts the data.

For example:

UTF-8 encodes Unicode code points in the range U+0080 to U+07FF as two bytes in the following standard format:
110xxxxx 10xxxxxx

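So the Latin Capital Letter A With Circumflex, "Â" (U+00C2), is encoded in UTF-8 as the two bytes C3 82, which display as the following when viewed in a single-byte code page such as 1252:
"Ã‚"

195(10) = C3(16) = 11000011(2) = "Ã" Latin Capital Letter A With Tilde
+
130(10) = 82(16) = 10000010(2) = "‚" Single Low-9 Quotation Mark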

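In other words, dropping the 110 and 10 marker bits and concatenating the remaining payload bits:

110[00011] + 10[000010]
= 00011000010(2) = C2(16) = 194(10), which is a /valid/ Unicode representation of "Â".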
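
The bit arithmetic above can be reproduced with a short Python sketch. The function name `decode_two_byte_utf8` is hypothetical and chosen only for illustration. The sketch handles only the two-byte form shown here and does not reject overlong encodings, so it is not a complete UTF-8 decoder.

```python
def decode_two_byte_utf8(lead: int, cont: int) -> int:
    """Decode one two-byte UTF-8 sequence (110xxxxx 10xxxxxx) to a code point."""
    if lead & 0b11100000 != 0b11000000:
        raise ValueError("lead byte does not match 110xxxxx")
    if cont & 0b11000000 != 0b10000000:
        raise ValueError("continuation byte does not match 10xxxxxx")
    # Keep the 5 payload bits of the lead byte and the 6 of the continuation byte.
    return ((lead & 0b00011111) << 6) | (cont & 0b00111111)


code_point = decode_two_byte_utf8(0xC3, 0x82)
print(code_point, hex(code_point), chr(code_point))  # 194 0xc2 Â

# The same two bytes, read as a single-byte code page, show up as two characters.
print(b"\xc3\x82".decode("cp1252"))  # Ã‚
```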

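If this same character is not in UTF-8 format (for example, it was written in a single-byte code page such as 1252), it will be represented by a single byte:

Â = 194 (10) = C2 (16) = 11000010 (2)

A lone byte of this form, with no 10xxxxxx continuation byte following it, IS NOT A VALID UTF-8 CHARACTER and as such will be loaded into the database incorrectly.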
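
As a rough illustration, Python's built-in strict UTF-8 decoder can be used to show which byte sequences are well formed. The helper name `is_valid_utf8` is hypothetical, and this sketch says nothing about how the Progress load utility itself validates input.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly under strict UTF-8 rules."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(b"\xc3\x82"))           # True: proper two-byte UTF-8 for "Â"
print(is_valid_utf8("Â".encode("cp1252")))  # False: single byte C2 with no continuation byte
print(is_valid_utf8(b"\xc0"))               # False: C0 is never a valid byte in UTF-8
```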

FIX:

The .d file must be dumped in UTF-8 format initially.

It is not enough to change only the cpstream value that the dump file was created with, for example from cpstream=1248 to cpstream=UTF-8. Relabelling the file does not convert its bytes, so the characters will be malformed when they are read as UTF-8.
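
Before loading, a .d file can be checked for this problem outside of Progress. The following is a minimal sketch in Python, not a Progress utility: the function name `find_malformed_utf8_lines` and the sample records are assumptions made for illustration. In practice the bytes would come from reading the dump file in binary mode.

```python
def find_malformed_utf8_lines(data: bytes) -> list[int]:
    """Return the 1-based numbers of lines that are not well-formed UTF-8."""
    bad_lines = []
    # Splitting on the newline byte is safe: 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad_lines.append(number)
    return bad_lines


# Hypothetical sample records: same text, different byte encodings.
relabelled = '1 "Â"\n2 "ok"\n'.encode("cp1252")      # single-byte data merely labelled UTF-8
properly_dumped = '1 "Â"\n2 "ok"\n'.encode("utf-8")  # data actually dumped as UTF-8

print(find_malformed_utf8_lines(relabelled))       # [1]
print(find_malformed_utf8_lines(properly_dumped))  # []
```

Any line number reported by such a check points to data that was not written as UTF-8 and should be dumped again rather than loaded.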