Kbase P69661: Error 51 results after loading data in UTF-8 format
Author: Progress Software Corporation - Progress
Access: Public
Published: 11/11/2008
Status: Verified
SYMPTOM(s):
Error 51 results after loading data in UTF-8 format
SYSTEM ERROR: sizditm -- invalid type (51)
FACT(s) (Environment):
All Supported Operating Systems
Progress 9.x
OpenEdge 10.x
CAUSE:
Data files (tablename.d) that contain malformed UTF-8 characters can be loaded into the database without error; however, doing so causes data corruption.
For example:
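UTF-8 encodes the code points U+0080 through U+07FF as two bytes in the following standard format:
110xxxxx 10xxxxxx
So, the Latin Capital Letter A With Circumflex, "Â" (U+00C2), is stored as the bytes C3 82, which display as "Ã‚" when viewed in a single-byte code page such as 1252:
195(10) = C3(16) = 11000011(2) = "Ã" Latin Capital Letter A With Tilde
+
130(10) = 82(16) = 10000010(2) = "‚" Single Low-9 Quotation Mark
In other words, dropping the 110 and 10 prefix bits from:
11000011 + 10000010
leaves the payload bits 00011 + 000010
= 00011000010(2) = C2(16) = 194(10), which is the code point of "Â", so C3 82 is a /valid/ UTF-8 representation.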
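The bit arithmetic above can be reproduced with a minimal Python sketch. The function name decode_two_byte_utf8 is hypothetical and chosen only for illustration; it handles just the two-byte form 110xxxxx 10xxxxxx discussed here and is not part of any Progress tooling.

```python
def decode_two_byte_utf8(lead: int, cont: int) -> int:
    """Return the code point encoded by a two-byte UTF-8 sequence."""
    # The lead byte must match 110xxxxx and the continuation byte 10xxxxxx.
    if lead & 0b11100000 != 0b11000000:
        raise ValueError(f"{lead:#04x} is not a two-byte lead byte")
    if cont & 0b11000000 != 0b10000000:
        raise ValueError(f"{cont:#04x} is not a continuation byte")
    # Keep the 5 payload bits of the lead byte and the 6 of the continuation byte.
    code_point = ((lead & 0b00011111) << 6) | (cont & 0b00111111)
    # Two-byte sequences must encode at least U+0080; smaller values are overlong.
    if code_point < 0x80:
        raise ValueError("overlong two-byte encoding")
    return code_point


code_point = decode_two_byte_utf8(0xC3, 0x82)
print(code_point, hex(code_point), chr(code_point))  # 194 0xc2 Â
```

The result agrees with Python's own decoder: bytes([0xC3, 0x82]).decode("utf-8") also yields "Â".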
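If this same character is not in UTF-8 format, for example in a single-byte code page such as ISO8859-1, it is represented by one byte:
"Â" = 194(10) = C2(16) = 11000010(2)
Read as UTF-8, this is a lead byte (110xxxxx) with no continuation byte (10xxxxxx) after it, WHICH IS NOT A VALID UTF-8 CHARACTER and as such will be loaded into the database incorrectly.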
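Because the load itself reports no error, it can help to check a data file for malformed UTF-8 before loading it. The following is a minimal sketch using Python's strict UTF-8 decoder; the function name first_malformed_offset is hypothetical, and the sample byte strings stand in for the contents of a .d file.

```python
def first_malformed_offset(data: bytes):
    """Return the offset of the first malformed UTF-8 byte, or None if data is valid."""
    try:
        data.decode("utf-8")  # strict decoding rejects any malformed sequence
    except UnicodeDecodeError as err:
        return err.start
    return None


print(first_malformed_offset(b"\xc3\x82"))    # None (valid two-byte sequence)
print(first_malformed_offset(b"\xc2"))        # 0 (lead byte with no continuation byte)
print(first_malformed_offset(b"abc\xc0def"))  # 3 (C0 is never valid in UTF-8)
```

To check a real file, its contents could be read in binary mode, for example with open(path, "rb"), and passed to the same function.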
FIX:
The .d file must be dumped in UTF-8 format in the first place.
It is not enough to change only the cpstream value that the dump file was created with, for example from cpstream=1248 to cpstream=UTF-8. The data bytes themselves remain in the original code page, so the result is malformed character representation.
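The difference between relabeling and actually converting the data can be illustrated with a short Python sketch. It assumes, purely for illustration, that the original data was written in code page 1252; the variable names are hypothetical, and this is not the Progress dump procedure itself.

```python
# "Â" (U+00C2) as written by a single-byte code page (cp1252 assumed here).
source_bytes = "\u00c2".encode("cp1252")  # b"\xc2"

# Relabeling: the bytes are untouched and are merely claimed to be UTF-8.
try:
    source_bytes.decode("utf-8")
    relabel_is_valid = True
except UnicodeDecodeError:
    relabel_is_valid = False

# Converting: decode with the real source code page, then encode as UTF-8.
utf8_bytes = source_bytes.decode("cp1252").encode("utf-8")

print(relabel_is_valid)  # False
print(utf8_bytes)        # b'\xc3\x82'
```

Only the converted bytes match the two-byte UTF-8 form, which is why the data has to be produced in UTF-8 rather than relabeled afterwards.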