Kbase P139227: SUBSTRING function fails to return the expected result with single byte characters in a UTF-8 client
Author: Progress Software Corporation - Progress
Access: Public
Published: 29/12/2008
Status: Unverified
SYMPTOM(s):
SUBSTRING function returns unexpected results in a UTF-8 client
SUBSTRING is applied, in a UTF-8 session, to a string of bytes encoded in another code page.
The function returns the expected result for some bytes, but not for others. For example:
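DEFINE VARIABLE c1 AS CHARACTER NO-UNDO.
DEFINE VARIABLE c2 AS CHARACTER NO-UNDO.
/* Each string is one code page 1250 byte followed by "A" */
c1 = CHR(191, "1250", "1250") + "A".
c2 = CHR(192, "1250", "1250") + "A".
/* Take the first character of each string and show its 1250 value */
MESSAGE
    "works : " ASC(SUBSTRING(c1, 1, 1), "1250", "1250") SKIP
    "fails : " ASC(SUBSTRING(c2, 1, 1), "1250", "1250") SKIP
    VIEW-AS ALERT-BOX.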
The bytes in the string are single-byte values that represent characters in a single-byte code page.
FACT(s) (Environment):
All Supported Operating Systems
OpenEdge 10.x
CAUSE:
SUBSTRING works in the code page specified by -cpinternal and does not perform any code page conversion. Every byte is read as if it were encoded in the -cpinternal code page, and the resulting substring depends on where the function determines that each character begins and ends.
Single-byte values above 127 are not valid stand-alone characters in UTF-8, so using them can cause SUBSTRING to misjudge where characters begin and end. Where possible, avoid using such bytes without clearly indicating the code page in which they represent characters.
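One way to observe the mismatch is to compare the byte length of each string with the character length that the session reports. The following is a minimal sketch and not part of the original article; it assumes a session started with -cpinternal UTF-8 and reuses c1 and c2 from the symptom example. A plausible explanation for any difference between the two strings is that byte 191 (0xBF) has the bit pattern of a UTF-8 continuation byte, while byte 192 (0xC0) has the bit pattern of a lead byte for a two-byte sequence, so the "A" that follows may be treated as part of the same character. No output is shown because it depends on the session.

```abl
DEFINE VARIABLE c1 AS CHARACTER NO-UNDO.
DEFINE VARIABLE c2 AS CHARACTER NO-UNDO.

c1 = CHR(191, "1250", "1250") + "A".
c2 = CHR(192, "1250", "1250") + "A".

/* "RAW" counts bytes; "CHARACTER" counts characters as read in -cpinternal */
MESSAGE
    "c1 bytes : " LENGTH(c1, "RAW") " characters : " LENGTH(c1, "CHARACTER") SKIP
    "c2 bytes : " LENGTH(c2, "RAW") " characters : " LENGTH(c2, "CHARACTER") SKIP
    VIEW-AS ALERT-BOX.
```

If the byte count and the character count differ for a string, SUBSTRING with the default type is likely to return more than one byte for a single position in that string.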
FIX:
To resolve this problem, specify "RAW" as the type argument (the fourth parameter) of the SUBSTRING function, so that position and length are measured in bytes rather than characters. For example:
DEFINE VARIABLE c1 AS CHARACTER NO-UNDO.
DEFINE VARIABLE c2 AS CHARACTER NO-UNDO.
/* Each string is one code page 1250 byte followed by "A" */
c1 = CHR(191, "1250", "1250") + "A".
c2 = CHR(192, "1250", "1250") + "A".
/* The third line uses "RAW" so that SUBSTRING returns exactly one byte */
MESSAGE
    "works : " ASC(SUBSTRING(c1, 1, 1), "1250", "1250") SKIP
    "fails : " ASC(SUBSTRING(c2, 1, 1), "1250", "1250") SKIP
    "works : " ASC(SUBSTRING(c2, 1, 1, "RAW"), "1250", "1250") SKIP
    VIEW-AS ALERT-BOX.
Alternatively, encode the string to be used with SUBSTRING in the code page of -cpinternal.
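This alternative can be sketched as follows. It is an illustration under assumptions rather than part of the original article: it assumes the session runs with -cpinternal UTF-8, that the source bytes are in code page 1250, and that the session's conversion tables support conversion between 1250 and UTF-8. The variable names cBytes and cText are hypothetical.

```abl
DEFINE VARIABLE cBytes AS CHARACTER NO-UNDO. /* hypothetical: raw 1250 bytes */
DEFINE VARIABLE cText  AS CHARACTER NO-UNDO. /* hypothetical: same text in -cpinternal */

/* Bytes as they would arrive from a code page 1250 source */
cBytes = CHR(192, "1250", "1250") + "A".

/* Convert from 1250 to the session's internal code page before slicing */
cText = CODEPAGE-CONVERT(cBytes, SESSION:CPINTERNAL, "1250").

/* SUBSTRING now operates on text encoded in -cpinternal; */
/* ASC maps the first character back to its code page 1250 value */
MESSAGE
    ASC(SUBSTRING(cText, 1, 1), "1250", SESSION:CPINTERNAL)
    VIEW-AS ALERT-BOX.
```

Converting once at the point where the data enters the session means later string functions can use their default character-based behavior, whereas the "RAW" approach has to be repeated on every call that touches the unconverted bytes.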