Kbase P126322: How to convert to Unicode UCS2 values using Progress 4GL
Author   Progress Software Corporation - Progress
Access   Public
Published   10/12/2007
Status: Unverified

GOAL:

How to convert to Unicode UCS2 values using Progress 4GL

GOAL:

How to get a Unicode value (not UTF-8) for a character using 4GL?

FACT(s) (Environment):

All Supported Operating Systems
OpenEdge 10.x

FIX:

This can be done using the UTF-16 support that Progress provides. UTF-16 is almost identical to Unicode UCS2, with the addition that it supports surrogate pairs. Every code point from U+0000 through U+FFFF outside the surrogate range (D800 to DFFF hex) is represented by the same 16-bit value in both Unicode UCS2 and UTF-16.

Therefore, the Unicode UCS2 value of a character can be read directly from its UTF-16 value, provided the character is not encoded as a surrogate pair.
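
To make the surrogate restriction concrete: a 16-bit UTF-16 value in the range D800 to DFFF hex (55296 to 57343 decimal) is one half of a surrogate pair and is not a UCS2 character on its own. A minimal sketch of that check follows. The function name isSurrogate is hypothetical and is not part of Progress:

FUNCTION isSurrogate RETURNS LOGICAL (INPUT iCodeUnit AS INTEGER):
    /* D800 hex = 55296, DFFF hex = 57343 */
    RETURN iCodeUnit >= 55296 AND iCodeUnit <= 57343.
END FUNCTION.

/* 8364 (20AC hex, the Euro sign) lies outside the surrogate range */
MESSAGE "Is 8364 a surrogate? " isSurrogate(8364)
VIEW-AS ALERT-BOX.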

As an example, the following code takes a Euro '€' character in code page 1252 and converts it to UTF-16 to obtain the Unicode UCS2 value. It then converts this UTF-16 value back to a single-byte 1252 value, via UTF-8:


DEFINE VARIABLE c0 AS CHARACTER NO-UNDO.
DEFINE VARIABLE c1 AS CHARACTER NO-UNDO.
DEFINE VARIABLE c2 AS CHARACTER NO-UNDO.
DEFINE VARIABLE c3 AS CHARACTER NO-UNDO.

/* Euro in 1252 */
c0 = CHR(128,"1252","1252").

/* Convert Euro in 1252 (Hex 80, 128 Dec) to Unicode/UTF-16 20AC */
c1 = CODEPAGE-CONVERT(c0,"UTF-16","1252").

/* Convert 20AC to UTF-8 encoding for Euro E282AC */
c2 = CODEPAGE-CONVERT(c1,"UTF-8", "UTF-16").

/* Convert E282AC back to 128 1252 Euro value */
c3 = CODEPAGE-CONVERT(c2,"1252", "UTF-8").

MESSAGE
"Euro (1252) : " c0 SKIP
"UTF-16 : (2 bytes - one is a space)" c1 SKIP
"Decimal UTF-16 Byte1 (Hex AC): " ASC(SUBSTRING(c1,1,1)) SUBSTRING(c1,1,1) SKIP
"Decimal UTF-16 Byte2 (Hex 20): " ASC(SUBSTRING(c1,2,1)) SUBSTRING(c1,2,1) SKIP
"UTF-8 (3 bytes) : " c2 SKIP
"1252 (1 byte) : " c3 SKIP
VIEW-AS ALERT-BOX.
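
The two UTF-16 bytes shown by the MESSAGE statement can also be combined into a single decimal Unicode value. The following is a sketch only. It assumes, as the labels in the example above indicate, that the low-order byte comes first and that no byte-order mark is present. It also assumes a single-byte -cpinternal such as 1252, so that SUBSTRING and ASC operate on individual bytes. The variable names are illustrative:

DEFINE VARIABLE cUtf16     AS CHARACTER NO-UNDO.
DEFINE VARIABLE iLowByte   AS INTEGER   NO-UNDO.
DEFINE VARIABLE iHighByte  AS INTEGER   NO-UNDO.
DEFINE VARIABLE iCodePoint AS INTEGER   NO-UNDO.

/* Euro in 1252 converted to UTF-16, as in the example above */
cUtf16 = CODEPAGE-CONVERT(CHR(128,"1252","1252"),"UTF-16","1252").

/* Assumption: byte 1 is the low-order byte, byte 2 the high-order byte */
iLowByte  = ASC(SUBSTRING(cUtf16,1,1)).
iHighByte = ASC(SUBSTRING(cUtf16,2,1)).

/* Combine the two bytes into one 16-bit value */
iCodePoint = iHighByte * 256 + iLowByte.

MESSAGE
"Unicode UCS2 value (decimal): " iCodePoint
VIEW-AS ALERT-BOX.

If the bytes are AC and 20 hex (172 and 32 decimal), the result is 32 * 256 + 172 = 8364, which is 20AC hex, the Unicode value of the Euro sign. This approach applies only to characters that are not encoded as surrogate pairs.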