Consultor Eletrônico



Kbase P157693: Characters are converted incorrectly when writing to MS SQL Server via a UTF-8 schema holder
Autor   Progress Software Corporation - Progress
Acesso   Público
Publicação   21/12/2009
Status: Unverified

SYMPTOM(s):

Characters are converted incorrectly when writing to MS SQL Server via a UTF-8 schema holder

Wrong data appears in MS SQL Server database when written via a UTF-8 schema holder

Example:

ABL session enters éèàïö
MS SQL Server database stores éèà ïö

FACT(s) (Environment):

Using UTF-8 schema holder
Fields in SQL Server database table are configured as varchar
OpenEdge 10.2A
Windows
MS SQL DataServer

CAUSE:


This is expected behaviour. Progress is sending data to MS SQL Server in UTF-8 format which is multi-byte (i.e. each character is 2+ bytes in length). A varchar type field does not support Unicode (multi-byte) characters.

[From SQL Server documentation]
varchar [ ( n | max ) ]
Variable-length, non-Unicode character data. n can be a value from 1 through 8,000. max indicates that the maximum storage size is 2^31-1 bytes. The storage size is the actual length of data entered + 2 bytes. The data entered can be 0 characters in length. The SQL-2003 synonyms for varchar are char varying or character varying.

nvarchar [ ( n | max ) ]
Variable-length Unicode character data. n can be a value from 1 through 4,000. max indicates that the maximum storage size is 2^31-1 bytes. The storage size, in bytes, is two times the number of characters entered + 2 bytes. The data entered can be 0 characters in length. The SQL-2003 synonyms for nvarchar are national char varying and national character varying.
[End]

When trying to insert a multi-type character into a single-byte field, the multi-byte character is split; each byte becomes a separate character. Hence "é" becomes something like "é".

FIX:

Convert fields of type varchar to nvarchar if using a UTF-8 schema holder