Consultor Eletrônico



Kbase P163321: Norwegian characters do not sort correctly with UTF-8 database
Author   Progress Software Corporation - Progress
Access   Public
Published   4/7/2010
Status: Unverified

SYMPTOM(s):

Norwegian characters do not sort correctly with UTF-8 database

The data is returned in this order:

AA
Duck
Smith
Walker
Å
Ås
Æ
Ær
Ø
Ødegaard

This is a basic ASCII-style sort (by character code) and is not correct for Norwegian. Besides the general sort sequence being wrong, Norwegian treats AA as equivalent to Å (and aa as equivalent to å), so the two should sort together.
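The ordering above can be reproduced outside the database. The following is a minimal Python sketch, not OpenEdge code, and it assumes the basic collation behaves like a plain character-code comparison for these values, which is consistent with the listing above. The variable names are illustrative only.

```python
# Sample values taken from the listing in this article.
names = ["Duck", "Smith", "Walker", "Æ", "Ær", "Ø", "Ødegaard", "Å", "AA", "Ås"]

# Python compares strings by Unicode code point, so a plain sorted()
# puts the ASCII letters first and Å (U+00C5), Æ (U+00C6), Ø (U+00D8) after them.
code_point_order = sorted(names)

# Comparing the UTF-8 encoded bytes gives the same order as comparing code points.
byte_order = sorted(names, key=lambda s: s.encode("utf-8"))

for name in code_point_order:
    print(name)
```

Both sorts place AA first and order the Norwegian letters as Å, Æ, Ø, which is the symptom described here rather than the Norwegian alphabetical order.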

The 4GL client is running a query that is being resolved by an index on the database, but the results are not displayed in the correct order.

FACT(s) (Environment):

The correct Norwegian sort order for this data is:

Duck
Smith
Walker
Æ
Ær
Ø
Ødegaard
Å
AA
Ås
The database uses the UTF-8 code page and basic collation.
The 4GL client session uses -cpinternal UTF-8 -cpcoll basic.
In Norwegian, AA is equivalent to Å, so it should not be sorted as the ASCII string "AA".
All Supported Operating Systems
OpenEdge 10.x
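The expected order listed above can be illustrated with a small hand-written sort key. This is a hedged Python sketch and not the ICU implementation: it assumes the Norwegian alphabet ends in æ, ø, å, treats the digraph "aa" as å, and breaks ties by placing a literal å before the digraph, an assumption chosen to match the order shown in this article. The names NORWEGIAN_ALPHABET, RANK, and norwegian_key are hypothetical.

```python
# Simplified Norwegian alphabet: a-z followed by æ, ø, å.
NORWEGIAN_ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"
RANK = {ch: i for i, ch in enumerate(NORWEGIAN_ALPHABET)}


def norwegian_key(text):
    """Return a sort key approximating Norwegian ordering (illustrative only)."""
    lowered = text.lower()
    primary = []    # position of each letter in the Norwegian alphabet
    secondary = []  # 0 for a literal letter, 1 when "aa" stood in for å
    i = 0
    while i < len(lowered):
        if lowered.startswith("aa", i):
            primary.append(RANK["å"])
            secondary.append(1)
            i += 2
        else:
            ch = lowered[i]
            # Characters outside the alphabet sort after å in this sketch.
            primary.append(RANK.get(ch, len(RANK) + ord(ch)))
            secondary.append(0)
            i += 1
    return (primary, secondary)


names = ["AA", "Duck", "Smith", "Walker", "Å", "Ås", "Æ", "Ær", "Ø", "Ødegaard"]
norwegian_order = sorted(names, key=norwegian_key)

for name in norwegian_order:
    print(name)
```

A real ICU collation handles case, accents, and non-letter characters in far more detail; the sketch only shows why AA belongs next to Å at the end of the list instead of at the top.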

CAUSE:

The database is using the wrong collation: the basic collation does not apply Norwegian sorting rules.

FIX:

For Norwegian sorting of data, the correct collation is one of the ICU collations: ICU-nn (Norwegian Nynorsk) or ICU-nb (Norwegian Bokmål).

To correct the problem, load the ICU-nn or ICU-nb .df file from the prolang directory into the database, then rebuild the indexes. After that, start the database brokers and client sessions with -cpinternal UTF-8 and -cpcoll ICU-nb (or -cpcoll ICU-nn).