Consultor Eletrônico



Kbase 15627: Diagnosing codepage problems with Oracle DataServer
Autor   Progress Software Corporation - Progress
Acesso   Público
Publicação   11/21/2006
Status: Verified

GOAL:

Tips to use when diagnosing code page problems related to the ORACLE DataServer.

GOAL:

Debugging tips when using different code pages with ORACLE DataServer

GOAL:

ORACLE DataServer and Code Pages.

FACT(s) (Environment):

Oracle DataServer

FIX:

When troubleshooting extended alphabet related problems with ORACLE or other DataServer product, you must remember that many of the Progress standard features do not apply and the non-Progress database product has its own ways to accomplish codepages and linguistic collations.

Progress needs to know what code page the non-Progress database is set to. This is defined in the DataServer code page that can be set in the Progress Dictionary/Data Administration.

ORACLE handles code page conversions between database and client but the behavior in cases where a conversion is not possible is very different from Progress. If you try to do an impossible conversion, ORACLE places a question mark (?) in place of that character or converts the character to "closest equivalent" in the target code page. One example is trying to display data from
Database that is using "we8dec" and the client is using 7-bit USASCII:
In this case aumlaut displays as question mark and ecircumflex becomes normal 'e'.

Another situation is when you have a client on Windows connecting to an ORACLE database on UNIX. If the ORACLE client code page is WE8ISO8859P1, and the UNIX database is US7ASCII, you might encounter problem when using extended characters on queries.

Sorting in ORACLE is handled in 2 basic ways, language specific or binary. Language specific sorting sequences are basically equivalent to Progress -cpcoll settings and they correspond to common sorting order of certain language or language group.
Binary sort sorts characters by their binary (ASCII) values. When connecting from Progress to ORACLE database, it is the ORACLE sorting that dictates in what order records appear.

What Should be done to find out what is wrong ?

If characters are displayed incorrectly, there are some things you should check. The easiest thing to do is to figure out the data path from the database to Progress client and try to pinpoint the error.

To get the code page related information from the ORACLE database, do the following with ORACLE SQL*DBA (The $ represents UNIX shell prompt):

$ unset NLS_LANG
$ sqldba lmode=y
SQLDBA> connect internal ;
SQLDBA> select * from v$nls_parameters ;

This will list the NLS parameters of the database. The two most interesting are NLS_CHARACTERSET and NLS_SORT; they tell the code page and sorting sequence of the database. NLS_LANG environment variable can be used to override these settings and it may be necessary to unset it and other NLS variables.

ORACLE clients (Including Progress DataServer) use environment variables to override the NLS settings of the database. The most important of those is NLS_LANG. The syntax is NLS_LANG=language_territory.code page

For German, for example, it might be NLS_LANG = german_germany.we8dec".

Other interesting environment variables are NLS_SORT and NLS_DATE_FORMAT. Generally the names of the variables begin with "NLS_". The variables specify how the OCI (Oracle Call Interface) layer will convert the data coming from the database and what is actually "seen" in the application. The OCI code page is what should actually be in the Progress DataServer code page.

Because Progress defaults to case-insensitive sorting, the DataServer has to create uppercase equivalents of indexed strings that are in the database. These strings generally are garbage if the data is loaded with wrong code page settings and the changing of DataServer code page setting does not help in this case, the data has to be reloaded. Symptoms are that records are found using ORACLE tools but Progress does not find anything. To verify the situation, use ORACLE tools to view the uppercase field (U##field_name) and see if the characters stored there are uppercase equivalents of the characters in the actual data field.

In the case of the re.mote DataServer it is generally easiest to set the internal code page of the DataServer executable to be the same one OCI uses. The Progress client converts between database server and its own internal code page. For many languages ORACLE recommends using "we8dec" code page (DEC Multinational). This code page is quite close to iso8859-1 and in most cases the setup works if OCI uses "we8dec" and Progress DataServer code page is set to "iso8859-1". OCI can also convert between "we8dec" and "we8iso8859p1" (Oracle's name for iso8859-1) and it is possible to set OCI to use iso8859-1 when the database is using "we8dec".

One special case of consideration is when the ORACLE database is set to the default "us7ascii" code page and no ORACLE NLS parameters are set on the DataServer. In this setup everything seems to work. You can enter data into database and if you display the data the characters are correct.

Also SQL*Plus shows that the uppercased fields look OK. However, you are not able to find these records using extended characters as index key values from 4GL. This behavior is the result of how ORACLE SQL upper() function behaves when you have the code page set to "us7ascii". In this case you should have only 7-bit USASCII characters in your data and the SQL functions might have unpredictable results with extended characters.

In order to make this setup work you have to use 8 bit code page for the ORACLE database. If you only change the OCI to use "we8iso8859p1", ORACLE translates all extended alphabet to either question marks or equivalent USASCII characters efficiently corrupting data in some cases. This really is a situation to avoid because the only way to correct it is to re-create the database with correct code page and reload the data making sure it is loaded correctly into the database.
.