Kbase 16097: Diagnosing codepage related problems
Author: Progress Software Corporation - Progress
Access: Public
Published: 5/10/1998
DIAGNOSING CODEPAGE RELATED PROBLEMS
====================================
INTRODUCTION
============
This technical support knowledgebase entry describes
the basic checks to perform when diagnosing codepage
related problems.
WHY DO YOU NEED TO KNOW THIS?
=============================
Codepage related problems can be extremely complex to
diagnose because there are many possible points of failure.
If you cannot view non-US-ASCII characters correctly, you cannot
assume that finding and correcting one error will make your
display correct. In many cases two or more things in the setup
are unknown or wrong, and they interfere with each other. Before
starting this investigation you should be familiar with Progress
character processing as documented in the manuals (V6:
Programming Handbook, chapter 2; V7 and V8: System Administration
Guide, appendix A). You should also know what codepages your
devices (terminals, printers, etc.) use. To keep this notebook
reasonably small, no instructions are provided on how to correct
the problems you encounter. That information can be found in the
Progress documentation and in the documentation supplied with
the relevant third party products.
PROCEDURAL APPROACH
===================
As in all troubleshooting, try to narrow down the problem by
eliminating possible causes. If something has changed in the
system configuration, suspect that change first. If you only have
problems with printing and your terminal or GUI display is ok,
you should suspect the print settings. Note that this still does
not imply that the rest of the system configuration is correct.
When dealing with codepage problems you cannot assume that the
characters you see on screen are the characters actually stored.
There is always the possibility that the terminal you are using
to investigate the situation is itself misconfigured. In general,
check the actual numeric codes of characters whenever possible.
The instructions in this notebook assume that you have not
modified the protermcap or convmap.cp files. If you have
special versions of these files, you should, if possible, test
with the original files supplied with the product. This
is of course impossible if you are using a codepage or collation
that is not shipped with the Progress product. In that case
you should first check the validity of your modifications.
STARTUP PARAMETERS
==================
Verify that the Progress session startup parameters on the
client and server are valid. If you are using special
startup scripts, make sure all parameters are passed correctly
to the executables. Also check for conflicting parameters in
$DLC/startup.pf, which is read by all executables.
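As a rough aid for the conflict check described above, the following
Python sketch (run outside Progress) collects the codepage parameters
from the text of two parameter files so that they can be compared. It
assumes a parameter-file layout of whitespace-separated "-name value"
pairs with "#" starting a comment. The function names cp_params and
conflicts and the sample file contents are hypothetical, and only
parameters whose names begin with -cp are examined.

```python
def cp_params(pf_text):
    """Return the -cp* parameters found in parameter-file text."""
    found = {}
    for line in pf_text.splitlines():
        tokens = line.split("#", 1)[0].split()   # drop a trailing comment
        for i, tok in enumerate(tokens[:-1]):
            if tok.startswith("-cp"):
                found[tok] = tokens[i + 1]       # the value follows the name
    return found


def conflicts(first, second):
    """Return parameters that are set to different values in both mappings."""
    return {
        name: (first[name], second[name])
        for name in sorted(first.keys() & second.keys())
        if first[name] != second[name]
    }


# Hypothetical contents of $DLC/startup.pf and of a client parameter file.
startup_pf = "-cpinternal iso8859-1 -cpstream ibm850  # site defaults\n"
client_pf = "-db sports\n-cpinternal ibm850\n"

print(conflicts(cp_params(startup_pf), cp_params(client_pf)))
```

The sketch only reports differing values. Deciding which setting
actually takes effect is left to the Progress documentation.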
DATABASE
========
With any codepage or collation problem, one of the first things
to check is the database itself. If your database contains
incorrect or corrupted data, it is virtually impossible to fix
the problem by changing other settings. Start a client session
with the parameters "-cpstream undefined -cpinternal undefined",
connect to your database and run the following tests.
Note: Running with -cpinternal undefined is crucial to defeat
any automatic conversions. This procedure works with only one
database connected. If you have multiple databases, check each
one separately.
    /* Show the session and database codepage settings. */
    find first _db no-lock.
    display session:charset format "x(10)" label "Internal codepage"
            session:stream format "x(10)" label "Stream codepage"
            _db._db-xl-name format "x(10)" label "Db codepage"
            _db._db-coll-name format "x(10)" label "Db collation".
Verify that the values displayed for the database codepage and
collation are correct. If "undefined" is not displayed for the
internal and stream codepages, check your Progress startup
routines. Next, find some extended characters and display their
numeric character codes. Pick a field that contains extended
characters and use the following procedure to display the value
(this example, run against the customer.name field in the
isports database, gives the value 228).
    /* Show the name and the numeric code of its 11th character. */
    find customer where cust-num = 7 no-lock.
    display customer.name
            asc(substring(customer.name,11,1)).
Check several fields in different tables to make sure that all
tables use the same codepage and that all characters that should
be the same really are the same. The values should match the
values listed in the codepage table for the database codepage.
Note that this procedure displays the values in decimal, whereas
most codepage tables are given in hexadecimal.
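Because ASC() reports decimal values while codepage tables are
usually hexadecimal, a quick conversion helps. The Python sketch
below runs outside Progress and is only an illustration: it converts
a decimal code to hexadecimal and shows which character that byte
represents in two codepages. The function name describe_code is
hypothetical, and the Python codec names "latin-1" and "cp850" are
assumed here to stand in for the Progress codepages iso8859-1 and
ibm850.

```python
def describe_code(code, codecs=("latin-1", "cp850")):
    """Map a decimal character code to hex and to its character per codec."""
    raw = bytes([code])
    chars = {name: raw.decode(name) for name in codecs}
    return format(code, "02X"), chars


hex_code, chars = describe_code(228)   # 228 is the value from the example above
print(hex_code)                        # hexadecimal form to look up in a table
for name, char in chars.items():
    print(name, repr(char))
```

The same byte decodes to a different letter in each of the two
codepages, which is why the numeric code must be compared against
the table for the codepage the database is supposed to use.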
INTERNAL CODEPAGE AND COLLATION
===============================
The internal codepage is used for all 4GL comparisons and
sorting that do not use a database index. This codepage and
collation are independent of the database and can be different.
GUI input/output uses the internal codepage and cannot be
converted. If in any doubt, override the defaults by explicitly
specifying the -cpinternal and -cpcoll parameters.
Check for code that hard-codes certain codepage values in
comparisons or assumes that comparisons are always in ASCII order.
A common mistake is to assume that CHR(255) > "Z". This works
with the ibm850 codepage, but in iso8859-1 CHR(255) is y with
diaeresis, which (in the "basic" collation) sorts like y. In
general the numeric code of a character does not tell you
anything about the collating weight of that character. Collating
weights are language-dependent, and one codepage can have several
collating sequences. To determine the collating weight of a
character (i.e. how it sorts) in a particular collation, refer
to the $DLC/prolang/convmap.dat file.
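The difference between numeric order and collating order can be
illustrated outside Progress with a short Python sketch. The key
function sample_weight is a toy, hypothetical stand-in that ignores
accents and case. It does not reproduce the weights in convmap.dat
or any real Progress collation, and only shows why a test such as
CHR(255) > "Z" is unreliable.

```python
import unicodedata


def sample_weight(text):
    """Toy collation key: ignore accents and case first, then break ties."""
    decomposed = unicodedata.normalize("NFD", text)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return (base.casefold(), text)


y_diaeresis = bytes([255]).decode("latin-1")   # CHR(255) in iso8859-1
words = ["Z", y_diaeresis, "y", "a"]

by_code = sorted(words)                        # plain numeric-code order
by_weight = sorted(words, key=sample_weight)   # accent-insensitive order

print(by_code)
print(by_weight)
print(y_diaeresis > "Z", sample_weight(y_diaeresis) > sample_weight("Z"))
```

By numeric code the y with diaeresis sorts after "Z", but under the
toy weights it sorts next to "y" and before "Z", so the two
comparisons on the last line disagree.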
With GUI screen codepage issues, be sure to check what codepage
your windowing system is actually using. This must match the
Progress -cpinternal setting. Also try different fonts
and check that the font encoding is correct. Many X11/Motif
distributions contain fonts in different codepages, and not all
MS-Windows fonts contain the same characters.
TERMINAL AND STREAM I/O
=======================
The term stream I/O applies to almost anything that is
not GUI screen or database I/O (files, printers, etc.). Character
terminal I/O is a subcategory of stream I/O as far as codepages
are concerned. Procedure source code is read by the compiler and
converted to the internal codepage according to the -cpstream
setting.
If you have codepage problems with terminals, check the codepage
the terminal is actually using. Many terminals and emulators
have numerous language-specific setup items, and their usage
and impact are rarely documented well in the manuals. In most
cases these can be overridden with software commands (escape
sequences), so it is necessary to check what the settings are
while you are running Progress. It may well turn out that your
login scripts or the Progress startup terminal initialization
change these settings. Also, some operating systems have
terminal line codepage mappings that can mislead you if you are
not aware of them. SCO Unix is one of those operating systems.
To be sure that the characters you enter at the keyboard and the
characters displayed are correct, use the following procedure
(again, start Progress with -cpinternal undefined -cpstream
undefined).
    /* Enter one character and show its numeric code. */
    define variable v-ch as character format "x".
    update v-ch.
    display asc(v-ch).
When you run this procedure, enter one extended character at a
time and compare its numeric value to a codepage table. It is
good to check this for every character, both upper- and lowercase.
The numeric values should correspond to the values listed in the
codepage table. If they don't match, you can either try to find
a codepage that they do match, or try to find which piece of
your system is doing a conversion "behind your back". Some
(particularly older) hardware and software implementations are
notorious for stripping the 8th bit of a data stream,
effectively corrupting the data.
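Both checks described above (finding a codepage that the observed
value does match, and spotting a stripped 8th bit) can be sketched in
Python outside Progress. The function names, the candidate list and
the Python codec names are assumptions made for this illustration,
and the observed values 228, 132 and 100 are sample inputs, not
results from a real terminal.

```python
CANDIDATES = ("latin-1", "cp850", "cp437", "cp1252")


def matching_codepages(code, expected_char, candidates=CANDIDATES):
    """Return the candidate codecs in which byte `code` is `expected_char`."""
    matches = []
    for name in candidates:
        try:
            if bytes([code]).decode(name) == expected_char:
                matches.append(name)
        except UnicodeDecodeError:
            pass   # the byte is undefined in this codec
    return matches


def looks_bit_stripped(code, expected_char, codec):
    """True if `code` equals the expected byte with its 8th bit cleared."""
    expected = expected_char.encode(codec)[0]
    return expected > 127 and code == (expected & 0x7F)


a_umlaut = "\u00e4"   # a with diaeresis, the character that was typed
print(matching_codepages(228, a_umlaut))
print(matching_codepages(132, a_umlaut))
print(looks_bit_stripped(100, a_umlaut, "latin-1"))
```

A value of 100 for this character is the letter "d", which is what
remains of code 228 once the high bit has been cleared somewhere
along the line.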
Input and output file codepages can easily be checked using
OS tools. In Unix, use "od -x" to view hexadecimal character
codes. Note that "od -x" prints 16-bit words, whose byte order
depends on the machine; "od -t x1" prints one byte at a time.
In DOS/Windows you can easily check whether a file is in the
correct DOS or Windows codepage: just load it into DOS Edit or
Windows Notepad. You can use any editor you wish, but make sure
it does not do codepage conversion when it loads the file. The
codepage of these editors of course depends on your DOS/Windows
international settings, which you should know in advance.
Printed output in DOS/Windows must generally be in the OS
codepage; in Unix you can determine the printer codepage by
printing a test file with normal OS tools.
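Where no suitable OS tool is at hand, the same byte-level check can
be sketched in Python. This is a minimal illustration, not a
replacement for od: it lists the offset and hexadecimal code of every
byte above 127, reading the file in binary mode so that no codepage
conversion takes place. The function names and the sample data are
hypothetical.

```python
def high_bytes(data):
    """List (offset, hex code) for every byte above 127 in `data`."""
    return [(offset, format(byte, "02X"))
            for offset, byte in enumerate(data) if byte > 127]


def high_bytes_in_file(path):
    """Read a file in binary mode (no codepage conversion) and scan it."""
    with open(path, "rb") as handle:
        return high_bytes(handle.read())


sample = b"M\xe4kinen"   # hypothetical data containing one byte with code E4
print(high_bytes(sample))
```

The reported codes can then be looked up in the table of the codepage
the file is supposed to be in.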
REFERENCES TO WRITTEN DOCUMENTATION
===================================
Progress V7/V8 System Administration Guide, Appendix A
Developing International Software (Nadine Kano, Microsoft Press)
System software and hardware manuals
--
09.09.1996
Progress Software Technical Support Note # 16097