Kbase P142767: 4GL/ABL: How to get a listing of all characters in a text or data file?
Author: Progress Software Corporation - Progress
Access: Public
Published: 11/16/2009
Status: Verified
GOAL:
4GL/ABL: How to get a listing of all characters in a text or data file?
GOAL:
How to construct a list of all undesirable characters in a text or data (.d) file?
GOAL:
How to construct a character frequency table for all characters used in a file?
GOAL:
How to detect special characters in a text or data (.d) file?
GOAL:
How to remove a set of given specific characters from a text or data (.d) file?
GOAL:
How to strip a set of control or unprintable characters from a text or data (.d) file?
FACT(s) (Environment):
All Supported Operating Systems
Progress 9.x
OpenEdge 10.x
FIX:
The following procedure generates a report on the characters used in a file. For each distinct byte value found, the report lists the character, its ASCII code, its Progress key LABEL and key FUNCTION, and its frequency. Inspect the generated FileCharSetReport.txt report visually to decide which characters in the file are undesirable. To remove the undesirable characters from the file, see solution P142684, "4GL/ABL: How to strip a control character from a text file?":
/* Define procedure variables */
DEFINE VARIABLE cSourceFileName AS CHARACTER NO-UNDO.
DEFINE VARIABLE mSourceMemptr AS MEMPTR NO-UNDO.
DEFINE VARIABLE iSourceCounter AS INTEGER NO-UNDO.
DEFINE VARIABLE iMemptrSize AS INTEGER NO-UNDO.
DEFINE VARIABLE iCurrentByte AS INTEGER NO-UNDO.
/* Define TEMP-TABLE to store the types of characters encountered in the file */
DEFINE TEMP-TABLE ttFileCharacters NO-UNDO
FIELD iCharCode AS INTEGER
FIELD iFrequency AS INTEGER
FIELD cCharacter AS CHARACTER
FIELD cKeyLabel AS CHARACTER
FIELD cKeyFunction AS CHARACTER
INDEX iCode IS UNIQUE PRIMARY iCharCode.
/* Initialize the file value and pump the file contents into the source MEMPTR variable */
ASSIGN
cSourceFileName = "C:\OpenEdge\WRK101B\Customer.d".
RUN LoadFileToMemptr(INPUT cSourceFileName, OUTPUT mSourceMemptr).
/* Build TEMP-TABLE with information about all the characters in the file */
ASSIGN
iMemptrSize = GET-SIZE(mSourceMemptr).
DO iSourceCounter = 1 TO iMemptrSize:
iCurrentByte = GET-BYTE( mSourceMemptr , iSourceCounter).
FIND FIRST ttFileCharacters WHERE ttFileCharacters.iCharCode = iCurrentByte EXCLUSIVE-LOCK NO-ERROR.
IF AVAILABLE ttFileCharacters THEN DO:
ttFileCharacters.iFrequency = ttFileCharacters.iFrequency + 1.
NEXT.
END.
ELSE DO:
CREATE ttFileCharacters.
ASSIGN
ttFileCharacters.iCharCode = iCurrentByte
ttFileCharacters.iFrequency = 1
ttFileCharacters.cCharacter = CHR(iCurrentByte)
ttFileCharacters.cKeyLabel = KEYLABEL(iCurrentByte)
ttFileCharacters.cKeyFunction = KEYFUNCTION(iCurrentByte).
END.
END.
/* Output Report on the characters found in the file */
OUTPUT TO FileCharSetReport.txt.
PUT UNFORMATTED
"Code" AT 1
"Char" AT 10
"Frequency" AT 25
"Label" AT 40
"Function" AT 60
SKIP.
FOR EACH ttFileCharacters NO-LOCK:
PUT UNFORMATTED
ttFileCharacters.iCharCode AT 1
ttFileCharacters.cCharacter AT 10
ttFileCharacters.iFrequency AT 25
ttFileCharacters.cKeyLabel AT 40
ttFileCharacters.cKeyFunction AT 60
SKIP.
END.
OUTPUT CLOSE.
/* Release the memory allocated for the source MEMPTR */
SET-SIZE(mSourceMemptr) = 0.
PROCEDURE LoadFileToMemptr:
DEFINE INPUT PARAMETER ipcFileName AS CHARACTER NO-UNDO.
DEFINE OUTPUT PARAMETER opmMessage AS MEMPTR NO-UNDO.
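FILE-INFO:FILE-NAME = ipcFileName.
/* Leave the MEMPTR unallocated if the file cannot be found */
IF FILE-INFO:FULL-PATHNAME = ? THEN RETURN.
SET-SIZE(opmMessage) = FILE-INFO:FILE-SIZE.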
INPUT FROM VALUE(ipcFileName) BINARY NO-MAP NO-CONVERT.
IMPORT UNFORMATTED opmMessage.
INPUT CLOSE.
END PROCEDURE.
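Once the report has been used to pick the undesirable characters, they can be dropped with a similar byte-by-byte pass. The following is a minimal sketch only and is not the procedure from solution P142684. The file names and the list of byte codes in cStripSet are hypothetical placeholders, and COPY-LOB assumes OpenEdge 10.x (it is not available in Progress 9.x):

```abl
/* Sketch: copy a file while dropping a chosen set of byte values.      */
/* All names and values below are illustrative assumptions.             */
DEFINE VARIABLE cInFile   AS CHARACTER NO-UNDO INITIAL "Customer.d".        /* assumed input file  */
DEFINE VARIABLE cOutFile  AS CHARACTER NO-UNDO INITIAL "Customer.clean.d".  /* assumed output file */
DEFINE VARIABLE cStripSet AS CHARACTER NO-UNDO INITIAL "0,7,26".            /* assumed byte codes to drop */
DEFINE VARIABLE mIn       AS MEMPTR    NO-UNDO.
DEFINE VARIABLE mOut      AS MEMPTR    NO-UNDO.
DEFINE VARIABLE iIn       AS INTEGER   NO-UNDO.
DEFINE VARIABLE iOut      AS INTEGER   NO-UNDO.
DEFINE VARIABLE iByte     AS INTEGER   NO-UNDO.

/* Nothing to do if the file is missing or empty */
FILE-INFO:FILE-NAME = cInFile.
IF FILE-INFO:FULL-PATHNAME = ? OR FILE-INFO:FILE-SIZE = 0 THEN
    RETURN.

/* Load the raw bytes; the output can never be larger than the input */
COPY-LOB FROM FILE cInFile TO mIn.
SET-SIZE(mOut) = GET-SIZE(mIn).

/* Keep every byte whose code is not listed in cStripSet */
DO iIn = 1 TO GET-SIZE(mIn):
    iByte = GET-BYTE(mIn, iIn).
    IF LOOKUP(STRING(iByte), cStripSet) = 0 THEN DO:
        iOut = iOut + 1.
        PUT-BYTE(mOut, iOut) = iByte.
    END.
END.

/* Write only the bytes that were kept; no file is written if none remain */
IF iOut > 0 THEN
    COPY-LOB FROM mOut FOR iOut TO FILE cOutFile.

/* Release both buffers */
SET-SIZE(mIn)  = 0.
SET-SIZE(mOut) = 0.
```

Writing to a separate output file keeps the original intact, so the character report can be rerun against both files to confirm that only the intended byte values disappeared.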