Kbase P137059: Why is the compression ratio lower after the source code has been encrypted with the XCODE utility?
Author: Progress Software Corporation - Progress
Access: Public
Published: 11/7/2008
Status: Unverified
GOAL:
Why is the compression ratio lower after the source code has been encrypted with the XCODE utility?
FACT(s) (Environment):
Progress/OpenEdge Product Family
All Supported Operating Systems
FIX:
ZIP and RAR compression are based on dictionary coders, also known as substitution coders. This is a class of lossless data compression algorithms that operate by searching for matches between the text to be compressed and a set of strings held in a data structure (called the 'dictionary') maintained by the encoder. When the encoder finds such a match, it replaces the matched text with a reference to the string's position in the dictionary.
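The substitution idea can be illustrated with a minimal sketch. It uses LZ78, one of the simplest dictionary coders, chosen here only for brevity; it is not the exact algorithm used by ZIP or RAR, and the function names are hypothetical.

```python
def lz78_encode(text):
    """Toy LZ78 encoder: returns (dictionary_index, next_char) pairs.

    Index 0 stands for the empty phrase. Illustrative only; this is
    NOT the algorithm that ZIP or RAR actually implement.
    """
    dictionary = {}  # phrase -> 1-based index
    output = []
    phrase = ""
    for ch in text:
        candidate = phrase + ch
        if candidate in dictionary:
            # Keep extending the match while it is still in the dictionary.
            phrase = candidate
        else:
            # Emit a reference to the longest known phrase plus one new character.
            output.append((dictionary.get(phrase, 0), ch))
            dictionary[candidate] = len(dictionary) + 1
            phrase = ""
    if phrase:
        # The input ended in the middle of a known phrase; flush it.
        output.append((dictionary[phrase], ""))
    return output


def lz78_decode(pairs):
    """Rebuild the text by replaying the same dictionary construction."""
    phrases = [""]  # index 0 is the empty phrase
    pieces = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        pieces.append(phrase)
        phrases.append(phrase)
    return "".join(pieces)


if __name__ == "__main__":
    sample = "abababab"
    encoded = lz78_encode(sample)
    print(encoded)
    print(lz78_decode(encoded) == sample)
```

The more often phrases repeat, the longer the matches become and the fewer pairs the encoder has to emit for the same amount of input.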
The compression ratio for unencrypted text files is generally high because the same sequences of symbols are repeated many times. This is particularly true of source code, where the dictionary is built mainly from the recurring syntax of the language.
By contrast, encrypted files, like many binary files, generally obtain lower compression ratios, because the probability of finding repeated phrases is significantly reduced.
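The effect described above can be reproduced with a short experiment. The sketch below is only an approximation under stated assumptions: it uses Python's zlib module (DEFLATE, the method commonly used inside ZIP archives), a made-up ABL snippet repeated many times as the "source file", and a pseudo-random XOR keystream as a stand-in for encryption. The stand-in is not the XCODE algorithm and is not secure; it only mimics the loss of repeated phrases.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Original size divided by compressed size (higher means better compression)."""
    return len(data) / len(zlib.compress(data, 9))


def scramble(data: bytes, seed: int = 1234) -> bytes:
    """Stand-in for encryption: XOR every byte with a pseudo-random keystream.

    Assumption for illustration only. This is NOT what XCODE does and
    it must not be used as real encryption.
    """
    rng = random.Random(seed)
    return bytes(b ^ rng.randrange(256) for b in data)


# Hypothetical sample source: a small ABL loop repeated to simulate a larger file.
source = (
    "FOR EACH customer NO-LOCK:\n"
    "    DISPLAY customer.name.\n"
    "END.\n" * 200
).encode("ascii")

plain_ratio = compression_ratio(source)
scrambled_ratio = compression_ratio(scramble(source))

if __name__ == "__main__":
    print(f"plain text ratio: {plain_ratio:.1f}")
    print(f"scrambled ratio:  {scrambled_ratio:.2f}")
```

The plain text compresses to a small fraction of its size, while the scrambled bytes barely shrink at all, because the compressor no longer finds repeated phrases to reference.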