Subject: Re: ASCII to ASCII compression.
From: already5chosen (at) *nospam* yahoo.com (Michael S)
Newsgroups: comp.lang.c
Date: 17 Jun 2024, 11:10:12
Organization: A noiseless patient Spider
Message-ID: <20240617131012.00001518@yahoo.com>
References: 1 2 3 4
User-Agent: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)

On Mon, 17 Jun 2024 04:45:04 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

> On Thu, 6 Jun 2024 20:02:55 +0300, Michael S wrote:
>
>> Or, if we want to make a job just a little bit more interesting, we
>> can convert to base94, producing ~9% smaller size than base94 :-)
>
> You mean smaller than Base64?

Yes.

> I just spent some hours yesterday implementing the ASCII85 encoding
> in C code. This was something Adobe added to PostScript level 2; not
> sure if anybody else used it.
>
> By using only 85 instead of 94 printable characters, it could reserve
> some for special uses. For example, four bytes of zero are
> represented by a single “z” character. Also “~” is not used because
> it is part of the PostScript string delimiter for strings in ASCII85
> format.

I didn't look at the existing standards.

The nice thing about base 94 is that you can encode 9 arbitrary octets
into 11 isgraph() characters, and the code for encode/decode is simple
and reasonably fast even in the absence of 64-bit integer types.
For base 85 you encode 8 octets into 10 characters, which is even a
little simpler (or more than a little when 64-bit integers are
available), but about 2.3% less dense.
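
For illustration, here is a minimal sketch of the 9-octet to 11-character base 94 block coding described above, written without any 64-bit type. It is not the code from the thread. The function names b94_encode_block and b94_decode_block are hypothetical, and the digit order (most significant first) and the mapping of digit 0 to '!' through digit 93 to '~' are assumptions that rely on an ASCII execution character set. The block is treated as a 72-bit big-endian number held in a byte array, and every intermediate value stays below 65536, so plain unsigned is enough.

```c
/* Sketch only: names, digit order and alphabet are assumptions, not a standard. */
enum { B94_OCTETS = 9, B94_CHARS = 11, B94_BASE = 94 };

/* Encode 9 octets as 11 characters in '!'..'~', most significant digit first. */
void b94_encode_block(const unsigned char in[B94_OCTETS], char out[B94_CHARS])
{
    unsigned char n[B94_OCTETS];   /* working copy of the 72-bit number */
    int i, d;

    for (i = 0; i < B94_OCTETS; i++)
        n[i] = in[i];

    /* Repeated long division by 94; each pass yields the next lowest digit. */
    for (d = B94_CHARS - 1; d >= 0; d--) {
        unsigned rem = 0;
        for (i = 0; i < B94_OCTETS; i++) {
            unsigned cur = rem * 256u + n[i];   /* at most 93*256 + 255 */
            n[i] = (unsigned char)(cur / B94_BASE);
            rem = cur % B94_BASE;
        }
        out[d] = (char)('!' + rem);
    }
}

/* Decode 11 characters back to 9 octets.
   Returns 0 on success, -1 on a character outside '!'..'~' or on a value
   that does not fit in 72 bits (possible because 94^11 exceeds 2^72). */
int b94_decode_block(const char in[B94_CHARS], unsigned char out[B94_OCTETS])
{
    int i, d;

    for (i = 0; i < B94_OCTETS; i++)
        out[i] = 0;

    for (d = 0; d < B94_CHARS; d++) {
        unsigned carry;

        if (in[d] < '!' || in[d] > '~')
            return -1;
        carry = (unsigned)(in[d] - '!');

        /* out = out * 94 + digit, one octet at a time from the low end. */
        for (i = B94_OCTETS - 1; i >= 0; i--) {
            unsigned cur = out[i] * 94u + carry;   /* at most 255*94 + 93 */
            out[i] = (unsigned char)(cur & 0xFFu);
            carry = cur >> 8;
        }
        if (carry != 0)
            return -1;   /* overflowed 72 bits: not a valid block */
    }
    return 0;
}
```

A real encoder would also need a rule for a final block shorter than 9 octets, which this sketch leaves out.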