Re: ASCII to ASCII compression.

Subject: Re: ASCII to ASCII compression.
From: already5chosen (at) *nospam* yahoo.com (Michael S)
Newsgroups: comp.lang.c
Date: 10. Jun 2024, 12:29:30
Organization: A noiseless patient Spider
Message-ID : <20240610142930.00005c8a@yahoo.com>
References : 1 2 3
User-Agent : Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)
On Mon, 10 Jun 2024 07:12:57 +0100
Malcolm McLean <malcolm.arthur.mclean@gmail.com> wrote:

> On 10/06/2024 01:45, Lew Pitcher wrote:
>> On Thu, 06 Jun 2024 17:25:37 +0100, Malcolm McLean wrote:
>>>
>>> Not strictly a C programming question, but smart people will see
>>> the relevance to the topicality, which is portability.
>>>
>>> Is there a compression algorithm which converts human language
>>> ASCII text to compressed ASCII, preferably only "isgraph"
>>> characters?
>>>
>>> So "Mary had a little lamb, its fleece was white as snow".
>>>
>>> Would become
>>>
>>> QWE£$543GtT£$"||x|VVBB?
 
>> I'm afraid that you have conflicting requirements here. In effect,
>> you want to take an array of values (each within the range of
>> 0 to 127) and
>> a) make the array shorter ("compress it"), and
>> b) express the individual elements of this shorter array with
>>    a range of 96 values ("isgraph() characters")
>>
>> Because you reduce the number of values each result element
>> can carry, each result element can only express a fraction
>> (96/128ths) of the corresponding source element. Thus,
>> with the isgraph() requirement, the result will take /more/
>> elements to express the same data as the source did.
>>
>> However, you want /compression/, which implies that you want
>> the result to be smaller than the source. And, therein lies
>> the conflict.
>>
>> Can you help clarify this for me?
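
The arithmetic behind this objection can be checked mechanically. What
follows is only a rough sketch: the function names (count_isgraph,
chars_for_bits, bits_for_text) are made up for illustration, and the
average code lengths of 4.5 and 7.0 bits per character are hypothetical
figures, not measurements. It counts how many 7-bit codes isgraph()
accepts (94 in the "C" locale, slightly fewer than the 96 quoted above)
and shows how many 6-bit output characters a given number of compressed
bits needs.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Number of 7-bit codes for which isgraph() is true; 94 in the "C" locale. */
int count_isgraph(void)
{
    int n = 0;
    for (int c = 0; c < 128; c++)
        if (isgraph(c))
            n++;
    return n;
}

/* Output characters needed to carry nbits when each one carries 6 bits. */
size_t chars_for_bits(size_t nbits)
{
    return (nbits + 5) / 6;
}

/* Total compressed bits for nchars input characters, given a
   hypothetical average code length expressed in tenths of a bit. */
size_t bits_for_text(size_t nchars, unsigned avg_tenths)
{
    assert(avg_tenths > 0);
    return (nchars * avg_tenths + 9) / 10;
}

#ifdef DEMO_MAIN
int main(void)
{
    printf("isgraph characters: %d\n", count_isgraph());
    /* 1000 input characters at hypothetical 4.5 and 7.0 bits each. */
    printf("4.5 bits/char -> %zu output chars\n",
           chars_for_bits(bits_for_text(1000, 45)));
    printf("7.0 bits/char -> %zu output chars\n",
           chars_for_bits(bits_for_text(1000, 70)));
    return 0;
}
#endif
```

Under those assumptions the break-even point is an average of 6 bits per
input character: below it, the 64-character output is shorter than the
input; above it, longer. Compile with -DDEMO_MAIN to print the figures.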
 > 
> We have a fixed Huffman tree which is part of the algorithm and
> optimised for ASCII. And we take each line of text, and compress it to a
> binary string, using the Huffman table. Then we code the binary string
> six bits at a time using a 64-character subset of ASCII. And then we
> append a special character which is chosen to be visually
> distinctive.
>
> So the input is
>
> Mary had a little lamb,
> its fleece was white as snow,
> and everywhere that Mary went,
> the lamb was sure to go.
>
> And we get the output.
>
> CVbGNh£-H$£*MMH&-VVdsE3w2as3-vv$G^&ggf-
>
> And if it is shorter or not depends on whether the fixed Huffman table
> is any good.
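
The packing step described here (six bits per output character, then a
marker) might look roughly like the sketch below. Several things in it
are assumptions, since the post does not specify them: the 64-character
subset is taken to be the base64 alphabet, the end-of-line marker is
taken to be '-', the final group is zero-padded, and the "binary string"
is represented as a C string of '0' and '1' characters. The Huffman
stage itself is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical 64-character alphabet (the base64 one); all are isgraph(). */
static const char ALPHABET[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

#define LINE_MARK '-'   /* hypothetical end-of-line marker, not in ALPHABET */

/* Encode a string of '0'/'1' characters, six bits per output character,
   zero-padding the last group, then append LINE_MARK and a NUL.
   Returns the number of characters written (excluding the NUL),
   or 0 if out is too small. */
size_t pack6(const char *bits, char *out, size_t outcap)
{
    size_t nbits = strlen(bits);
    size_t need = (nbits + 5) / 6 + 1;      /* groups plus marker */
    size_t o = 0;

    if (outcap < need + 1)                  /* room for the NUL as well */
        return 0;
    for (size_t i = 0; i < nbits; i += 6) {
        unsigned v = 0;
        for (size_t j = 0; j < 6; j++) {
            char b = (i + j < nbits) ? bits[i + j] : '0';
            assert(b == '0' || b == '1');
            v = (v << 1) | (unsigned)(b - '0');
        }
        out[o++] = ALPHABET[v];
    }
    out[o++] = LINE_MARK;
    out[o] = '\0';
    return o;
}

#ifdef DEMO_MAIN
int main(void)
{
    char buf[32];
    if (pack6("000000000001", buf, sizeof buf) != 0)
        puts(buf);
    return 0;
}
#endif
```

With zero-padding, a decoder needs some other way to know where the
real bits end; a Huffman end-of-line symbol would be one option.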
 

Take something that is a little bigger than the text above. It does not
have to be much bigger. One page from any book will do ("Alice's
Adventures in Wonderland" is used most often for that purpose).
Apply your compression procedure.
Then run an automatic test that applies every possible single-bit flip
to the compressed output, decompresses, and counts the number of
mismatches against the original text. The test should report the case
with the maximal number of mismatches.
Look at the most corrupted text.
If your fixed Huffman table is any good, you'll see that the output is
corrupted rather seriously; most likely at least one sentence will be
unrecognizable.
Alternatively, if your fixed Huffman table is no good, your output will
be as big as or bigger than the input.
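
A minimal sketch of such a test harness follows. The decoder signature
(decode_fn), the function names, and the choice to flip only the low
7 bits of each byte (so the data stays within ASCII) are assumptions
made for illustration. identity_decode is a stand-in that merely copies
its input; the real decompressor under test would be plugged in instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Decoder signature assumed for this sketch: decode n bytes from in
   into out (capacity cap) and return the decoded length. */
typedef size_t (*decode_fn)(const char *in, size_t n, char *out, size_t cap);

/* Positions that differ between two buffers; any difference in
   length is counted as mismatches too. */
size_t count_mismatches(const char *a, size_t na, const char *b, size_t nb)
{
    size_t common = na < nb ? na : nb;
    size_t bad = (na > nb ? na : nb) - common;
    for (size_t i = 0; i < common; i++)
        if (a[i] != b[i])
            bad++;
    return bad;
}

/* Flip each of the low 7 bits of every byte of packed[] in turn, decode,
   and return the largest mismatch count against orig[]. The position of
   the worst flip is stored through worst_byte/worst_bit if non-null. */
size_t worst_single_flip(char *packed, size_t np,
                         const char *orig, size_t no,
                         decode_fn decode, char *scratch, size_t cap,
                         size_t *worst_byte, int *worst_bit)
{
    size_t worst = 0;
    assert(decode != NULL);
    for (size_t i = 0; i < np; i++) {
        for (int bit = 0; bit < 7; bit++) {
            packed[i] ^= (char)(1 << bit);          /* corrupt one bit */
            size_t nd = decode(packed, np, scratch, cap);
            size_t bad = count_mismatches(orig, no, scratch, nd);
            packed[i] ^= (char)(1 << bit);          /* restore it */
            if (bad > worst) {
                worst = bad;
                if (worst_byte) *worst_byte = i;
                if (worst_bit) *worst_bit = bit;
            }
        }
    }
    return worst;
}

/* Stand-in decoder that copies its input unchanged. */
size_t identity_decode(const char *in, size_t n, char *out, size_t cap)
{
    size_t k = n < cap ? n : cap;
    memcpy(out, in, k);
    return k;
}

#ifdef DEMO_MAIN
int main(void)
{
    char packed[] = "Mary had a little lamb";
    char scratch[64];
    size_t n = strlen(packed);
    size_t w = worst_single_flip(packed, n, "Mary had a little lamb", n,
                                 identity_decode, scratch, sizeof scratch,
                                 NULL, NULL);
    printf("worst single-bit flip: %zu mismatches\n", w);
    return 0;
}
#endif
```

With the identity stand-in every flip damages exactly one character,
which is the baseline for uncompressed text. With a variable-length
code, a single flipped bit can desynchronise the decoder, so the worst
case would be expected to be much larger.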

A popular corpus of samples for compression tests:
https://corpus.canterbury.ac.nz/descriptions/
http://corpus.canterbury.ac.nz/resources/cantrbry.zip

