Subject : Re: Script to conditionally find and compress files recursively
From : not (at) *nospam* telling.you.invalid (Computer Nerd Kev)
Newsgroups : comp.os.linux.misc
Date : 14. Jun 2024, 03:25:06
Organisation : Ausics - https://newsgroups.ausics.net
Message-ID : <666baa01@news.ausics.net>
References : 1 2 3 4 5
User-Agent : tin/2.0.1-20111224 ("Achenvoir") (UNIX) (Linux/2.4.31 (i586))
Computer Nerd Kev <not@telling.you.invalid> wrote:
> Anssi Saari <anssi.saari@usenet.mail.kapsi.fi> wrote:
>> Well then, I believe the solution was already posted. Grab 5% of your
>> files with dd and see how it compresses.
> The solution that I see grabs the first 1MB, but it would make more
> sense to sample, e.g., 1% of the file size at five places within the
> file. For a 100MB file each sample is 1MB, and 100MB/5 = 20MB, so use
> dd to grab one 1MB sample from the start of the file, then four more
> at offsets that increment by 20MB each time. Store these separately,
> compress them separately, then average the compression ratio of all
> the samples.
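The five-sample scheme quoted above could be sketched in shell like this. It assumes GNU dd (for the iflag=skip_bytes,count_bytes byte-granular skip/count) plus gzip and awk; gzip is just one choice of compressor, and the function name is illustrative.

```shell
# Estimate a file's compressibility by taking a 1%-sized chunk at the
# start and four more at evenly spaced offsets, gzipping each sample,
# and averaging the compression ratio (compressed bytes / input bytes).
estimate_ratio() {
    f=$1
    size=$(stat -c %s "$f")        # file size in bytes (GNU stat)
    sample=$((size / 100))         # each sample is 1% of the file
    step=$((size / 5))             # offset increment between samples

    total_out=0
    for i in 0 1 2 3 4; do
        # skip/count are in bytes thanks to the GNU dd iflags
        out=$(dd if="$f" bs=64k skip=$((i * step)) count="$sample" \
                 iflag=skip_bytes,count_bytes 2>/dev/null | gzip -c | wc -c)
        total_out=$((total_out + out))
    done
    # average ratio over all five samples
    awk "BEGIN { printf \"%.3f\\n\", $total_out / ($sample * 5) }"
}

# Example: estimate_ratio /path/to/large.file
```

A low ratio (well under 1) suggests the whole file is worth compressing; a ratio near or above 1 suggests skipping it.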
Also, for some types of data (if it's not all video), like text, some
more advanced compressors build a dictionary to better compress
larger files. But this requires a minimum amount of input, so the small
samples might not represent the compression ratio of the whole file
with a dictionary included. A solution is to pre-generate a
dictionary from a collection of files of the same type as those you're
compressing; you can then compress the small samples using that
dictionary and get a more accurate result.
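One concrete way to do this is with zstd's built-in dictionary trainer (my suggestion; the post doesn't name a tool). Given a directory of representative same-type files, train a shared dictionary once, then compress each small sample with it; the function and path names are illustrative.

```shell
# Train a zstd dictionary on a corpus of similar files, then print the
# compressed size of a sample when that dictionary is supplied.
train_and_sample() {
    corpus_dir=$1          # directory of representative same-type files
    sample=$2              # small sample file to test-compress
    dict=$(mktemp)

    # Build a shared dictionary; zstd wants many training samples, so a
    # handful of files may be rejected as too few.
    zstd -q -f --train "$corpus_dir"/* -o "$dict" --maxdict=16384 || return 1

    # Compressed size of the sample with the dictionary applied
    zstd -D "$dict" -c "$sample" | wc -c
    rm -f "$dict"
}

# Example: train_and_sample /var/log/myapp /tmp/sample.bin
```

Because every sample is compressed against the same pre-built dictionary, the per-sample ratios no longer pay the dictionary's fixed overhead, so they track the whole-file ratio more closely.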
-- 
 __ __
#_ < |\| |< _#