Re: program to remove duplicates

Subject : Re: program to remove duplicates
From : chris.m.thomasson.1 (at) *nospam* gmail.com (Chris M. Thomasson)
Newsgroups : comp.lang.c
Date : 22. Sep 2024, 19:47:21
Organization : A noiseless patient Spider
Message-ID : <vcponp$2aavq$2@dont-email.me>
References : 1 2 3 4 5 6
User-Agent : Mozilla Thunderbird
On 9/22/2024 12:29 AM, Paul wrote:
> On Sat, 9/21/2024 10:36 PM, fir wrote:
>> Lawrence D'Oliveiro wrote:
>>> On Sun, 22 Sep 2024 00:18:09 +0200, fir wrote:
>>>>
>>>> ... you just need to read all files in
>>>> folder and compare it byte by byte to other files in folder of the same
>>>> size
>>>
>>> For N files, that requires N × (N - 1) ÷ 2 byte-by-byte comparisons.
>>> That’s an O(N²) algorithm.
>>>
>>> There is a faster way.
>>>
>> not quite, as most files have different sizes, so most binary comparisons
>> are discarded because the sizes of the files differ (and those sizes i read
>> linearly when building the list of filenames)
>>
>> what i posted seems to work ok, it doesn't work fast but it's hard to say
>> if it can be optimised or if it takes as long as it should
>
> The normal way to do this, is do a hash check on the
> files and compare the hash. You can use MD5SUM, SHA1SUM, SHA256SUM,
> as a means to compare two files. If you want to be picky about
> it, stick with SHA256SUM.
> [...]

That's fine.
file_0.bin
file_1.png
file_2.jpg
Say they were all identical wrt their actual bytes. The hashes for them would all be the same, as long as the file name was not hashed in there for some reason... ;^)
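
A minimal sketch of that point in C, under stated assumptions: the hash is computed over the file's bytes only, so file_0.bin, file_1.png and file_2.jpg would all get the same value if their contents match. FNV-1a (64-bit) stands in here for the SHA-256 suggested above, only to keep the example free of external libraries. It is not collision resistant, so a matching hash should still be confirmed with a byte-by-byte comparison. The function names hash_file_contents and same_bytes are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* FNV-1a 64-bit hash over a file's bytes only; the file name is never
   mixed in. Assumption: FNV-1a is a stand-in for SHA-256 here.
   Returns 0 on success, -1 if the file cannot be opened or read. */
int hash_file_contents(const char *path, uint64_t *out)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    uint64_t h = 0xcbf29ce484222325ULL;      /* FNV-1a offset basis */
    unsigned char buf[4096];
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, f)) > 0) {
        for (size_t i = 0; i < n; i++) {
            h ^= buf[i];
            h *= 0x100000001b3ULL;           /* FNV-1a prime */
        }
    }

    int err = ferror(f);
    fclose(f);
    if (err)
        return -1;

    *out = h;
    return 0;
}

/* Byte-by-byte confirmation for two files whose hashes matched.
   Returns 1 if identical, 0 if different, -1 on I/O error. */
int same_bytes(const char *a, const char *b)
{
    FILE *fa = fopen(a, "rb");
    FILE *fb = fopen(b, "rb");
    int result = -1;

    if (fa && fb) {
        int ca, cb;
        do {
            ca = getc(fa);
            cb = getc(fb);
        } while (ca == cb && ca != EOF);

        if (!ferror(fa) && !ferror(fb))
            result = (ca == cb);             /* both hit EOF together */
    }

    if (fa) fclose(fa);
    if (fb) fclose(fb);
    return result;
}
```

Calling hash_file_contents on each of the three files and comparing the resulting values would flag them as candidates regardless of their extensions; same_bytes then settles whether they really are duplicates.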

