Subject: Re: program to remove duplicates
From: nospam (at) *nospam* dfs.com (DFS)
Newsgroups: comp.lang.c
Date: 22 Sep 2024, 22:11:02
Organization: A noiseless patient Spider
Message-ID: <vcq155$2bica$1@dont-email.me>
References: 1 2 3 4
User-Agent: Betterbird (Windows)

On 9/21/2024 10:06 PM, Lawrence D'Oliveiro wrote:
> On Sun, 22 Sep 2024 00:18:09 +0200, fir wrote:
>> ... you just need to read all files in
>> folder and compare it byte by byte to other files in folder of the same
>> size
>
> For N files, that requires N × (N - 1) ÷ 2 byte-by-byte comparisons.
> That’s an O(N²) algorithm.

for (i = 0; i < N; i++) {
    for (j = i+1; j < N; j++) {
        /* ... byte-by-byte compare file i to file j */
    }
}

For N = 10, that makes 10 × 9 ÷ 2 = 45 byte-by-byte comparisons (assuming all files are the same size).
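
To make the pair loop concrete, here is a rough C sketch, not a finished tool. The names files_equal and count_duplicate_pairs are made up for illustration, the 4096-byte buffer is an arbitrary choice, and it assumes regular files, where fread only returns a short count at end of file or on error.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the two files have identical contents,
   0 if they differ, and -1 if a file cannot be opened or read. */
int files_equal(const char *path_a, const char *path_b)
{
    unsigned char buf_a[4096], buf_b[4096];
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    int result = 1;

    if (fa == NULL || fb == NULL) {
        if (fa) fclose(fa);
        if (fb) fclose(fb);
        return -1;
    }
    for (;;) {
        size_t na = fread(buf_a, 1, sizeof buf_a, fa);
        size_t nb = fread(buf_b, 1, sizeof buf_b, fb);

        if (na != nb || memcmp(buf_a, buf_b, na) != 0) {
            result = 0;          /* different length or different bytes */
            break;
        }
        if (na == 0)
            break;               /* both files ended without a mismatch */
    }
    if (ferror(fa) || ferror(fb))
        result = -1;
    fclose(fa);
    fclose(fb);
    return result;
}

/* Hypothetical helper: compares every pair (i, j) with i < j, as in the
   loop above. Stores the number of comparisons made in *comparisons and
   returns the number of pairs found identical. */
size_t count_duplicate_pairs(const char *paths[], size_t n, size_t *comparisons)
{
    size_t i, j, dups = 0;

    *comparisons = 0;
    for (i = 0; i < n; i++) {
        for (j = i + 1; j < n; j++) {
            (*comparisons)++;
            if (files_equal(paths[i], paths[j]) == 1)
                dups++;
        }
    }
    return dups;
}
```

Called with an array of 10 paths, count_duplicate_pairs sets *comparisons to 45, which matches the count above. Files of different sizes could be skipped before opening them, but that is left out here.
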
> There is a faster way.
Calculate the checksum of each file once, then compare the checksums as above?
That is still an O(N^2) algorithm, but I would assume it's faster than 45 byte-by-byte comparisons.
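
A rough sketch of the checksum idea in C, under a few assumptions. No particular checksum is named above, so this uses 64-bit FNV-1a purely as a simple stand-in, and file_checksum and count_matching_pairs are made-up names. Each file is read once, and the pair loop then compares only integers.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: computes a 64-bit FNV-1a hash of the file contents.
   FNV-1a is an assumption here, picked only because it is short.
   Returns 0 on success and -1 if the file cannot be opened or read. */
int file_checksum(const char *path, uint64_t *out)
{
    FILE *f = fopen(path, "rb");
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV-1a 64-bit offset basis */
    int c, err;

    if (f == NULL)
        return -1;
    while ((c = getc(f)) != EOF) {
        h ^= (uint64_t)(unsigned char)c;
        h *= 0x100000001b3ULL;            /* FNV 64-bit prime */
    }
    err = ferror(f);
    fclose(f);
    if (err)
        return -1;
    *out = h;
    return 0;
}

/* Hypothetical helper: counts pairs (i, j), i < j, whose checksums match.
   Still N * (N - 1) / 2 comparisons, but each one is a single integer
   compare instead of a pass over two files. */
size_t count_matching_pairs(const uint64_t *sums, size_t n)
{
    size_t i, j, matches = 0;

    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (sums[i] == sums[j])
                matches++;
    return matches;
}
```

Equal checksums do not prove that two files are identical, since different contents can collide, so a matching pair would still need one byte-by-byte compare to confirm it.
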