Subject : Re: program to remove duplicates
From : fir (at) *nospam* grunge.pl (fir)
Newsgroups : comp.lang.c
Date : 21. Sep 2024, 20:27:08
Organization : i2pn2 (i2pn.org)
Message-ID : <dadedb8538fff0ea1879112d4c41900e1e005d7c@i2pn2.org>
References : 1 2
User-Agent : Mozilla/5.0 (Windows NT 5.1; rv:27.0) Gecko/20100101 Firefox/27.0 SeaMonkey/2.24
fir wrote:
fir wrote:
>
>
I am thinking of writing a simple command-line program
that removes duplicates in a given folder
>
I mean one would copy the program into a given folder,
run it, and all duplicates and multiplicates (where
a duplicate means a file with a different name but
exactly the same size and byte content) would be removed,
leaving only one file for each set of copies
>
This should work for a large number of files.
I need it because, for example, I once recovered an HDD,
and as I had some copies of files on that disk,
the removed files are generally present in multiple copies
and consume a lot of disk space
>
So is there some approach I should take to make this
process faster?
>
Probably I would need to read the list of files and sizes in
the current directory, then sort or go through the list, and if two files
have exactly the same size, read them into RAM and then compare them byte by byte
>
I'm not sure whether to do sorting, as I also need to write it quickly,
and maybe sorting will complicate things a bit without gaining much
>
some thoughts?
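On the sorting question above, here is a minimal sketch of what sorting by size could look like, assuming each list entry also carries the file size (on Windows it could be filled from ffd.nFileSizeHigh and ffd.nFileSizeLow). The names SizedEntry, CompareBySize and SortAndCountCandidates are made up for illustration and are not part of the code in this post. After the sort, files of equal size sit next to each other, so only neighbours with the same size need their content read.

```c
#include <stdlib.h>

/* hypothetical entry: a file name plus its size in bytes */
typedef struct SizedEntry { char name[500]; unsigned long long size; } SizedEntry;

/* qsort comparator: ascending by size, written to avoid overflow */
static int CompareBySize(const void* a, const void* b)
{
    const SizedEntry* x = (const SizedEntry*)a;
    const SizedEntry* y = (const SizedEntry*)b;
    return (x->size > y->size) - (x->size < y->size);
}

/* sort the list, then count entries whose size equals the previous entry's;
   only those files are duplicate candidates that need a content compare */
static int SortAndCountCandidates(SizedEntry* list, int count)
{
    int candidates = 0;
    qsort(list, (size_t)count, sizeof(SizedEntry), CompareBySize);
    for (int i = 1; i < count; i++)
        if (list[i].size == list[i-1].size) candidates++;
    return candidates;
}
```

The sort itself is one qsort call plus a small comparator, so it probably does not complicate the program much, and it saves comparing every file against every other file.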
>
Curiously, I could add that I once searched for programs to remove duplicates,
but they did not look good, so such a command-line
(or in fact command-line-less, as I maybe don't even want to add command-line
options) program is quite needed in practice.
Assuming I have code to read the list of filenames in a given directory (which I found), what do you suggest I should add to remove such duplicates?
- the code to read those filenames into a list
(tested to work, but not tested for being 100% error-free)
#include<windows.h>
#include<stdio.h>
#include<stdlib.h> // realloc, exit
// copy at most n-1 characters and always terminate dest (n must be > 0)
void StrCopyMaxNBytes(char* dest, const char* src, int n)
{
    int i = 0;
    for(; i<n-1 && src[i]; i++) dest[i]=src[i];
    dest[i] = 0;
}
//list of file names
enum { FileNameListEntry_name_max = 500 }; // enum constant: a valid array size in C
typedef struct FileNameListEntry { char name[FileNameListEntry_name_max]; } FileNameListEntry;
FileNameListEntry* FileNameList = NULL;
int FileNameList_Size = 0;
void FileNameList_AddOne(char* name)
{
    // grow via a temporary so the old block is not lost if realloc fails
    FileNameListEntry* grown = (FileNameListEntry*) realloc(FileNameList, (FileNameList_Size+1) * sizeof(FileNameListEntry) );
    if(!grown) { printf("out of memory"); exit(-1); }
    FileNameList = grown;
    StrCopyMaxNBytes(FileNameList[FileNameList_Size].name, name, FileNameListEntry_name_max);
    FileNameList_Size++;
}
// collect list of filenames
WIN32_FIND_DATA ffd;
void ReadDIrectoryFileNamesToList(char* dir)
{
    HANDLE h = FindFirstFile(dir, &ffd);
    // FindFirstFile reports failure with INVALID_HANDLE_VALUE, not NULL
    if(h == INVALID_HANDLE_VALUE) { printf("error reading directory"); exit(-1);}
    do {
        if (!(ffd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
            FileNameList_AddOne(ffd.cFileName);
    }
    while (FindNextFile(h, &ffd));
    FindClose(h); // release the search handle
}
int main()
{
    ReadDIrectoryFileNamesToList("*");
    for(int i=0; i< FileNameList_Size; i++)
        printf("\n %d %s", i, FileNameList[i].name );
    return 0; // 'ok' is a multi-character constant with an implementation-defined value
}
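
As for what could be added next, one piece is the byte-by-byte comparison itself. Below is a rough sketch using only standard C file I/O. The function name FilesAreIdentical and the 64 KB buffer size are my own assumptions, not something from the code above. It reads both files in chunks instead of loading them whole into RAM, so it should also cope with large files.

```c
#include <stdio.h>
#include <string.h>

/* returns 1 if both files have the same byte content, 0 if they differ,
   -1 if either file cannot be opened or read */
static int FilesAreIdentical(const char* pathA, const char* pathB)
{
    static unsigned char bufA[65536], bufB[65536];
    FILE* a = fopen(pathA, "rb");
    FILE* b = fopen(pathB, "rb");
    int result = 1;
    if (!a || !b) { if (a) fclose(a); if (b) fclose(b); return -1; }
    for (;;) {
        size_t na = fread(bufA, 1, sizeof bufA, a);
        size_t nb = fread(bufB, 1, sizeof bufB, b);
        if (ferror(a) || ferror(b)) { result = -1; break; }
        /* a different chunk length means the files differ in size */
        if (na != nb || memcmp(bufA, bufB, na) != 0) { result = 0; break; }
        if (na == 0) break; /* both files ended at the same point */
    }
    fclose(a);
    fclose(b);
    return result;
}
```

If this returns 1 for two names from the list, one of them could then be deleted, for example with the standard remove() function. It seems safer to delete only on an exact 1 and to leave both files alone when the result is -1.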