Lars,

On Sat, Jul 2, 2011 at 1:03 PM, Lars Wirzenius <l...@liw.fi> wrote:
> It seems you didn't Cc the bug, or debian-devel. Just in case that
> was intentional, I'm not doing it either.

My mistake, thanks for pointing it out.

> On Sat, Jul 02, 2011 at 12:36:51PM +0100, Tomasz Muras wrote:
>> It is different as it (tries to) solve the problem of not just
>> finding the duplicates but also what should be done with them once
>> they are found (e.g. which file should be considered original and
>> which duplicate). My original motivation behind first looking for and
>> then creating this utility was cleaning up my photos: imagine
>> thousands of files in hundreds of directories that needed to be
>> cleaned up. I had a preference to leave some files in sorted
>> directories, while removing the duplicates from all those "dump",
>> "backup", etc. ones in an automated fashion. And my top priority:
>> I could not allow for any mistakes, so I've put significant effort
>> into testing the tool.
>>
>> The second problem it solves is finding and acting on files that are
>> partial files of some other, presumably full file (e.g. not completed
>> FTP download).
>>
>> Before I started working on it I looked for similar utilities and
>> documented it [1]. Also see [2] for other usages.
>>
>> [1] http://pmatch.rubyforge.org/competition.html
>> [2] http://pmatch.rubyforge.org/usage.html
>>
>> I welcome any comments and criticism.
>> Tomek
>
> That does make pmatch seem like a very useful tool! You should add
> some summary of that information from the usage page to your long
> package description.

Agreed. I guess I did a poor job at "advertising" the package.

> Your description said you use a hash to compare files. Is that
> a hash of the complete file? I found, when developing my tool,
> that it's much faster to compare just a little bit of data from
> the beginning of the file, and since my data set had several quite
> large files, this had a big impact. (Obviously, check file size first.)
>
> I quite like your approach of writing out shell commands instead of
> doing any changes directly.
>
> Looking forward to seeing pmatch in Debian.

Agreed again, I'm planning to do more work on pmatch soon - at the
moment getting it into Debian is my priority. Comparing the initial
size may be a very good idea, especially for my use case (photos) as
most of the files are of similar size.

Thank you for your review Lars,
Tomek
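
For reference, the staged comparison Lars describes (file size first,
then a little data from the beginning of the file, and only then the
complete file) could be sketched roughly as below. This is only an
illustration in Python, not pmatch's actual code: the names
find_duplicates, _digest and _split, the use of SHA-256, and the
4096-byte PREFIX_BYTES are all assumptions made for the example.

```python
import hashlib
import os
from collections import defaultdict

# Assumed prefix size for the cheap second pass; tune it for your data.
PREFIX_BYTES = 4096


def _digest(path, limit=None):
    """SHA-256 of the first `limit` bytes of a file (whole file if None)."""
    h = hashlib.sha256()
    remaining = limit
    with open(path, "rb") as f:
        while remaining is None or remaining > 0:
            size = 65536 if remaining is None else min(65536, remaining)
            chunk = f.read(size)
            if not chunk:
                break
            h.update(chunk)
            if remaining is not None:
                remaining -= len(chunk)
    return h.hexdigest()


def _split(groups, key):
    """Regroup each candidate group by key(path); drop groups of one."""
    result = []
    for group in groups:
        buckets = defaultdict(list)
        for path in group:
            buckets[key(path)].append(path)
        result.extend(sorted(b) for b in buckets.values() if len(b) > 1)
    return result


def find_duplicates(paths):
    """Return sorted groups of paths whose contents hash identically."""
    groups = [list(paths)]
    # Pass 1: files of different sizes cannot be duplicates.
    groups = _split(groups, os.path.getsize)
    # Pass 2: hash only a short prefix, which is cheap even for large files.
    groups = _split(groups, lambda p: _digest(p, PREFIX_BYTES))
    # Pass 3: hash the complete file, but only for the surviving candidates.
    groups = _split(groups, _digest)
    return sorted(groups)
```

Each pass only looks at files that survived the previous one, so the
full-file hash runs on far fewer files than hashing everything up
front. A tool that cannot allow for any mistakes might still want a
byte-by-byte comparison of the final candidates before acting on them.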