On Wed, Feb 23, 2011 at 12:10 PM, Matthias Fuchs <ma...@gmx.net> wrote:
> On Tuesday, 08 February 2011, 23:22:49, todd rme wrote:
>> On Sun, Oct 3, 2010 at 3:58 AM, todd rme <toddrme2...@gmail.com> wrote:
>> > On Thu, Sep 30, 2010 at 7:30 AM, Matthias Fuchs <ma...@gmx.net> wrote:
>> >> Hi,
>> >>
>> >> When moving and copying multiple files it can be quite tedious to make
>> >> out whether any of these files differ.
>> >>
>> >> ====Use Case:====
>> >> You copy hundreds of text files, knowing that most are the same, but not
>> >> all. Now you are greeted with multiple "Do you want to overwrite XY
>> >> size Z with XY size W" dialogs.
>> >>
>> >> ====Proposal====
>> >> What I propose is to not show these dialogs if both files are identical.
>> >> In the case of copying nothing should happen then, while in the case of
>> >> moving the source file should be deleted.
>> >>
>> >> Checking whether the files are identical should be a two-step process:
>> >> 1. Both file sizes are equal and smaller than a fixed size
>> >> 2. Calculating the checksums for both files; the check for the fixed
>> >> size above avoids long-lasting calculations
>> >>
>> >> If 1. turns out to be false a dialog should be shown.
>> >>
>> >> This could be either opt-in (via a checkbox) or always on with just an
>> >> information text in the dialog. The hash function should be one that is
>> >> very fast to calculate, and if the file system supports and stores
>> >> checksums for files those should be used.
>> >>
>> >> ====Open Questions + Discussion====
>> >> What do you think of this idea, should something like that be
>> >> implemented? Also what do you think of the Nepomuk Resources
>> >> associated with the files? Imagine both files have a different rating,
>> >> what should happen then?
>> >>
>> >> >> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to unsubscribe <<
>> >
>> > I posted a brainstorm forum idea about this last year:
>> >
>> > http://forum.kde.org/brainstorm.php#idea39563_page1
>> >
>> > I proposed a three-stage process, similar to yours but with an extra
>> > stage, an optional byte-by-byte check.
>> >
>> > Checksums are fast for small files, but they can take longer on large
>> > files and on older systems. They also, as I understand it, are not
>> > perfect. So I think that a better approach is that, for files under a
>> > certain size, an automatic three-stage approach is used. First the
>> > file size check, then checksum, then byte-by-byte. If all of those
>> > pass, then the file is just deleted.
>> >
>> > For slightly bigger files, where the checksum is fast enough but the
>> > byte-by-byte is not, only the first two stages are used. If they both
>> > pass, the "File Already Exists" dialog box should be changed to tell
>> > the user that the files are "probably" the same, and give them the
>> > additional option (on top of renaming, overwriting, and skipping) of
>> > doing an "Exact check" (or something along those lines), which then
>> > does the byte-by-byte check. If that passes, then the file is
>> > deleted.
>> >
>> > If the file is really big, then even the checksum is not done
>> > automatically. If the files have the same size, the user is told the
>> > files have the same size, and the user has the additional options of
>> > doing a "Quick check" and "Exact check" (checksum and byte-by-byte,
>> > respectively). If the checksums match, you are back to the
>> > previous situation where the user is given the option to do the exact
>> > check or do one of the standard actions. If the detailed check
>> > passes, then the file is deleted.
>> >
>> > The issue with the nepomuk data is an issue even without this.
>> > It comes up whenever you are moving files and decide to overwrite
>> > conflicting files, even if they aren't the same. A simple check box
>> > for "merge nepomuk data" or "merge tags" or something like that (if
>> > they both have data, of course) would be very useful independent of
>> > this.
>>
>> Sorry for dredging up such an old topic, but I was wondering if this
>> might make a good GSOC project.
>>
>> -Todd
>
> I just saw your reply now.
> Personally I don't really think that this should be a GSOC since I believe it
> would be quite easy to realise.

What about as part of a larger duplicate file-finding tool? There is no good KDE GUI for this that I am aware of. Someone could write a general duplicate file check library for kdelibs, use it in the file overwrite dialog, and also build a GUI for scanning directories for duplicate files.

-Todd
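
To make the idea a bit more concrete, here is a rough sketch of what such a shared check could look like. It is written in Python only for brevity (a kdelibs version would be C++), and everything in it is an assumption for illustration: the function names, the result strings, the two size limits, and the choice of SHA-1 as the checksum are all made up and not taken from any existing KDE code. compare_files() follows the staged size / checksum / byte-by-byte check discussed above, and find_duplicates() reuses the same pieces for scanning a directory.

```python
import filecmp
import hashlib
import os
from collections import defaultdict

# Hypothetical thresholds; the thread only speaks of "a fixed size".
BYTEWISE_LIMIT = 16 * 1024 * 1024    # up to here: all three stages run automatically
CHECKSUM_LIMIT = 256 * 1024 * 1024   # up to here: size and checksum run automatically


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_files(source, target):
    """Classify a conflict as 'different', 'same-size',
    'probably-identical' or 'identical'."""
    size = os.path.getsize(source)
    if size != os.path.getsize(target):
        return "different"
    if size > CHECKSUM_LIMIT:
        # Really big file: leave "Quick check" / "Exact check" to the user.
        return "same-size"
    if file_digest(source) != file_digest(target):
        return "different"
    if size > BYTEWISE_LIMIT:
        # Checksums match; the dialog could offer an "Exact check".
        return "probably-identical"
    # shallow=False makes filecmp compare the file contents byte by byte.
    if filecmp.cmp(source, target, shallow=False):
        return "identical"
    return "different"


def find_duplicates(root):
    """Return sorted groups of paths under root that share size and digest."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return groups
```

The overwrite dialog would only skip the question (or delete the source on a move) for an "identical" result. The other results map onto the dialog variants described earlier in the thread. Note that find_duplicates() stops at the checksum stage, so its groups are "probably identical" in the same sense.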