On Thursday, 24 February 2011, 16:43:09, todd rme wrote:
> On Wed, Feb 23, 2011 at 12:10 PM, Matthias Fuchs <ma...@gmx.net> wrote:
> > On Tuesday, 08 February 2011, 23:22:49, todd rme wrote:
> >> On Sun, Oct 3, 2010 at 3:58 AM, todd rme <toddrme2...@gmail.com> wrote:
> >> > On Thu, Sep 30, 2010 at 7:30 AM, Matthias Fuchs <ma...@gmx.net> wrote:
> >> >> Hi,
> >> >>
> >> >> When moving and copying multiple files it can be quite tedious to
> >> >> make out if there are differences for all these files.
> >> >>
> >> >> ====Use Case:====
> >> >> You copy hundreds of text files, knowing that most are the same, but
> >> >> not all. Now you are greeted with multiple "Do you want to overwrite
> >> >> XY size Z with XY size W" dialogs.
> >> >>
> >> >> ====Proposal====
> >> >> What I propose is to not show these dialogs if both files are
> >> >> identical. In the case of copying nothing should happen then, while
> >> >> in the case of moving the source file should be deleted.
> >> >>
> >> >> To check if a file is identical this should happen in a two step
> >> >> process:
> >> >> 1. Both file sizes equal and smaller than a fixed size
> >> >> 2. Calculating the checksums for both files; the check for the fixed
> >> >>    size above avoids long-lasting calculations
> >> >>
> >> >> If 1. turns out to be false a dialog should be shown.
> >> >>
> >> >> This could be either opt-in (via a checkbox) or always on with just
> >> >> an information text in the dialog. The hash function should be one
> >> >> that is very fast to calculate and if the file system supports and
> >> >> stores checksums for files those should be used.
> >> >>
> >> >> ====Open Questions + Discussion====
> >> >> What do you think of this idea, should something like that be
> >> >> implemented? Also what do you think of the Nepomuk Resources
> >> >> associated with the files? Imagine both files have a different
> >> >> rating, what should happen then?
> >> >>
> >> >>>> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to
> >> >>>> unsubscribe <<
> >> >
> >> > I posted a brainstorm forum idea about this last year:
> >> >
> >> > http://forum.kde.org/brainstorm.php#idea39563_page1
> >> >
> >> > I proposed a three-stage process, similar to yours but with an extra
> >> > stage, an optional byte-by-byte check.
> >> >
> >> > Checksums are fast for small files, but they can take longer on large
> >> > files and on older systems. They also, as I understand it, are not
> >> > perfect. So I think that a better approach is that, for files under a
> >> > certain size, an automatic three-stage approach is used. First the
> >> > file size check, then checksum, then byte-by-byte. If all of those
> >> > pass, then the file is just deleted.
> >> >
> >> > For slightly bigger files, where the checksum is fast enough but the
> >> > byte-by-byte is not, only the first two stages are used. If they both
> >> > pass, the "File Already Exists" dialog box should be changed to tell
> >> > the user that the files are "probably" the same, and give them the
> >> > additional option (on top of renaming, overwriting, and skipping) of
> >> > doing an "Exact check" (or something along those lines), which then
> >> > does the byte-by-byte check. If that passes, then the file is
> >> > deleted.
> >> >
> >> > If the file is really big, then even the checksum is not done
> >> > automatically. If the files have the same size, the user is told the
> >> > files have the same size, and the user has the additional options of
> >> > doing a "Quick check" and "Exact check" (checksum and byte-by-byte,
> >> > respectively). If the checksums match, you are back to the
> >> > previous situation where the user is given the option to do the exact
> >> > check or do one of the standard actions. If the detailed check
> >> > passes, then the file is deleted.
> >> >
> >> > The issue with the nepomuk data is an issue even without this.
> >> > It also comes up when you are moving files and decide to overwrite
> >> > conflicting files, even if they aren't the same. A simple check box
> >> > for "merge nepomuk data" or "merge tags" or something like that (if
> >> > they both have data, of course) would be very useful independent of
> >> > this.
> >>
> >> Sorry for dredging up such an old topic, but I was wondering if this
> >> might make a good GSOC project.
> >>
> >> -Todd
> >
> > I just saw your reply now.
> > Personally I don't really think that this should be a GSOC since I
> > believe it would be quite easy to realise.
>
> What about as part of a larger duplicate file-finding tool? There is
> no good KDE GUI that I am aware of. Someone could make a general
> duplicate file check library for kdelibs, include this in the file
> overwrite dialog but also make a GUI to allow scanning directories for
> duplicate files.
>
> -Todd

This sounds interesting, IMO. I am not sure, though, whom you should
contact to add this as a GSOC idea.
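
For reference, the staged check discussed in this thread (size first, then checksum, then byte-by-byte, with the later stages skipped for larger files) could be sketched roughly as below. This is only an illustration in Python, not KDE or KIO code. The function names, the result labels and both size thresholds are assumptions made up for the example (the thread only speaks of "a fixed size"), and SHA-1 is just a stand-in for whatever fast hash would actually be chosen.

```python
import hashlib
import os

CHUNK = 64 * 1024  # read size in bytes; arbitrary choice for this sketch


def file_digest(path):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            h.update(block)
    return h.hexdigest()


def same_bytes(a, b):
    """Compare two regular files of equal size chunk by chunk."""
    with open(a, "rb") as fa, open(b, "rb") as fb:
        while True:
            block_a = fa.read(CHUNK)
            block_b = fb.read(CHUNK)
            if block_a != block_b:
                return False
            if not block_a:  # both files ended at the same point
                return True


def compare(src, dst,
            bytewise_limit=16 * 1024 * 1024,    # hypothetical threshold
            checksum_limit=256 * 1024 * 1024):  # hypothetical threshold
    """Classify a name conflict between src and dst.

    Returns one of (labels invented for this sketch):
      "different"          - sizes, checksums or contents differ
      "identical"          - all three stages passed
      "probably-identical" - size and checksum match, exact check skipped
      "same-size"          - only the size was compared
    """
    size = os.path.getsize(src)
    if size != os.path.getsize(dst):
        return "different"
    if size > checksum_limit:
        return "same-size"            # offer "Quick check" / "Exact check"
    if file_digest(src) != file_digest(dst):
        return "different"
    if size > bytewise_limit:
        return "probably-identical"   # offer "Exact check"
    return "identical" if same_bytes(src, dst) else "different"
```

A copy or move job would then skip the overwrite dialog only for "identical" (deleting the source when moving) and show a dialog, with the extra check options where applicable, for the other three results.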