On Wed, Feb 23, 2011 at 12:10 PM, Matthias Fuchs <ma...@gmx.net> wrote:
> On Tuesday, 08 February 2011, 23:22:49, todd rme wrote:
>> On Sun, Oct 3, 2010 at 3:58 AM, todd rme <toddrme2...@gmail.com> wrote:
>> > On Thu, Sep 30, 2010 at 7:30 AM, Matthias Fuchs <ma...@gmx.net> wrote:
>> >> Hi,
>> >>
>> >> When moving and copying multiple files it can be quite tedious to work
>> >> out whether each of these files actually differs.
>> >>
>> >>
>> >>
>> >> ====Use Case:====
>> >> You copy hundreds of text files, knowing that most are the same, but not
>> >> all. Now you are greeted with multiple "Do you want to overwrite XY
>> >> size Z with XY size W" dialogs.
>> >>
>> >> ====Proposal====
>> >> What I propose is to not show these dialogs if both files are identical.
>> >> In the case of copying, nothing should happen; in the case of moving,
>> >> the source file should be deleted.
>> >>
>> >> Checking whether two files are identical should be a two-step process:
>> >> 1. Both file sizes are equal and smaller than a fixed size.
>> >> 2. The checksums of both files are calculated and compared; the check
>> >>    against the fixed size above avoids long-running calculations.
>> >>
>> >> If 1. turns out to be false a dialog should be shown.
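
A rough sketch of the two-step check described above, written in Python for brevity rather than as KIO code. The names `probably_identical`, `file_digest` and `MAX_AUTO_CHECK_SIZE`, the 16 MiB limit, and the use of BLAKE2 as the hash are all assumptions made for illustration; the proposal does not specify any of them.

```python
import hashlib
import os

# Hypothetical limit: files of this size or larger are never hashed automatically.
MAX_AUTO_CHECK_SIZE = 16 * 1024 * 1024


def file_digest(path, chunk_size=64 * 1024):
    """Hash a file in chunks so memory use stays constant."""
    digest = hashlib.blake2b()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def probably_identical(source, destination, max_size=MAX_AUTO_CHECK_SIZE):
    """Return True only if both steps pass; False means 'show the dialog'."""
    source_size = os.path.getsize(source)
    # Step 1: the sizes must match and be smaller than the fixed limit.
    if source_size != os.path.getsize(destination) or source_size >= max_size:
        return False
    # Step 2: compare the checksums of both files.
    return file_digest(source) == file_digest(destination)
```

A caller would skip the overwrite dialog (and, for a move, delete the source) only when `probably_identical` returns True.
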
>> >>
>> >>
>> >> This could be either opt-in (via a checkbox) or always on with just an
>> >> information text in the dialog. The hash function should be one that is
>> >> very fast to calculate and if the file system supports and stores
>> >> checksums for files those should be used.
>> >>
>> >> ====Open Questions + Discussion====
>> >> What do you think of this idea? Should something like that be
>> >> implemented? Also, what do you think about the Nepomuk resources
>> >> associated with the files? Imagine both files have a different rating:
>> >> what should happen then?
>> >>
>> >>>> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to
>> >>>> unsubscribe <<
>> >
>> > I posted a brainstorm forum idea about this last year:
>> >
>> > http://forum.kde.org/brainstorm.php#idea39563_page1
>> >
>> > I proposed a three-stage process, similar to yours but with an extra
>> > stage, an optional byte-by-byte check.
>> >
>> > Checksums are fast for small files, but they can take longer on large
>> > files and on older systems.  They also, as I understand it, are not
>> > perfect: two different files can end up with the same checksum.  So I
>> > think that a better approach is that, for files under a certain size,
>> > an automatic three-stage approach is used.  First the file size check,
>> > then checksum, then byte-by-byte.  If all of those pass, then the file
>> > is just deleted.
>> >
>> > For slightly bigger files, where the checksum is fast enough but the
>> > byte-by-byte is not, only the first two stages are used.  If they both
>> > pass, the "File Already Exists" dialog box should be changed to tell
>> > the user that the files are "probably" the same, and give them the
>> > additional option (on top of renaming, overwriting, and skipping) of
>> > doing an "Exact check" (or something along those lines), which then
>> > does the byte-by-byte check.  If that passes, then the file is
>> > deleted.
>> >
>> > If the file is really big, then even the checksum is not done
>> > automatically.  If the files have the same size, the user is told the
>> > files have the same size, and the user has the additional options of
>> > doing a "Quick check" and "Exact check" (checksum and byte-by-byte,
>> > respectively).  If the checksums match, you are back to the
>> > previous situation where the user is given the option to do the exact
>> > check or do one of the standard actions.  If the detailed check
>> > passes, then the file is deleted.
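
A small sketch of how the three size tiers above could be told apart, again in Python for illustration only. The two thresholds, the `AutoChecks` enum and the function names are hypothetical, since no concrete sizes are given here; `filecmp.cmp(..., shallow=False)` from the standard library stands in for the byte-by-byte "Exact check".

```python
import filecmp
from enum import Enum

# Hypothetical thresholds; the thread does not give concrete sizes.
BYTE_CHECK_LIMIT = 1 * 1024 * 1024    # below this, all three stages run automatically
CHECKSUM_LIMIT = 64 * 1024 * 1024     # below this, size and checksum run automatically


class AutoChecks(Enum):
    SIZE_CHECKSUM_BYTES = "size, checksum and byte-by-byte"
    SIZE_CHECKSUM = "size and checksum"
    SIZE_ONLY = "size only"


def automatic_checks(size):
    """Pick which stages run without asking the user, based on file size."""
    if size < BYTE_CHECK_LIMIT:
        return AutoChecks.SIZE_CHECKSUM_BYTES
    if size < CHECKSUM_LIMIT:
        return AutoChecks.SIZE_CHECKSUM
    return AutoChecks.SIZE_ONLY


def exact_check(source, destination):
    """Byte-by-byte comparison, as offered by the 'Exact check' option."""
    return filecmp.cmp(source, destination, shallow=False)
```

In the two larger tiers, the dialog would offer the remaining stages ("Quick check" and/or "Exact check") as extra buttons instead of running them automatically.
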
>> >
>> > The issue with the Nepomuk data is an issue even without this: it comes
>> > up whenever you are moving files and decide to overwrite conflicting
>> > files, even if they aren't the same.  A simple check box for "merge
>> > nepomuk data" or "merge tags" or something like that (if they both have
>> > data, of course) would be very useful independent of this.
>>
>> Sorry for dredging up such an old topic, but I was wondering if this
>> might make a good GSOC project.
>>
>> -Todd
>>
>
> I just saw your reply now.
> Personally I don't really think that this should be a GSOC since I believe it
> would be quite easy to realise.
>

What about as part of a larger duplicate file-finding tool?  There is
no good KDE GUI for that, as far as I am aware.  Someone could make a
general duplicate file check library for kdelibs, use it in the file
overwrite dialog, and also make a GUI for scanning directories for
duplicate files.
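
To give an idea of what the scanning part of such a library might do, here is a minimal Python sketch: group files by size first, then hash only the files that share a size.  `find_duplicates` and `file_digest` are made-up names, BLAKE2 is just one possible hash, and a real kdelibs implementation would of course look different.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=64 * 1024):
    """Hash a file in chunks so memory use stays constant."""
    digest = hashlib.blake2b()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose size and digest both match."""
    by_size = defaultdict(list)
    for directory, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(directory, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_digest(path)].append(path)
        groups.extend(sorted(group) for group in by_digest.values() if len(group) > 1)
    return groups
```

Matching digests still only mean "probably the same", so a GUI built on this could offer a byte-by-byte confirmation before deleting anything.
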

-Todd
 