On Sun, Oct 3, 2010 at 3:58 AM, todd rme <toddrme2...@gmail.com> wrote:
> On Thu, Sep 30, 2010 at 7:30 AM, Matthias Fuchs <ma...@gmx.net> wrote:
>> Hi,
>>
>> When moving and copying multiple files it can be quite tedious to make out if
>> there are differences for all these files.
>>
>>
>>
>> ====Use Case:====
>> You copy hundreds of text files, knowing that most are the same, but not all.
>> Now you are greeted with multiple "Do you want to overwrite XY size Z with XY
>> size W" dialogs.
>>
>> ====Proposal====
>> What I propose is to not show these dialogs if both files are identical: in
>> the case of copying, nothing should happen, while in the case of moving, the
>> source file should be deleted.
>>
>> Checking whether two files are identical should happen in a two-step process:
>> 1. Check that both file sizes are equal and smaller than a fixed size.
>> 2. Calculate the checksums for both files; the fixed-size check above
>>    avoids long-lasting calculations.
>>
>> If 1. turns out to be false a dialog should be shown.
>>
>>
>> This could be either opt-in (via a checkbox) or always on with just an
>> information text in the dialog. The hash function should be one that is very
>> fast to calculate and if the file system supports and stores checksums for
>> files those should be used.
>>
>> ====Open Questions + Discussion====
>> What do you think of this idea, should something like that be implemented?
>> Also what do you think of the Nepomuk Resources associated with the files?
>> Imagine both files have a different rating, what should happen then?
>>
>>>> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to unsubscribe <<
>>
>
> I posted a brainstorm forum idea about this last year:
>
> http://forum.kde.org/brainstorm.php#idea39563_page1
>
> I proposed a three-stage process, similar to yours but with an extra
> stage, an optional byte-by-byte check.
>
> Checksums are fast for small files, but they can take longer on large
> files and on older systems.  They also, as I understand it, are not
> perfect, since two different files can in principle share a checksum.
> So I think that a better approach is that, for files under a
> certain size, an automatic three-stage approach is used.  First the
> file size check, then checksum, then byte-by-byte.  If all of those
> pass, then the file is just deleted.
>
> For slightly bigger files, where the checksum is fast enough but the
> byte-by-byte is not, only the first two stages are used.  If they both
> pass, the "File Already Exists" dialog box should be changed to tell
> the user that the files are "probably" the same, and give them the
> additional option (on top of renaming, overwriting, and skipping) of
> doing an "Exact check" (or something along those lines), which then
> does the byte-by-byte check.  If that passes, then the file is
> deleted.
>
> If the file is really big, then even the checksum is not done
> automatically.  If the files have the same size, the user is told the
> files have the same size, and the user has the additional options of
> doing a "Quick check" and "Exact check" (checksum and byte-by-byte,
> respectively).  If the checksums match, you are back to the
> previous situation where the user is given the option to do the exact
> check or do one of the standard actions.  If the detailed check
> passes, then the file is deleted.
>
> The issue with the nepomuk data is an issue even without this: it
> comes up whenever you are moving files and decide to overwrite
> conflicting files, even if they aren't the same.  A simple check box
> for "merge nepomuk data" or "merge tags" or something like that (if
> they both have data, of course) would be very useful independent of
> this.
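
For reference, the three-stage check quoted above (size, then checksum,
then byte-by-byte) could be sketched roughly as follows. This is only an
illustration in Python, not KIO code: the function names, the chunk size,
and the choice of SHA-1 are all assumptions, since the thread does not name
a hash function.

```python
import hashlib
import os

CHUNK = 64 * 1024  # read size in bytes; an arbitrary choice for this sketch


def file_digest(path):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    h = hashlib.sha1()  # assumed hash; any fast checksum would do here
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            h.update(block)
    return h.hexdigest()


def same_bytes(path_a, path_b):
    """Byte-by-byte comparison that stops at the first difference."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(CHUNK)
            block_b = b.read(CHUNK)
            if block_a != block_b:
                return False
            if not block_a:  # both files ended without a mismatch
                return True


def files_identical(src, dst):
    """Stage 1: size. Stage 2: checksum. Stage 3: byte-by-byte."""
    if os.path.getsize(src) != os.path.getsize(dst):
        return False
    if file_digest(src) != file_digest(dst):
        return False
    return same_bytes(src, dst)
```

Each stage only runs if the cheaper one before it passed, so files that
differ in size never get read at all.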

Sorry for dredging up such an old topic, but I was wondering if this
might make a good GSOC project.
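
To give a rough idea of the scope, the size-tier policy from my earlier
message might boil down to something like the sketch below. The two
thresholds and all of the names are hypothetical placeholders; real limits
would need benchmarking on actual hardware.

```python
from enum import Enum

# Hypothetical thresholds, not measured values.
SMALL_LIMIT = 10 * 1024 * 1024    # below this, all three stages run automatically
MEDIUM_LIMIT = 500 * 1024 * 1024  # below this, size + checksum run automatically


class Plan(Enum):
    SHOW_DIALOG = "sizes differ: show the normal overwrite dialog"
    AUTO_EXACT = "small: size, checksum, and byte-by-byte run automatically"
    AUTO_QUICK = "medium: size and checksum run; offer 'Exact check'"
    ASK_USER = "large: only size is compared; offer 'Quick check' and 'Exact check'"


def plan_for(src_size, dst_size):
    """Pick which checks to run automatically for a name conflict."""
    if src_size != dst_size:
        return Plan.SHOW_DIALOG
    if src_size < SMALL_LIMIT:
        return Plan.AUTO_EXACT
    if src_size < MEDIUM_LIMIT:
        return Plan.AUTO_QUICK
    return Plan.ASK_USER
```

For example, `plan_for(4096, 4096)` picks the fully automatic path, while
two equal-sized multi-gigabyte files only get the size comparison until the
user asks for more.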

-Todd
 