Hi Lee,

On Thu, 13 Jun 2013 22:51:24 +0200
lee <l...@yun.yagibdah.de> wrote:

> "Dr.Ruud" <rvtol+use...@isolution.nl> writes:
> 
> > On 12/06/2013 10:27, lee wrote:
> >
> >> File sizes do not reliably indicate whether a file has been modified or
> >> not.
> >
> > If the file size has changed, then your file has changed. That is 100%
> > reliable, and is a quick and cheap check.
> 
> It works only one way: different size --> file has changed.  What
> doesn't work is: file has changed --> different size.

Right.

> 
> > But if the size hasn't changed, then you still need to check something
> > else. You can do another light check, or decide to do the heavy one.
> >
> > This is also important because a hash-value is only a fingerprint, so
> > different files have (a small chance on having) the same hash value.
> >
> > The file size check makes the chance even smaller that you don't
> > detect the change.
> 
> Hm ok, this kinda sucks ...  Imagine I check size and mtime and I have
> a SHA-256 hash.  Now there are the following cases:
> 
> 
> + mtime AND size changed: file has changed, put into report, update index
> 
> + EITHER mtime changed, size is same, OR size changed, mtime is same:

If the size changed, then the file definitely changed, so the hash is only
needed when the mtime changed and the size stayed the same.

>     * compute hash
>     * hash is different: file has changed,  put into report, update index
>     * hash is the same: manual intervention is required to decide
>       whether the file should be in the report or not
> 
> + NEITHER mtime, NOR size changed: do nothing
> 
> 
> How likely is it that the hash is the same though the file did change?

Well, if you take SHA-256 for example, its hash has 256 bits, so (assuming the
hash values are uniformly distributed) there is a chance of 1 / (2**256) that
two given non-identical byte vectors will have the same hash value.

shlomif@telaviv1:~$ perl -Mbigint -E 'say 2 ** 256'
115792089237316195423570985008687907853269984665640564039457584007913129639936
shlomif@telaviv1:~$ 
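
To put the checks discussed in this thread together, here is a minimal sketch
using the core Digest::SHA module. It assumes your index stores the size, the
mtime and the SHA-256 hex digest of each file; the names has_changed and
sha256_of and the hash keys are made up for illustration, so adapt them to
whatever your index really looks like.

```perl
use strict;
use warnings;
use Digest::SHA ();

# Hypothetical helper: SHA-256 hex digest of a file, read in binary mode.
sub sha256_of {
    my ($path) = @_;
    return Digest::SHA->new(256)->addfile($path, 'b')->hexdigest;
}

# Hypothetical helper: $old is a hash ref with the keys size, mtime and
# sha256, as recorded when the file last went into the report.
sub has_changed {
    my ($path, $old) = @_;
    my @st = stat($path) or die "Cannot stat '$path': $!";
    my ($size, $mtime) = @st[7, 9];

    # A different size means the file definitely changed.
    return 1 if $size != $old->{size};

    # Same size and same mtime: assume unchanged. This is cheap, but not
    # 100% reliable, because a file can change while keeping both.
    return 0 if $mtime == $old->{mtime};

    # Same size, different mtime: let the hash decide.
    return sha256_of($path) ne $old->{sha256} ? 1 : 0;
}
```

If you want to catch changes that keep both the size and the mtime, drop the
second "return" and always compare the hashes, at the price of reading every
file.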

> And when you think of it, it's even possible (yet unlikely) that the
> file changed _and_ has still the same size and same mtime.
> 
> So how do you reliably detect whether a file has changed or not?  Create
> a copy of the file as it went into the report and when generating
> another report, compare the current file with the backup?  Maybe not a
> bad idea, it would be pretty reliable without any complicated ado.
> 

Yes, that's the only 100% fail-proof way that I can think of, but relying on
a hash is not a bad idea either.
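
For the backup-copy approach, a minimal sketch using the core File::Compare
module could look like the following. The name differs_from_backup is made up
for illustration, and it assumes you keep the copy that went into the last
report somewhere on disk.

```perl
use strict;
use warnings;
use File::Compare qw(compare);

# Hypothetical helper: compares the current file with the copy that was
# saved when the previous report was generated.
sub differs_from_backup {
    my ($current, $backup) = @_;

    # compare() returns 0 if the files are equal, 1 if they differ,
    # and -1 if an error occurred.
    my $result = compare($current, $backup);
    die "Cannot compare '$current' and '$backup': $!" if $result < 0;

    return $result == 1 ? 1 : 0;
}
```

This reads both files in full, so it costs more I/O and disk space than the
hash, but it does not depend on the hash values being different.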

Regards,

        Shlomi Fish

-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
Best Introductory Programming Language - http://shlom.in/intro-lang

COBOL is the old Java.

Please reply to list if it's a mailing list post - http://shlom.in/reply .

-- 
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/

