Hi Lee,

On Thu, 13 Jun 2013 22:51:24 +0200
lee <l...@yun.yagibdah.de> wrote:

> "Dr.Ruud" <rvtol+use...@isolution.nl> writes:
>
> > On 12/06/2013 10:27, lee wrote:
> >
> >> File sizes do not reliably indicate whether a file has been modified or
> >> not.
> >
> > If the file size has changed, then your file has changed. That is 100%
> > reliable, and is a quick and cheap check.
>
> It works only one way: different size --> file has changed. What
> doesn't work is: file has changed --> different size.

Right.

>
> > But if the size hasn't changed, then you still need to check something
> > else. You can do another light check, or decide to do the heavy one.
> >
> > This is also important because a hash-value is only a fingerprint, so
> > different files have (a small chance on having) the same hash value.
> >
> > The file size check makes the chance even smaller that you don't
> > detect the change.
>
> Hm ok, this kinda sucks ... Imagine I check size and mtime and I have
> a SHA-256 hash. Now there are the following cases:
>
>
> + mtime AND size changed: file has changed, put into report, update index
>
> + EITHER mtime changed, size is same, OR size changed, mtime is same:

If the size changed, then the file definitely changed, so there is no need
to compute the hash in that case.

>   * compute hash
>   * hash is different: file has changed, put into report, update index
>   * hash is the same: manual intervention is required to decide
>     whether the file should be in the report or not
>
> + NEITHER mtime, NOR size changed: do nothing
>
>
> How likely is it that the hash is the same though the file did change?

Well, if you take SHA-256 for example, its hash has 256 bits, so (assuming
the hash behaves like a uniformly random function) the chance that two given
non-identical byte vectors have the same hash is 1 / (2**256):

    shlomif@telaviv1:~$ perl -Mbigint -E 'say 2 ** 256'
    115792089237316195423570985008687907853269984665640564039457584007913129639936

That is about 1.16e77, so an accidental collision is not something you are
likely to ever see in practice.

> And when you think of it, it's even possible (yet unlikely) that the
> file changed _and_ has still the same size and same mtime.
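
For what it's worth, the checks discussed above could be sketched in Perl
roughly as follows. This is only a sketch under assumptions: the helper names
fingerprint() and classify(), the layout of the index record and the result
labels are made up for illustration, and a file whose size and mtime are both
unchanged is treated as unchanged, which (as you say) can in principle miss a
change. Digest::SHA is a core module.

```perl
use strict;
use warnings;
use Digest::SHA ();

# Hypothetical helper: build the record that would be stored in the index.
sub fingerprint {
    my ($path) = @_;
    my @st = stat($path) or die "Cannot stat $path: $!";
    # 'b' asks Digest::SHA to read the file in binary mode.
    my $digest = Digest::SHA->new(256)->addfile($path, 'b')->hexdigest;
    return { size => $st[7], mtime => $st[9], sha256 => $digest };
}

# Hypothetical helper: compare a file against its stored record ($old).
# Returns 'changed', 'unchanged' or 'same_hash'.
sub classify {
    my ($old, $path) = @_;
    my @st = stat($path) or die "Cannot stat $path: $!";
    my ($size, $mtime) = @st[7, 9];

    # A different size always means different content; no hash needed.
    return 'changed' if $size != $old->{size};

    # Same size and same mtime: the cheap path, assumed to mean "unchanged".
    return 'unchanged' if $mtime == $old->{mtime};

    # Same size, different mtime: let the SHA-256 digest decide.
    my $digest = Digest::SHA->new(256)->addfile($path, 'b')->hexdigest;
    return $digest eq $old->{sha256} ? 'same_hash' : 'changed';
}
```

You would call fingerprint($path) when the file goes into the report, and
classify($record, $path) on the next run. What to do with 'same_hash' (new
mtime, same digest) is a policy decision; the sketch only reports it.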
>
> So how do you reliably detect whether a file has changed or not? Create
> a copy of the file as it went into the report and when generating
> another report, compare the current file with the backup? Maybe not a
> bad idea, it would be pretty reliable without any complicated ado.
>

Yes, that's the only 100% fail-proof way that I can think of, but relying on
a hash is not a bad idea either.

Regards,

	Shlomi Fish

-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
Best Introductory Programming Language - http://shlom.in/intro-lang

COBOL is the old Java.

Please reply to list if it's a mailing list post - http://shlom.in/reply .
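
P.S. A minimal sketch of the copy-and-compare approach mentioned above,
assuming the core modules File::Compare and File::Copy. The helper names
differs_from_backup() and remember(), and the idea of passing the backup path
explicitly, are hypothetical and only there for illustration.

```perl
use strict;
use warnings;
use File::Compare qw(compare);
use File::Copy qw(copy);

# Hypothetical helper: true if $path differs byte for byte from the copy
# kept in $backup, or if no such copy exists yet.
sub differs_from_backup {
    my ($path, $backup) = @_;
    return 1 unless -e $backup;
    # compare() returns 0 if equal, 1 if different, -1 on error.
    my $result = compare($path, $backup);
    die "Cannot compare $path and $backup: $!" if $result < 0;
    return $result;
}

# Hypothetical helper: remember the file as it went into the report.
sub remember {
    my ($path, $backup) = @_;
    copy($path, $backup) or die "Cannot copy $path to $backup: $!";
    return;
}
```

The obvious cost is that every tracked file is stored twice and has to be
read in full on each run, which is the trade-off against keeping only a hash.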