On Wed, Jun 06, 2007 at 12:04:35AM +0200, Jean-Marc Lasgouttes wrote:
> >>>>> "Mael" == Mael Hilléreau <[EMAIL PROTECTED]> writes:
>
> >> I am not sure we have evidence that the checksum is costing us too
> >> much.
>
> Mael> Perhaps true for regular files (O(n) complexity), but not really
> Mael> for packages (O(n^2) complexity, supposing that there are no
> Mael> subdirectories)...
>
> The complexity depends on the total size of the data. A big file is
> worse than a small directory.
Not necessarily true. Reading large files is fairly cheap on Windows;
reading lots of small files is not, even if the small files make up only
a fraction of the big file's total size.

> Mael> But the first question to answer is: do we need it??
>
> I think we do, but I do not remember why it is better than our old
> time-based version. One advantage is that you detect when a file has
> been updated without change: typical of the .aux file written by LaTeX.
>
> Mael> Further, if we need it for files, do we need it for directories?
>
> Yes if we want directories to masquerade as files.
>
> It should be possible to feed in order all the bytes of all the files
> in the directory to the crc checker. If we have a proper directory
> iterator in boost, it should be fairly easy.

It would btw be sufficient not to crc all the files in a directory but
only the _checksums_ of the files contained in the directory.

Andre'
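For illustration, the checksums-of-checksums idea above could be sketched
roughly as follows (Python here purely for brevity; LyX itself is C++,
and the function names are made up for this sketch). Each file is
checksummed once, and the directory checksum is then computed over the
file names and their checksums rather than over all the file bytes again;
sorting the names keeps the result independent of listing order:

```python
import os
import zlib

def file_crc(path):
    """CRC32 of a single file's contents, read in chunks."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF

def dir_crc(dirpath):
    """Directory checksum built from per-file checksums only,
    not from re-feeding every file's bytes.  Names are sorted so
    the result does not depend on directory-listing order."""
    crc = 0
    for name in sorted(os.listdir(dirpath)):
        full = os.path.join(dirpath, name)
        if os.path.isfile(full):
            # mix in the file name plus its 4-byte checksum
            entry = name.encode() + file_crc(full).to_bytes(4, "big")
            crc = zlib.crc32(entry, crc)
    return crc & 0xFFFFFFFF
```

Note this only saves work if the per-file checksums are cached (e.g. kept
from a previous run and invalidated by mtime); otherwise every byte is
still read once either way.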