tags 16361 + notabug wontfix
close 16361
thanks

Zefram <zef...@fysh.org> writes:

> The automatic cache of compiled versions of scripts in guile-2.0.9
> identifies scripts mainly by name, and partially by mtime.  This is not
> actually sufficient: it is easily misled by a pathname that refers to
> different files at different times.  Test case:
>
> $ echo '(display "aaa\n")' >t13
> $ echo '(display "bbb\n")' >t14
> $ guile-2.0 t13
> ;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
> ;;;       or pass the --no-auto-compile argument to disable.
> ;;; compiling /home/zefram/usr/guile/t13
> ;;; compiled /home/zefram/.cache/guile/ccache/2.0-LE-8-2.0/home/zefram/usr/guile/t13.go
> aaa
> $ mv t14 t13
> $ guile-2.0 t13
> aaa
>
> You can see that the mtime is not fully used here: the cache is misapplied
> even if there is a delay of seconds between the creations of the two
> script files.  The cache's mtime check will only notice a mismatch if
> the script currently seen under the supplied name was modified later
> than when the previous script was *compiled*.
>
> Obviously, in this test case the cache could trivially distinguish the
> two script files by looking at the inode numbers.  On its own the inode
> number isn't sufficient, but exact match on device, inode number, and
> mtime would be far superior to the current behaviour, only going wrong
> in the presence of deliberate timestamp manipulation.  As a bonus, if
> the cache were actually *keyed* by inode number and device, rather than
> by pathname, it would retain the caching of compilation across renamings
> of the script.
>
> Or, even better, the cache could be keyed by a cryptographic hash of the
> file contents.  This would be immune even to timestamp manipulation, and
> would preserve the cached compilation even across the script being copied
> to a fresh file or being edited and reverted.  This would be a cache
> worthy of the name.  The only downside is the expense of computing the
> hash, but I expect this is small compared to the expense of compilation.
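
For reference, the two strategies under discussion can be compared outside Guile. The Python sketch below is not Guile's code: `cache_is_fresh`, `content_key`, and `demo` are hypothetical names, and the freshness rule is only the one the report describes (the cached file is reused unless the source was modified after the cached file was written). It replays the `mv t14 t13` test case with explicit timestamps.

```python
import hashlib
import os
import tempfile


def cache_is_fresh(source_path, compiled_path):
    """mtime check as described in the report (an assumption, not Guile's
    actual code): reuse the cached file unless the source is newer."""
    src = os.stat(source_path).st_mtime_ns
    compiled = os.stat(compiled_path).st_mtime_ns
    return src <= compiled


def content_key(source_path):
    """Alternative cache key: SHA-256 of the file contents.
    Note that this has to read the whole file on every lookup."""
    with open(source_path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def demo():
    """Replay the test case; return (mtime_says_fresh, hash_key_changed)."""
    with tempfile.TemporaryDirectory() as d:
        t13 = os.path.join(d, "t13")
        t14 = os.path.join(d, "t14")
        go = os.path.join(d, "t13.go")  # stand-in for the cached .go file

        with open(t13, "w") as f:
            f.write('(display "aaa\\n")')
        with open(t14, "w") as f:
            f.write('(display "bbb\\n")')
        with open(go, "w") as f:
            f.write("compiled from aaa")

        # Fixed timestamps (in nanoseconds) so the result does not depend on
        # clock resolution: both scripts predate the "compilation".
        os.utime(t13, ns=(1_000_000_000, 1_000_000_000))
        os.utime(t14, ns=(2_000_000_000, 2_000_000_000))
        os.utime(go, ns=(3_000_000_000, 3_000_000_000))

        key_before = content_key(t13)
        os.replace(t14, t13)  # like `mv t14 t13`; rename keeps t14's mtime

        return cache_is_fresh(t13, go), key_before != content_key(t13)


if __name__ == "__main__":
    print(demo())
```

Under these assumptions the mtime rule still reports the cached file as fresh after the rename, while the content key changes. The key is only obtained by reading and hashing the whole script on each load, whereas the mtime rule needs two `stat` calls.
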

You could make the same complaint about 'make', 'rsync', or any number
of other programs.  It's true that a cryptographic hash would be more
robust, but it would also be considerably more expensive in the common
case where the .go file is already in the cache.

I don't think it's worth paying this cost every time a .go file is
loaded, to guard against the unlikely scenario you outlined above.  The
mtime check is very widely used, and accepted practice.

I'm closing this ticket.

    Mark