On 11/11/2010 04:32 PM, Bruno Haible wrote:
> Hi Eric, Paul,
>
> I'm seeing a test failure in test-utimens and test-futimens on a Linux/x86
> machine, and have a hard time understanding the reason.

NFS seems to often be a culprit in exposing timing bugs :(  (Have I ever
mentioned how much NFS disobeys the rules assumed by the standard, and how
many broken NFS implementations exist?)

> (Side note here: On a machine with a very similar configuration,
> same kernel version, same gcc version, but with a significative
> difference between NFS server time and NFS client time - ca. 7 minutes -,
> the autoconf test in m4/utimes.m4 gave the result
>   checking whether the utimes function works... no
> because at m4/utimes.m4:63 the values were
>   now = 1289514192
>   sbuf.st_atime = 1289513839
>   sbuf.st_mtime = 1289513839
> therefore now - sbuf.st_atime <= 2 is false and now - sbuf.st_mtime <= 2 is
> also false. As a consequence, on this other machine HAVE_WORKING_UTIMES
> does not get defined, and the tests pass. It therefore looks to me like
> the code in m4/utimes.m4 lines 57..64 is unreliable. End of side note.)

Yeah, NFS does weird things with timestamps when the server and client are
way out of sync.  Some timestamps are created by the client, others are
created by the server, and I've seen systems where client-only actions that
are not flushed back to the server use monotonically increasing (but
completely bogus) timestamps to track that the server will eventually supply
a real timestamp in that field.  I thought I nailed all of those issues in
the test-utime* files back when I went through several iterations of
fine-tuning the tests, by carefully isolating which actions caused a
difference between server and client timestamps, and only comparing events
that should have had the timestamp generated from the same machine.  But
there may yet be another fix to make for the .m4 file.
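
To make the arithmetic in the side note concrete, here is a minimal C sketch.
The helper names (fresh_by_wall_clock, fresh_by_reference,
check_skew_examples) are hypothetical and not taken from m4/utimes.m4; only
the "now - stamp <= 2" comparison and the numbers come from the quoted
report. The second helper illustrates the idea of only comparing timestamps
generated by the same machine, under the assumption that such a reference
timestamp is available.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: the comparison quoted from m4/utimes.m4, where NOW
   comes from the client clock and STAMP may come from the NFS server.  */
static bool
fresh_by_wall_clock (time_t now, time_t stamp)
{
  return now - stamp <= 2;
}

/* Hypothetical alternative: REFERENCE and STAMP are both assumed to have
   been generated by the same machine, so a constant offset between the
   client and server clocks cancels out.  */
static bool
fresh_by_reference (time_t reference, time_t stamp)
{
  time_t diff = stamp >= reference ? stamp - reference : reference - stamp;
  return diff <= 2;
}

/* Values from the report: the file stamps lag the client clock by 353
   seconds, so the wall-clock comparison rejects a perfectly fresh file.  */
static void
check_skew_examples (void)
{
  time_t now = 1289514192;
  time_t file_stamp = 1289513839;

  assert (now - file_stamp == 353);
  assert (!fresh_by_wall_clock (now, file_stamp));

  /* Assumed second stamp taken from the same (server) clock one second
     later: the same-clock comparison accepts it despite the skew.  */
  assert (fresh_by_reference (file_stamp + 1, file_stamp));
}
```

Calling check_skew_examples () from a test driver exercises both cases; it
only demonstrates the comparison, not how a configure test would obtain a
same-machine reference stamp.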

>
> Analyzing the test-utimens failure:
> two sets of values get printed:
>
>   946684800 946684800 946684800
>   946684800 946684801 946684800
>
> The first line is from
>   test_utimens (utimens, true);
> The second line is from
>   test_utimens (do_fdutimens, true);
>
> So, it uses fdutimens() from gnulib and futimes() from glibc.
>
> Adding HAVE_BUGGY_NFS_TIME_STAMPS=1 does not help.
>
> Here are the corresponding library calls (ltrace):
>
> open64("test-utimens.tfile", 1, 027777721520) = 3
> __errno_location() = 0xb75e3060
> fsync(3, 0xbfffa1e0, 320, 0x804b74d, 1) = 0
> futimes(3, 0xbfffa1e0, 320, 0x804b74d, 1) = 0
> close(3) = 0
> __xstat64(3, "test-utimens.tfile", 0xbfffa370) = 0
>
> At the moment of the futimes() function gets called, here are its arguments:
>
> Breakpoint 1, fdutimens (fd=6, file=0x804f8c9 "test-utimens.tfile",
>     timespec=0x0) at utimens.c:337
> 337 if (futimes (fd, t) == 0)
> (gdb) print t[0]
> $2 = {tv_sec = 946684800, tv_usec = 499999}
> (gdb) print t[1]
> $3 = {tv_sec = 946684800, tv_usec = 999999}
>
> So, you can see, the times passed have been truncated (not rounded) to
> microsecond resolution.

Correct - per POSIX 2008, timestamps should always be truncated, never
rounded up.

>
> And the corresponding system calls (strace):
>
> open("test-utimens.tfile", O_WRONLY|O_LARGEFILE) = 3
> fsync(3) = 0
> utime("/proc/self/fd/3", [2000/01/01-00:00:00, 2000/01/01-00:00:01]) = 0
> close(3) = 0
> stat64("test-utimens.tfile", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
>
> As you can see, the futimes() call resulted in an utime() call - and the
> mtime has been *rounded* to the nearest second. This comes from code
> at glibc/sysdeps/unix/sysv/linux/futimes.c. The rounding in there is present
> since the very first version of this file.

Ouch - that's a bug in glibc.

>
> Questions:
> - Is glibc's futimes() implementation correct? Is futimes() allowed to round
>   up by as much as half a second?

No.

> - If not, shouldn't gnulib work around it?

Yes.

> - If yes, is the code that invokes futimes in lib/utimens.c correct?

Probably not yet, given the newly exposed glibc bug (the addition of
utimensat in newer kernels and glibcs hides the bug on modern machines,
since those correctly truncate).

> - Is the test correct, or should it allow a rounded-up mtime?

In this case, the test seems to be correct.

-- 
Eric Blake   ebl...@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org
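
For reference, the difference between truncating and rounding discussed
above can be sketched in a few lines of C. The function names are
hypothetical, and seconds_rounded () is only an illustration of
round-to-nearest, not a copy of glibc's futimes.c; the two timeval values
are the ones shown in the gdb output.

```c
#include <assert.h>
#include <sys/time.h>
#include <time.h>

/* Truncation: drop the sub-second part, leaving the seconds unchanged.  */
static time_t
seconds_truncated (struct timeval tv)
{
  return tv.tv_sec;
}

/* Round-to-nearest: the behaviour the report attributes to the utime()
   fallback.  Illustration only, not glibc's actual code.  */
static time_t
seconds_rounded (struct timeval tv)
{
  return tv.tv_sec + (tv.tv_usec >= 500000 ? 1 : 0);
}

/* The two values printed by gdb in the report.  */
static void
check_rounding_examples (void)
{
  struct timeval atime = { .tv_sec = 946684800, .tv_usec = 499999 };
  struct timeval mtime = { .tv_sec = 946684800, .tv_usec = 999999 };

  /* Truncation leaves both stamps at 946684800.  */
  assert (seconds_truncated (atime) == 946684800);
  assert (seconds_truncated (mtime) == 946684800);

  /* Rounding keeps atime but bumps mtime to 946684801, matching the
     utime() arguments seen in the strace output.  */
  assert (seconds_rounded (atime) == 946684800);
  assert (seconds_rounded (mtime) == 946684801);
}
```

A test that expects the truncated value therefore sees an mtime one second
too large whenever the sub-second part is at least half a second.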