On 24/05/15 12:33, Ludovic Courtès wrote:
> (Please keep 20...@debbugs.gnu.org Cc'd.)
> (Gnulib: please scroll further down for the ‘unlinkat’ issue.)
>
> Andy Patterson <ajpat...@uwaterloo.ca> wrote:
>
>>> I suppose this is Guix 0.8.2 on top of another distribution, right? Did
>>> you install from source or from the binary tarball? Did you enable
>>> substitutes (info "(guix) Substitutes")?
>>
>> I was using the USB install medium in a live environment.
>
> So this is on GuixSD 0.8.2. ‘test-suite.log’ indeed mentions
> Linux-libre 4.0.2.
>
>> I had substitutes enabled (I'm pretty sure they're enabled by default
>> here, but I also enabled them manually just to be sure). I wasn't able
>> to install anything with substitutes enabled; it would always stall
>> while trying to update the substitutes list from hydra. When my
>> network went down briefly, it informed me that it was still at 0.0%
>> before exiting. I think that this is probably a separate issue, but
>> one which I was less concerned about since I didn't want to use
>> substitutes anyway.
>
> OK.
>
> hydra.gnu.org is unfortunately too often overloaded these days, so you
> probably arrived on a bad day. Nevertheless, the solution to this
> specific issue is for you to use substitutes to circumvent the bug
> described below.
>
>>> Does the build succeed if you run it another time with:
>>>
>>>   guix build tar -K -c 1
>>
>> I tried this (with --no-substitutes), but I don't think the test suite
>> actually runs in parallel. I didn't notice any difference in that regard
>> when it was running; it seemed to take up the same amount of time with
>> or without -c 1. I had the same tests fail with the flag enabled.
>
> Oh you must be right. Looking at tests/Makefile.in, I see:
>
> --8<---------------cut here---------------start------------->8---
> check-local: atconfig atlocal $(TESTSUITE)
> 	$(SHELL) $(TESTSUITE) $(TESTSUITEFLAGS)
> --8<---------------cut here---------------end--------------->8---
>
> ... which shows that ./testsuite is not automatically passed -j,
> contrary to what I thought.
>
> <http://lists.gnu.org/archive/html/bug-tar/2014-08/msg00010.html>
> reports a similar issue but on a different OS.
>
> I just tried this in a GuixSD VM with Linux-libre 4.0.2:
>
> --8<---------------cut here---------------start------------->8---
> mkdir foo
> mkdir bar
> echo foo/foo_file > foo/foo_file
> echo bar/bar_file > bar/bar_file
> tar -cvf foo.tar --remove-files -C foo . -C ../bar .
> find .
> stat bar
> --8<---------------cut here---------------end--------------->8---
>
> And indeed, it fails (that is, ‘bar’ is left behind.) It works fine on
> 4.0.4-gnu though.
>
> On 4.0.2-gnu, I strace’d the ‘tar’ command above:
>
> --8<---------------cut here---------------start------------->8---
> openat(AT_FDCWD, "foo", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 4
>
> [...]
>
> openat(4, ".", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC) = 5
>
> [...]
>
> openat(5, "foo_file", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC) = 6
>
> [...]
>
> openat(4, "../bar", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 5
> newfstatat(5, ".", {st_mode=S_IFDIR|0755, st_size=60, ...}, AT_SYMLINK_NOFOLLOW) = 0
> openat(5, ".", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC) = 6
>
> [...]
>
> openat(6, "bar_file", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC) = 7
> fstat(7, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> write(1, "./bar_file\n", 11) = 11
> read(7, "x\n", 2) = 2
> fstat(7, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> close(7) = 0
> fstat(6, {st_mode=S_IFDIR|0755, st_size=60, ...}) = 0
> brk(0x1a34000) = 0x1a34000
> close(6) = 0
> write(3, "./\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 10240) = 10240
> close(3) = 0
> unlinkat(4, "foo_file", 0) = 0
> unlinkat(AT_FDCWD, "foo", AT_REMOVEDIR) = 0
> unlinkat(5, "bar_file", 0) = 0
> unlinkat(4, "../bar", AT_REMOVEDIR) = -1 ENOENT (No such file or directory)
> --8<---------------cut here---------------end--------------->8---
>
> Contrast this with the same thing on 4.0.4-gnu:
>
> --8<---------------cut here---------------start------------->8---
> unlinkat(4, "foo_file", 0) = 0
> unlinkat(AT_FDCWD, "foo", AT_REMOVEDIR) = 0
> unlinkat(5, "bar_file", 0) = 0
> unlinkat(4, "../bar", AT_REMOVEDIR) = 0
> --8<---------------cut here---------------end--------------->8---
>
> So this looks like a 4.0.2 kernel bug that Gnulib’s unlinkat should
> perhaps work around.
>
> Thoughts?

Maybe. How widely deployed was 4.0.2? (It's not used in Red Hat land,
for example.) How many versions was the bug present for? If it was just
a fleeting issue, then there is less incentive to work around it.

cheers,
Pádraig
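
For anyone wanting to check which kernel versions are affected, the
unlinkat sequence from the strace can be replayed without tar. What
follows is only a minimal Python sketch, not code from tar or Gnulib.
The function name replay_unlinkat_sequence is made up for illustration,
and it passes an explicit directory descriptor where tar used AT_FDCWD,
so that the working directory of the process is left alone. It returns
0 if ‘bar’ gets removed and the errno otherwise. Going by the traces
above, one would expect ENOENT on 4.0.2-gnu and 0 on 4.0.4-gnu, but I
have not run it on either.

```python
import errno
import os
import tempfile


def replay_unlinkat_sequence(workdir):
    """Replay the unlinkat() calls from the strace; return 0 or an errno."""
    # Same layout as the shell reproducer: foo/foo_file and bar/bar_file.
    for name in ("foo", "bar"):
        os.mkdir(os.path.join(workdir, name))
        with open(os.path.join(workdir, name, name + "_file"), "w") as f:
            f.write("x\n")

    # Assumption: root_fd stands in for AT_FDCWD in the original trace.
    root_fd = os.open(workdir, os.O_RDONLY | os.O_DIRECTORY)
    # tar opens "foo" first, then "../bar" relative to the "foo" descriptor.
    foo_fd = os.open("foo", os.O_RDONLY | os.O_DIRECTORY, dir_fd=root_fd)
    bar_fd = os.open("../bar", os.O_RDONLY | os.O_DIRECTORY, dir_fd=foo_fd)
    try:
        os.unlink("foo_file", dir_fd=foo_fd)  # unlinkat(4, "foo_file", 0)
        os.rmdir("foo", dir_fd=root_fd)       # unlinkat(..., "foo", AT_REMOVEDIR)
        os.unlink("bar_file", dir_fd=bar_fd)  # unlinkat(5, "bar_file", 0)
        try:
            # unlinkat(4, "../bar", AT_REMOVEDIR): the path is resolved
            # relative to the descriptor of the already-removed "foo".
            os.rmdir("../bar", dir_fd=foo_fd)
        except OSError as e:
            return e.errno
        return 0
    finally:
        for fd in (bar_fd, foo_fd, root_fd):
            os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        rc = replay_unlinkat_sequence(d)
        if rc == 0:
            print("bar removed")
        else:
            print("rmdir failed:", errno.errorcode.get(rc, str(rc)))
```

Running this on a few kernel versions would give a rough answer to the
"how many versions" question without needing the tar test suite.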