Hi Mark,

Mark H Weaver <m...@netris.org> skribis:

> Ludovic Courtès <l...@gnu.org> writes:
>
>> Mark H Weaver <m...@netris.org> skribis:
>>
>>> The source checkout currently being transferred for build 3432472
>>> (/gnu/store/…-font-google-material-design-icons-3.0.1-checkout) is 176
>>> megabytes uncompressed, as measured by "du -s --si", which is not
>>> precisely same as NAR size, but hopefully close enough for a rough
>>> estimate. As I write this, build 3432472 been stuck here for 24 hours
>>> 15 minutes. Even if the average transfer rate were 4 kilobytes per
>>> second, it should have been done in half that time.
>>
>> This is weird, could it be that data transfers get stuck somehow?
>
> As far as I can tell, that's what seems to happen.
>
>> Did you try to check the status of the ‘nix-store’ and ‘guix offload’
>> processes on the head node?
>
> Here are the corresponding 'guix offload' processes:
>
> hydra@20121227-hydra:~$ ps auxwwf | head -1; ps auxwwf | egrep -B1 'off()load'
> [...]
> root 14769 0.0 0.2 145668 10912 ? SLsl Apr07 0:16 | |
> \_ /gnu/store/yihvhxv3xyyvl1m2cy1lnf1lyi9h76fk-guile-2.2.2/bin/guile
> --no-auto-compile
> /gnu/store/fkkjhida23k612naa9d4q6avqj5v3b28-guix-0.13.0-8.357ab93/bin/.guix-real
> offload x86_64-linux 3600 1 72000

The problem is that this is an ancient Guix. In the meantime,
offloading has seen relevant changes, in particular things like commit
ed7b44370f71126087eb953f36aad8dc4c44109f, which address stability issues
with the Guile-SSH (ssh dist node) module that was previously used.

I think we should upgrade Guix on hydra.gnu.org; otherwise we’re likely
to end up chasing old bugs.

> The 'nix-store' processes seem to be stuck sleeping in 'read', if I'm
> interpreting the 'strace' output correctly:
>
> root@20121227-hydra:~# strace -p 8983
> Process 8983 attached - interrupt to quit
> read(3, ^C <unfinished ...>
> Process 8983 detached
> root@20121227-hydra:~# strace -p 14767
> Process 14767 attached - interrupt to quit
> read(3, ^C <unfinished ...>
> Process 14767 detached
>
>
> "netstat --inet --program" shows that the SSH connections are still
> open:
>
> root@20121227-hydra:~# netstat --inet --program | grep 'hydra\.net\.in\.tum\.'
> tcp 0 0 20121227-hydra.gn:53216 hydra.net.in.tum.de:ssh ESTABLISHED 14769/guile
> tcp 0 0 20121227-hydra.gn:52434 hydra.net.in.tum.de:ssh ESTABLISHED 8985/guile
> tcp 0 0 20121227-hydra.gnu.:www hydra.net.in.tum.:52104 TIME_WAIT -
> tcp 0 0 20121227-hydra.gnu.:www hydra.net.in.tum.:52103 TIME_WAIT -

This could be the kind of issue that we had with (ssh dist node). It’s
hard to tell.

> I could easily believe that this problem is specific to
> hydra.gnunet.org, but even if that's the case, it would be good if
> offloading would reliably time out before days have passed.

That’s the case with commit a708de151c255712071e42e5c8284756b51768cd,
but again, the Guix installation on hydra may predate that. :-/

Thanks,
Ludo’.