Hi,

On berlin.guixsd.org, offloading would sometimes hang in the middle of an offloaded build: no more build log output showing up, nothing happening (this is with guix-0.14.0-6.0dcf675).

On the build machine side, the guile process that forwards data between the sshd and guix-daemon¹ is stuck on:

  read(0, …)

with this stack trace:

--8<---------------cut here---------------start------------->8---
(gdb) bt
#0  0x00007f09d6068aed in read () from /gnu/store/3h31zsqxjjg52da5gp3qmhkh4x8klhah-glibc-2.25/lib/libpthread.so.0
#1  0x00007f09d653fc47 in fport_read () from /gnu/store/0v539yjmdqhjm1xcpvndmagkgjz5fvh2-guile-2.2.2/lib/libguile-2.2.so.1
#2  0x00007f09d656cd77 in scm_i_read_bytes () from /gnu/store/0v539yjmdqhjm1xcpvndmagkgjz5fvh2-guile-2.2.2/lib/libguile-2.2.so.1
#3  0x00007f09d65705fe in scm_fill_input () from /gnu/store/0v539yjmdqhjm1xcpvndmagkgjz5fvh2-guile-2.2.2/lib/libguile-2.2.so.1
#4  0x00007f09d6577897 in scm_get_bytevector_some () from /gnu/store/0v539yjmdqhjm1xcpvndmagkgjz5fvh2-guile-2.2.2/lib/libguile-2.2.so.1
#5  0x00007f09d65abc4d in vm_regular_engine () from /gnu/store/0v539yjmdqhjm1xcpvndmagkgjz5fvh2-guile-2.2.2/lib/libguile-2.2.so.1
#6  0x00007f09d65af2aa in scm_call_n () from /gnu/store/0v539yjmdqhjm1xcpvndmagkgjz5fvh2-guile-2.2.2/lib/libguile-2.2.so.1
#7  0x00007f09d65338d7 in scm_primitive_eval () from /gnu/store/0v539yjmdqhjm1xcpvndmagkgjz5fvh2-guile-2.2.2/lib/libguile-2.2.so.1
--8<---------------cut here---------------end--------------->8---

In theory this “cannot happen” because it reads from stdin iff ‘select’ said stdin is ready.
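
To make that invariant concrete, here is a minimal sketch of the "read only after ‘select’ reports readiness" pattern. It is written in Python purely for illustration and is not the Guile code in guix/ssh.scm; the helper name ‘forward_when_ready’, the timeout, and the chunk size are all hypothetical, and it assumes a POSIX system where ‘select’ works on pipes.

```python
import os
import select


def forward_when_ready(src_fd, dst_fd, timeout=1.0, chunk_size=4096):
    """Forward one chunk from src_fd to dst_fd, reading only if select()
    reported src_fd as readable.  Return the number of bytes forwarded
    (0 on timeout or end-of-file).  Hypothetical helper, for illustration."""
    readable, _, _ = select.select([src_fd], [], [], timeout)
    if src_fd not in readable:
        # Nothing to read: do not call read(2), which could block.
        return 0

    # Exactly one unbuffered read(2) per readiness notification.
    data = os.read(src_fd, chunk_size)

    # os.write may write fewer bytes than requested, so loop until done.
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(dst_fd, view)
        view = view[written:]
    return len(data)


# Stand-ins for the two ends being bridged (assumption: plain pipes).
in_r, in_w = os.pipe()
out_r, out_w = os.pipe()

# No data yet: select() times out and no read is attempted.
idle = forward_when_ready(in_r, out_w, timeout=0.1)

# Once data is available, it is read and forwarded.
os.write(in_w, b"build log line\n")
forwarded = forward_when_ready(in_r, out_w, timeout=0.1)
received = os.read(out_r, 4096)
```

The sketch uses the unbuffered ‘os.read’ so that one readiness notification corresponds to exactly one read(2) call; with that structure, a blocking read on an idle descriptor should not occur.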

On the server side (on berlin itself), the corresponding ‘guix offload’ process is stuck here:

--8<---------------cut here---------------start------------->8---
(gdb) bt
#0  0x00007ff49b3590bd in poll () from target:/gnu/store/3h31zsqxjjg52da5gp3qmhkh4x8klhah-glibc-2.25/lib/libc.so.6
#1  0x00007ff48f4db377 in ssh_poll_ctx_dopoll () from target:/gnu/store/3phbrya78gpk7rg6flqyqzf53y3x9zv9-libssh-0.7.5/lib/libssh.so.4
#2  0x00007ff48f4dc319 in ssh_handle_packets () from target:/gnu/store/3phbrya78gpk7rg6flqyqzf53y3x9zv9-libssh-0.7.5/lib/libssh.so.4
#3  0x00007ff48f4dc3ed in ssh_handle_packets_termination () from target:/gnu/store/3phbrya78gpk7rg6flqyqzf53y3x9zv9-libssh-0.7.5/lib/libssh.so.4
#4  0x00007ff48f4c8eff in ssh_channel_read_timeout () from target:/gnu/store/3phbrya78gpk7rg6flqyqzf53y3x9zv9-libssh-0.7.5/lib/libssh.so.4
#5  0x00007ff48f930803 in read_from_channel_port () from target:/gnu/store/xfaqdvk060yz7ddc9isk3wkybqmcfj3w-guile-ssh-0.11.2/lib/libguile-ssh.so.11
#6  0x00007ff49cea7d77 in scm_i_read_bytes () from target:/gnu/store/swyipr8smrd5bc72n92sdfxzx0p4cjpi-guile-2.2.2/lib/libguile-2.2.so.1
#7  0x00007ff49ceac3fc in scm_c_read_bytes () from target:/gnu/store/swyipr8smrd5bc72n92sdfxzx0p4cjpi-guile-2.2.2/lib/libguile-2.2.so.1
#8  0x00007ff49ceb2838 in scm_get_bytevector_n () from target:/gnu/store/swyipr8smrd5bc72n92sdfxzx0p4cjpi-guile-2.2.2/lib/libguile-2.2.so.1
#9  0x00007ff49cee6c4d in vm_regular_engine () from target:/gnu/store/swyipr8smrd5bc72n92sdfxzx0p4cjpi-guile-2.2.2/lib/libguile-2.2.so.1
#10 0x00007ff49ceea2aa in scm_call_n () from target:/gnu/store/swyipr8smrd5bc72n92sdfxzx0p4cjpi-guile-2.2.2/lib/libguile-2.2.so.1
#11 0x00007ff49ce6e8d7 in scm_primitive_eval ()
--8<---------------cut here---------------end--------------->8---

Presumably the ‘scm_get_bytevector_n’ call comes from (guix serialization) or ‘process-stderr’.

IOW we have a deadlock where both sides are waiting for input data.

Ludo’.

¹ https://git.savannah.gnu.org/cgit/guix.git/tree/guix/ssh.scm?id=0362e5820ab6a1eb8eaf33bc47e592857c25f765#n102