Ludovic Courtès <ludovic.cour...@inria.fr> skribis: > Inside the childhurd, we get: > > root@childhurd ~# tail -1 /var/log/guix-daemon.log > 2024-09-09 21:09:03 unexpected build daemon error: stoi > > Last time we got that error was in commit > 21deb89e287b5821975544118bf137562a91d4e1: guix-daemon was running with > incorrect locale data, and thus ‘std::stoi’ would throw (‘std::stoi’ in > guix-daemon is used for the communication with the ‘guix authenticate’ > process, see ‘readAuthenticateReply’).
I wanted to ‘rpctrace’ to see what ‘guix authenticate’ and ‘guix-daemon’ are telling each other. The problem is that ‘rpctrace’ is kinda broken, or at least it wrecks havoc when trying to follow forks or something. But I found a trick that allowed me to trace just ‘guix authenticate’: --8<---------------cut here---------------start------------->8--- root@childhurd ~# cat intercept-guix.sh #!/bin/sh /usr/bin/env > /tmp/env.log echo >> /tmp/env.log echo "$@" >> /tmp/env.log exec rpctrace -o /tmp/rpctrace.log -s 200 /gnu/store/cg64w7mv1v2r188rzcgksdva9cvj8vir-guix-1.4.0-24.9a2ddcc/bin/guix "$@" root@childhurd ~# LC_ALL=fr_FR.utf8 GUIX=$PWD/intercept-guix.sh /gnu/store/cg64w7mv1v2r188rzcgksdva9cvj8vir-guix-1.4.0-24.9a2ddcc/bin/guix-daemon --build-users-group guixbuild --max-silent-time 3600 --timeout 86400 --log-compression gzip --discover=no --disable-chroot --disable-deduplication --8<---------------cut here---------------end--------------->8--- (Here “LC_ALL=fr_FR.utf8” allows me to reproduce the stoi bug.) We get this: --8<---------------cut here---------------start------------->8--- root@childhurd ~# tail /tmp/rpctrace.log 5<--35(pid166)->dir_lookup ("gnu/store/9ghq6s4mq5sff9cwqrmn26ivycn3p8ql-guile-3.0.9/lib/guile/3.0/ccache/system/foreign-library.go" 4194305 0) = 0 1 "" 67<--70(pid166) 67<--70(pid166)->io_seek_request (0 2) = 0 98141 67<--70(pid166)->io_map_request () = 0 69<--66(pid166) (null) task30(pid166)-> 2089 (0 98141 0 1 69<--66(pid166) 0 1 1 7 1) = 0 61452288 task30(pid166)-> 3206 (pn{ 28}) = 0 task30(pid166)-> 3206 (pn{ 27}) = 0 task30(pid166)-> 2024 (61517824 18824 0 3) = 0 14<--31(pid166)->io_write_request ("GC Warning: Repeated allocation of very large block (appr. size 112 KiB):\n\tMay lead to memory leak and poor performance\n" -1) = 0 120 task30(pid166)-> 2012 (1 22) = 0 {0 25 195772416 15151104 0 0 0 0 1726072790 570000 0 0 0 0 0 0 0 0 1726072790 0 570000000 0} task30(pid166)-> 2012 (3 12) = 0 {0 30000 0 20000 0 0 30000000 0 0 0 20000000 0} --8<---------------cut here---------------end--------------->8--- That warning comes from libgc and there’s no such warning when LC_ALL is left unset, which is why it works in that case. The warning goes to stderr, but that’s what guix-daemon ends up reading, hence the ‘stoi’ error. One can reproduce that by printing something on (current-error-port) from ‘guix authenticate’ and running, say, “make check TESTS=tests/store.scm”: a bunch of tests fail. It’s not clear to me why libgc keeps printing that warning on i586-gnu (I think it’s fine on i686-linux so there might be a portability issue), and it’s not clear either why changing the locale triggers the warning (it might have to do with the so-called GC “black lists”, since loading a locale effectively changes the address space layout somewhat). I see several possible things to do: 1. Disable GC warnings altogether in Guix by setting ‘GC_warn_proc’. 2. Make sure guix-daemon discards stderr from agents such as ‘guix authenticate’. 3. Figure out why those “Repeated allocation” messages are so frequent on i586-gnu and Do Something About It™. Thoughts? Ludo’.