l...@gnu.org (Ludovic Courtès) wrote:

> Besides, commit e8c9f04 is interesting: ‘substitute*’ will now break
> non-UTF-8 files by default (replacing invalid UTF-8 sequences with
> question marks in the output.)

Based on that observation, commit dd0a8ef forced the ‘patch-*’
procedures to treat files as if they were ISO-8859-1, i.e., leaving
their byte sequence uninterpreted, and thus avoiding multibyte sequence
decoding errors.

Then, as Mark suggested, commit 4db8716 forces strict handling of
encoding/decoding errors.

The problem then is that we’re getting things like
<http://hydra.gnu.org/build/263170/nixlog/1/raw>:

--8<---------------cut here---------------start------------->8---
phase `unpack' succeeded after 0 seconds
starting phase `patch-usr-bin-file'
patch-/usr/bin/file: ./configure: changing `/usr/bin/file' to `/gnu/store/a31g38iykai59jqmcwknxyjddc5zxm9b-file-5.22/bin/file'
patch-/usr/bin/file: ./configure: changing `/usr/bin/file' to `/gnu/store/a31g38iykai59jqmcwknxyjddc5zxm9b-file-5.22/bin/file'
patch-/usr/bin/file: ./configure: changing `/usr/bin/file' to `/gnu/store/a31g38iykai59jqmcwknxyjddc5zxm9b-file-5.22/bin/file'
patch-/usr/bin/file: ./configure: changing `/usr/bin/file' to `/gnu/store/a31g38iykai59jqmcwknxyjddc5zxm9b-file-5.22/bin/file'
patch-/usr/bin/file: ./configure: changing `/usr/bin/file' to `/gnu/store/a31g38iykai59jqmcwknxyjddc5zxm9b-file-5.22/bin/file'
patch-/usr/bin/file: ./configure: changing `/usr/bin/file' to `/gnu/store/a31g38iykai59jqmcwknxyjddc5zxm9b-file-5.22/bin/file'
patch-/usr/bin/file: ./configure: changing `/usr/bin/file' to `/gnu/store/a31g38iykai59jqmcwknxyjddc5zxm9b-file-5.22/bin/file'
patch-/usr/bin/file: ./configure: changing `/usr/bin/file' to `/gnu/store/a31g38iykai59jqmcwknxyjddc5zxm9b-file-5.22/bin/file'
patch-/usr/bin/file: ./configure: changing `/usr/bin/file' to `/gnu/store/a31g38iykai59jqmcwknxyjddc5zxm9b-file-5.22/bin/file'
Backtrace:
[...]
 745: 10 [patch-/usr/bin/file "./configure" #:file-command ...]
In ice-9/boot-9.scm:
 171: 9 [with-throw-handler #t ...]
 867: 8 [call-with-input-file "./configure" ...]
In /gnu/store/wcrp88qjv5bfhwcsxhbiqfh29da8pg81-module-import/guix/build/utils.scm:
 474: 7 [#<procedure 1998e80 at /gnu/store/wcrp88qjv5bfhwcsxhbiqfh29da8pg81-module-import/guix/build/utils.scm:473:10 (in)> #<input: ./configure 11>]
 500: 6 [#<procedure 1a092c0 at /gnu/store/wcrp88qjv5bfhwcsxhbiqfh29da8pg81-module-import/guix/build/utils.scm:496:6 (in out)> #<input: ./configure 11> ...]
In srfi/srfi-1.scm:
 465: 5 [fold #<procedure 17b41c0 at /gnu/store/wcrp88qjv5bfhwcsxhbiqfh29da8pg81-module-import/guix/build/utils.scm:500:32 (r+p line)> ...]
In /gnu/store/wcrp88qjv5bfhwcsxhbiqfh29da8pg81-module-import/guix/build/utils.scm:
 503: 4 [#<procedure 17b41c0 at /gnu/store/wcrp88qjv5bfhwcsxhbiqfh29da8pg81-module-import/guix/build/utils.scm:500:32 (r+p line)> # ...]
In ice-9/regex.scm:
 189: 3 [list-matches # ...]
 176: 2 [fold-matches # ...]
In unknown file:
   ?: 1 [regexp-exec # ...]
In ice-9/boot-9.scm:
 106: 0 [#<procedure 1998ec0 at ice-9/boot-9.scm:97:6 (thrown-k . args)> encoding-error ...]

ice-9/boot-9.scm:106:20: In procedure #<procedure 1998ec0 at ice-9/boot-9.scm:97:6 (thrown-k . args)>:
ice-9/boot-9.scm:106:20: Throw to key `encoding-error' with args `("scm_to_stringn" "cannot convert narrow string to output locale" 84 #f #f)'.
--8<---------------cut here---------------end--------------->8---

The failure here occurs when using ‘guile-final’ (which has full iconv
support.)  When it stumbles upon the © sign in ‘configure’, it reads it,
with ‘read-line’, as the sequence #\302 #\251.  However, when that line
is passed back to ‘regexp-exec’, ‘regexp-exec’ calls
‘scm_to_locale_string’ on it, which fails with the error above: in this
build, we’re running in the C locale, and #\302, a.k.a. #\Â, cannot be
represented in ASCII (the encoding of the C locale.)

To solve that problem, commit 87c8b92 makes UTF-8 locales available
right after ‘guile-final’ is built.  That way, calls to
‘scm_to_locale_string’ actually convert to UTF-8, which always works.
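
To make the failure mode concrete, here is a minimal sketch of the same
three steps, written in Python rather than Guile.  It is only an
analogy: Python codecs stand in for the port encoding and for the locale
conversion done by ‘scm_to_locale_string’, and the sample line and the
‘convert’ helper are made up for illustration, not taken from the
commits above.

```python
# Analogy only: Python codecs stand in for Guile's port encoding and
# locale conversion.  The sample line and `convert` are hypothetical.

# A line containing the UTF-8 bytes 0xC2 0xA9 (octal 302 251), i.e. the © sign.
raw = b"# Copyright \xc2\xa9 Free Software Foundation\n"

# Reading as ISO-8859-1 never fails: each byte maps to exactly one character,
# so the two bytes become the two characters 'Â' (U+00C2) and '©' (U+00A9).
line = raw.decode("iso-8859-1")
assert "\u00c2\u00a9" in line

# Writing back with the same encoding restores the original bytes untouched.
assert line.encode("iso-8859-1") == raw


def convert(text, encoding):
    """Return the strictly encoded bytes, or None if conversion fails."""
    try:
        return text.encode(encoding, errors="strict")
    except UnicodeEncodeError:
        return None


# Strict conversion to ASCII (the encoding of the C locale) fails on 'Â'.
ascii_result = convert(line, "ascii")

# Strict conversion to UTF-8 succeeds for every character in this line.
utf8_result = convert(line, "utf-8")
```

Here ‘ascii_result’ is None while ‘utf8_result’ holds the encoded bytes,
which mirrors why the conversion stops failing once a UTF-8 locale is
available.
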
(Note that the bootstrap Guile doesn’t have this problem because it uses
UTF-8 for everything and ignores locale settings.)

Hopefully we can enable full builds of ‘core-updates’ very soon now.

Ludo’.