Ludovic Courtès <l...@gnu.org> writes:

> commit ca1e3ad2faa59d5b32289f84e0937fa476e21a1a
> Author: Ludovic Courtès <l...@gnu.org>
> Date:   Sat Feb 28 01:01:51 2015 +0100
>
>     utils: Change 'patch-shebangs' to use binary input.
>
>     * guix/build/utils.scm (get-char*): New procedure.
>     (patch-shebang): Use it instead of 'read-char'.
>     (fold-port-matches): Remove local 'get-char' and use 'get-char*'
>     instead.
> ---
>  guix/build/utils.scm |   22 +++++++++++-----------
>  1 files changed, 11 insertions(+), 11 deletions(-)
>
> diff --git a/guix/build/utils.scm b/guix/build/utils.scm
> index a3f8911..c98c4ca 100644
> --- a/guix/build/utils.scm
> +++ b/guix/build/utils.scm
> @@ -618,6 +618,14 @@ transferred and the continuation of the transfer as a thunk."
>                    (stat:atimensec stat)
>                    (stat:mtimensec stat)))
>
> +(define (get-char* p)
> +  ;; We call it `get-char', but that's really a binary version
> +  ;; thereof.  (The real `get-char' cannot be used here because our
> +  ;; bootstrap Guile is hacked to always use UTF-8.)
> +  (match (get-u8 p)
> +    ((? integer? x) (integer->char x))
> +    (x x)))
> +

This is equivalent to reading with the ISO-8859-1 encoding.  The
problem is that the procedures that use 'get-char*' will then typically
use UTF-8 to write these characters back, so all non-ASCII characters
will get corrupted by these filters.

For now, I would suggest just using ISO-8859-1 for all of these build
utilities that filter or substitute existing files, and then using the
textual I/O procedures.

A better solution going forward would be to implement and use a
permissive UTF-8 encoding in Guile.

What do you think?

      Mark
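
To make the corruption concrete, here is a minimal sketch of the round trip described above. It is an illustration only, not code from the commit: the variable names are hypothetical, and it assumes a Guile that provides (ice-9 iconv). It reads the UTF-8 bytes of "é" one byte at a time with 'integer->char', as 'get-char*' does, and then writes the result back first as UTF-8 and then as ISO-8859-1.

```scheme
(use-modules (ice-9 iconv)         ; string->bytevector, bytevector->string
             (rnrs bytevectors))   ; bytevector->u8-list

;; "é" is U+00E9, which UTF-8 encodes as the two bytes #xC3 #xA9.
(define original
  (string->bytevector (string (integer->char #xE9)) "UTF-8"))

;; One character per byte, which is what a get-char*-style reader yields.
(define as-read
  (list->string (map integer->char (bytevector->u8-list original))))

;; This is the same string that decoding as ISO-8859-1 produces.
(write (string=? as-read (bytevector->string original "ISO-8859-1")))
(newline)                                        ; prints #t

;; Writing back as UTF-8 re-encodes each of the two characters separately.
(define written-utf8 (string->bytevector as-read "UTF-8"))

;; Writing back as ISO-8859-1 maps each character to its original byte.
(define written-latin1 (string->bytevector as-read "ISO-8859-1"))

(write (bytevector->u8-list original))           ; prints (195 169)
(newline)
(write (bytevector->u8-list written-utf8))       ; prints (195 131 194 169)
(newline)
(write (bytevector->u8-list written-latin1))     ; prints (195 169)
(newline)
```

The two input bytes come back as four when written as UTF-8, whereas using ISO-8859-1 on both the read and the write side returns the file's bytes unchanged, whatever encoding the file was really in.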