Re: [PATCH] Improve handling of Unicode byte-order marks (BOMs)

2013-04-05 Thread Mark H Weaver
l...@gnu.org (Ludovic Courtès) writes:
> Mike Gran skribis:
>
>> It would be a trivial function to write, of course, but there is a
>> c-strcasecmp func in gnulib.
>
> Yes, better use that one.
>
> (Just add ‘c-strcase’ in m4/gnulib-cache.m4, run ‘gnulib-tool --update’
> with Gnulib v0.0-7865-ga8

Re: [PATCH] Improve handling of Unicode byte-order marks (BOMs)

2013-04-05 Thread Ludovic Courtès
Mike Gran skribis:
+      /* If the specified encoding is UTF-16 or UTF-32, then make
+        that more precise by deciding what endianness to use.  */
+      if (strcasecmp (pt->encoding, "UTF-16") == 0)
+        precise_encoding = decide_utf16_encoding (port, mode);
>>

Re: [PATCH] Improve handling of Unicode byte-order marks (BOMs)

2013-04-05 Thread Mike Gran
>>> +      /* If the specified encoding is UTF-16 or UTF-32, then make
>>> +        that more precise by deciding what endianness to use.  */
>>> +      if (strcasecmp (pt->encoding, "UTF-16") == 0)
>>> +        precise_encoding = decide_utf16_encoding (port, mode);
>>> +      else if (strcas

Re: [PATCH] Improve handling of Unicode byte-order marks (BOMs)

2013-04-05 Thread Mark H Weaver
Hi Andy,
Andy Wingo writes:
> On Wed 03 Apr 2013 22:33, Mark H Weaver writes:
>
>> + /* If we just read a BOM in an encoding that recognizes them,
>> + then silently consume it and read another code point. */
>> + if (SCM_UNLIKELY (*codepoint == SCM_UNICODE_BOM
>>
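The quoted comment describes consuming a BOM and then reading another code point. A rough sketch of that idea follows, under stated assumptions: the struct sketch_stream, its at_start flag, and sketch_read_codepoint are hypothetical names invented for illustration. The input is modeled as an array of already-decoded code points, and the BOM is skipped only when it is the very first code point. The actual condition in the patch is truncated above and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BOM 0xFEFFu   /* U+FEFF, the byte-order mark */

/* Hypothetical stream over already-decoded code points.  */
struct sketch_stream
{
  const uint32_t *cps;   /* code points to deliver */
  size_t len;            /* number of code points in cps */
  size_t pos;            /* index of the next code point */
  int at_start;          /* nonzero until the first code point is seen */
};

/* Store the next code point in *out and return 1, or return 0 at end
   of input.  A BOM seen as the first code point is silently consumed
   and the following code point is returned instead.  */
static int
sketch_read_codepoint (struct sketch_stream *s, uint32_t *out)
{
  assert (s != NULL && out != NULL);

  while (s->pos < s->len)
    {
      uint32_t cp = s->cps[s->pos++];
      int was_start = s->at_start;

      s->at_start = 0;
      if (was_start && cp == SKETCH_BOM)
        continue;   /* consume the BOM, then read another code point */

      *out = cp;
      return 1;
    }
  return 0;
}
```

In this sketch a U+FEFF that appears later in the input is returned to the caller unchanged, since only the start of the stream is treated specially.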

Re: [PATCH] Improve handling of Unicode byte-order marks (BOMs)

2013-04-04 Thread Andy Wingo
Hi.
The following review applies to the wrong version of this patch. I'll go ahead and post it anyway.
On Wed 03 Apr 2013 22:33, Mark H Weaver writes:
> + /* If we just read a BOM in an encoding that recognizes them,
> + then silently consume it and read another code point.

Re: [PATCH] Improve handling of Unicode byte-order marks (BOMs)

2013-04-03 Thread Mark H Weaver
Here's the latest revision of the patch. The only thing that has changed is the documentation.
Mark
From a3f2c379f11782f0440d9beb2b40601146ee14ea Mon Sep 17 00:00:00 2001
From: Mark H Weaver
Date: Wed, 3 Apr 2013 04:22:04 -0400
Subject: [PATCH] Improve handling of Unicode by

Re: [PATCH] Improve handling of Unicode byte-order marks (BOMs)

2013-04-03 Thread Mark H Weaver
Thanks for the review, Mike. I've attached a new patch with those problems (and a few others) fixed.
Mark
From a373927201028915f7b8cd5a1c72c5819cb4797c Mon Sep 17 00:00:00 2001
From: Mark H Weaver
Date: Wed, 3 Apr 2013 04:22:04 -0400
Subject: [PATCH] Improve handling of Unic

Re: [PATCH] Improve handling of Unicode byte-order marks (BOMs)

2013-04-03 Thread Mike Gran
Hi Mark
>>> Here's the new patch. Any more suggestions?
There are a couple of lines in your doc patch that aren't quite right.
"@code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-16BE}, or @code{UTF-16LE}"
I assume that two of these should be UTF-32.
Also "This is intended to multiple logical te

Re: [PATCH] Improve handling of Unicode byte-order marks (BOMs)

2013-04-03 Thread Mark H Weaver
e tweaks.
Thanks,
Mark
From f849f9a3f6babd87088d39369442a7f429762cec Mon Sep 17 00:00:00 2001
From: Mark H Weaver
Date: Wed, 3 Apr 2013 04:22:04 -0400
Subject: [PATCH] Improve handling of Unicode byte-order marks (BOMs).
* libguile/ports-internal.h (struct

Re: [PATCH] Improve handling of Unicode byte-order marks (BOMs)

2013-04-03 Thread Ludovic Courtès
Mark H Weaver skribis:
> l...@gnu.org (Ludovic Courtès) writes:
>> Woow, well thought out. The semantics seem good. (It’s interesting to
>> see how BOMs complicate things, but that’s life, I guess.)
>>
>> The patch looks good to me. The test suite is nice. It doesn’t seem to
>> cover all the

Re: [PATCH] Improve handling of Unicode byte-order marks (BOMs)

2013-04-03 Thread Mark H Weaver
precise_encoding = decide_utf32_encoding (port, mode);
>
> Shouldn’t it be strcasecmp? (Actually there are other uses of strcmp
> already, but I think it’s a mistake.)
Ouch, good catch! Indeed, we already had some bugs because of this. I pushed a fix for the existing bugs to stable

Re: [PATCH] Improve handling of Unicode byte-order marks (BOMs)

2013-04-03 Thread Ludovic Courtès
Hello, Mark!
Mark H Weaver skribis:
> * All kinds of streams are supported in a uniform way: files, pipes,
> sockets, terminals, etc.
>
> * As specified in Unicode 6.2, BOMs are only handled specially at the
> start of a stream, and only if the encoding is set to "UTF-16" or
> "UTF-32". B
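The quoted rule (a BOM is treated specially only at the start of a stream, and only for the generic "UTF-16" or "UTF-32" names) can be illustrated for the UTF-16 case. This is a rough sketch under stated assumptions, not the patch's decide_utf16_encoding. The enum and the function sketch_decide_utf16 are hypothetical names, and falling back to big-endian when no BOM is present is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type for this sketch.  */
enum sketch_utf16
{
  SKETCH_UTF16_BE,
  SKETCH_UTF16_LE
};

/* Hypothetical helper: inspect the first bytes of a stream whose
   encoding was given as the generic "UTF-16" and choose a byte order.
   *bom_len receives the number of leading BOM bytes to skip (0 or 2).  */
static enum sketch_utf16
sketch_decide_utf16 (const unsigned char *buf, size_t len, size_t *bom_len)
{
  assert (buf != NULL || len == 0);
  assert (bom_len != NULL);

  *bom_len = 0;
  if (len >= 2)
    {
      if (buf[0] == 0xFE && buf[1] == 0xFF)
        {
          *bom_len = 2;            /* U+FEFF stored big-endian */
          return SKETCH_UTF16_BE;
        }
      if (buf[0] == 0xFF && buf[1] == 0xFE)
        {
          *bom_len = 2;            /* U+FEFF stored little-endian */
          return SKETCH_UTF16_LE;
        }
    }

  /* No BOM found: default to big-endian (assumption of this sketch).  */
  return SKETCH_UTF16_BE;
}
```

A caller would run this once on the first bytes of the stream, skip bom_len bytes, and decode the rest with the chosen byte order. A stream explicitly opened as "UTF-16LE" or "UTF-16BE" would not go through this step at all.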

Re: [PATCH] Improve handling of Unicode byte-order marks (BOMs)

2013-04-03 Thread Mark H Weaver
.
Mark
From d8d37d5519ca61961b70cb3051ccca2be7d4affa Mon Sep 17 00:00:00 2001
From: Mark H Weaver
Date: Wed, 3 Apr 2013 04:22:04 -0400
Subject: [PATCH] Improve handling of Unicode byte-order marks (BOMs).
* libguile/ports-internal.h (struct scm_port_internal): Add new members 'at_st

[PATCH] Improve handling of Unicode byte-order marks (BOMs)

2013-04-03 Thread Mark H Weaver
re's the patch. Comments and suggestions solicited.
Mark
From 008b89c7ba4637e2d6323f02b6b8b6284a533857 Mon Sep 17 00:00:00 2001
From: Mark H Weaver
Date: Wed, 3 Apr 2013 04:22:04 -0400
Subject: [PATCH] Improve handling of Unicode byte-order marks (BOMs).
* libguile/ports-internal.h