Hi Andy,

Andy Wingo <wi...@pobox.com> writes:

> On Sat 28 Jan 2012 11:21, Mark H Weaver <m...@netris.org> writes:
>
>> The R5RS specifies that if 'char-ready?' returns #t, then the next
>> 'read-char' operation is guaranteed not to hang.  This is not currently
>> the case for ports using a multibyte encoding.
>>
>> 'char-ready?' currently returns #t whenever at least one _byte_ is
>> available.  This is not correct in general.  It should return #t only if
>> there is a complete _character_ available.
>
> This procedure is omitted in the R6RS because it is not a good
> interface.  Besides its semantic difficulties, can you think of a sane
> implementation for multibyte characters?

Maybe I'm missing something, but I don't see any semantic problem here,
and it seems straightforward to implement.  'char-ready?' should simply
read bytes until either a complete character is available, or no more
bytes are ready.  In either case, all the bytes should then be 'unget'
before returning.  What's the problem?

The only reason I haven't yet fixed this is because it will require some
refactoring in ports.c.  I guess the most straightforward approach is to
generalize 'get_codepoint', 'get_utf8_codepoint', and
'get_iconv_codepoint' to support a non-blocking mode of operation.

What do you think?

    Regards,
      Mark
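
To make the "read bytes until a complete character is available or no more
bytes are ready, then unget them" idea concrete, here is a rough sketch.
It is not Guile's ports.c code: 'struct toy_port', 'utf8_seq_len' and
'toy_char_ready' are hypothetical names invented for illustration, the
port is modelled as a plain in-memory buffer of bytes that can be read
without blocking, and only UTF-8 is handled (no iconv).  Treating a
malformed sequence as "ready" is also an assumption, on the grounds that
a decoder can report the error without waiting for more input.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a port: `buf` holds the bytes that can be
   read right now without blocking, `pos` is the read cursor. */
struct toy_port {
    const unsigned char *buf;
    size_t len;
    size_t pos;
};

/* Length of the UTF-8 sequence introduced by `lead`,
   or 0 if `lead` cannot start a valid sequence. */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if (lead >= 0xC2 && lead <= 0xDF) return 2;
    if (lead >= 0xE0 && lead <= 0xEF) return 3;
    if (lead >= 0xF0 && lead <= 0xF4) return 4;
    return 0;
}

/* Return true only if a read of one character could complete using the
   bytes that are already available.  The cursor is always restored. */
static bool toy_char_ready(struct toy_port *p)
{
    size_t start = p->pos;
    bool ready = false;

    if (p->pos < p->len) {
        size_t need = utf8_seq_len(p->buf[p->pos++]);
        size_t got = 1;

        /* Assumption: an invalid lead byte can be reported as a
           decoding error immediately, so it counts as "ready". */
        ready = (need == 0);

        /* Consume continuation bytes while they are available. */
        while (!ready && got < need && p->pos < p->len) {
            if ((p->buf[p->pos] & 0xC0) != 0x80) {
                ready = true;   /* malformed sequence, no need to wait */
            } else {
                p->pos++;
                got++;
            }
        }
        if (got == need)
            ready = true;       /* a complete character is buffered */
    }

    p->pos = start;             /* "unget" everything that was read */
    return ready;
}
```

With the three bytes of U+20AC (E2 82 AC) available this returns true;
with only the first two it returns false, and in both cases the cursor
ends up where it started.  A real port would additionally need a
non-blocking way to ask whether another byte is ready, which is the part
that would require the refactoring mentioned above.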