Re: Unicode ports patch

Mike Gran Tue, 01 Sep 2009 12:20:04 -0700

----- Original Message ----
> From: Andy Wingo <wi...@pobox.com>
> To: Ludovic Courtès <l...@gnu.org>
> Cc: guile-devel@gnu.org
> Sent: Tuesday, September 1, 2009 11:25:26 AM
> Subject: Re: Unicode ports patch
> 
> Hi,
> 
> On Tue 01 Sep 2009 10:19, l...@gnu.org (Ludovic Courtès) writes:
> 
> > Mike Gran writes:
> >
> >> The latest commit 'Add full Unicode capability to ports and the default
> >> reader' 889975e51accb80491af76fc5db980aeb3edd342 adds the majority of
> >> the functionality for non-ASCII strings.  
> >
> > This patch adds a few functions related to string ports:
> >
> >  * libguile/strports.c: store string ports in locale encoding
> >    (scm_strport_to_locale_u8vector, scm_call_with_output_locale_u8vector)
> >    (scm_open_input_locale_u8vector, scm_get_output_locale_u8vector):
> >    new functions
> >
> > I think it would be nicer if these used bytevectors instead of u8vectors
> > and were locale-independent (which would match the `string->utf8' &
> > co. API).  Also I would make `scm_strport_to_locale_u8vector ()'
> > private.  And finally, it'd be even better if it were documented in the
> > manual.  :-)


I don't understand.  "it would be nicer if *these* ..."

To what does *these* refer: string ports?  Would it be nicer if we replaced
string ports with bytevector ports?  Or would it be nicer if
scm_get_output_locale_u8vector were scm_get_output_bytevector?

"it would be nicer if these used bytevectors ... and were *locale-independent*"

Would it be nicer if string ports were actually bytevector ports, and were
locale-independent?  Or if scm_get_output_bytevector returned a
locale-independent (ergo 8-bit or 32-bit) vector?

> >
> > Actually I'm not convinced that `call-with-output-locale-*' and
> > `open-input-locale-*' are useful, precisely because we can use a string
> > port to get a string and then `string->utf8' to get at the string bits.

"We can use a string port to get a string"

So we write to a string port and then pull out the result string?

"And then use string->utf8 to get at the string bits"

And then convert the result string to a UTF-8 encoded bytevector?
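
If I'm reading that right, the round trip would look roughly like the
sketch below.  It assumes a Guile with full Unicode strings and the
(rnrs bytevectors) module; the sample text and the names `result' and
`bytes' are only for illustration.

```scheme
(use-modules (rnrs bytevectors))   ; provides string->utf8

;; Write to a string port and get the accumulated text back as a string.
(define result
  (call-with-output-string
    (lambda (port)
      (display "caf" port)
      ;; U+00E9, written by code point to avoid source-encoding issues.
      (write-char (integer->char #xe9) port))))

;; Encode that string as UTF-8, independent of the current locale.
(define bytes (string->utf8 result))

(bytevector-length bytes)   ; 5: three ASCII bytes plus two for U+00E9
```

That is two steps (string first, bytes second) rather than one call
that hands back a vector directly.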

> 
> FWIW, I think I agree with all of Ludovic's comments; though if there is
> a way that we can simply arrange to output bytes to an R6RS binary
> output port, I think there are already efficient means to collect the
> bytes from such a port in a bytevector.

Thanks,
Mike
