Hello!
Mike Gran writes:
> On Tue, 2009-04-21 at 23:37 +0200, Ludovic Courtès wrote:
>> You seem to imply that `scm_getc ()' will now return a Unicode
>> codepoint, is that right? What about `scm_c_{read,write} ()', and
>> `scm_{get,put}s ()'?
>>
>
> I vacillate on this, but, I think the most
On Tue, 2009-04-21 at 23:37 +0200, Ludovic Courtès wrote:
> > This is all going to be slower than before because of the string
> > conversion operations, but, I didn't want to do any premature
> > optimization. First, I wanted to get it working, but, there is plenty
> > of room for optimization l
Hello!
Mike Gran writes:
> Strings are internally encoded either as "narrow" 8-bit ISO-8859-1
> strings or as "wide" UTF-32 strings. Strings are usually created as
> narrow strings. Narrow strings get automatically widened to wide
> strings if non-8-bit characters are set! or appended to them.
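The widening behaviour described above can be sketched in C roughly as follows. This is only an illustration: `sketch_string` and the `sketch_*` functions are hypothetical names invented for this example and are not Guile's actual internal representation or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical string type: NOT Guile's real layout, just an illustration. */
typedef struct {
    size_t len;
    int is_wide;              /* 0: narrow (ISO-8859-1), 1: wide (UTF-32) */
    union {
        uint8_t  *narrow;     /* one byte per character */
        uint32_t *wide;       /* one 32-bit code point per character */
    } buf;
} sketch_string;

/* Strings start out narrow and zero-filled. */
static sketch_string *sketch_make(size_t len)
{
    sketch_string *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->len = len;
    s->is_wide = 0;
    s->buf.narrow = calloc(len ? len : 1, sizeof *s->buf.narrow);
    if (s->buf.narrow == NULL) {
        free(s);
        return NULL;
    }
    return s;
}

/* Convert narrow storage to wide. ISO-8859-1 bytes map one-to-one onto
   the first 256 Unicode code points, so a plain copy is enough. */
static int sketch_widen(sketch_string *s)
{
    if (s->is_wide)
        return 0;
    uint32_t *w = malloc((s->len ? s->len : 1) * sizeof *w);
    if (w == NULL)
        return -1;
    for (size_t i = 0; i < s->len; i++)
        w[i] = s->buf.narrow[i];
    free(s->buf.narrow);
    s->buf.wide = w;
    s->is_wide = 1;
    return 0;
}

/* Store a code point, widening first if it does not fit in 8 bits. */
static int sketch_set(sketch_string *s, size_t i, uint32_t cp)
{
    if (i >= s->len)
        return -1;
    if (!s->is_wide && cp > 0xFF && sketch_widen(s) != 0)
        return -1;
    if (s->is_wide)
        s->buf.wide[i] = cp;
    else
        s->buf.narrow[i] = (uint8_t) cp;
    return 0;
}

/* Character access is O(1) in both representations. */
static uint32_t sketch_ref(const sketch_string *s, size_t i)
{
    assert(i < s->len);
    return s->is_wide ? s->buf.wide[i] : s->buf.narrow[i];
}

static void sketch_free(sketch_string *s)
{
    if (s == NULL)
        return;
    if (s->is_wide)
        free(s->buf.wide);
    else
        free(s->buf.narrow);
    free(s);
}
```

In this sketch a string never narrows again once widened; whether a real implementation should do so is a separate design question that the message does not address.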
l...@gnu.org (Ludovic Courtès) writes:
>>> Do we need to talk more about what needs to be accomplished? Do we
>>> need a complete specification? Do we need a vote on if it is a good
>>> idea?
>>
>> I think you're going in the right direction. More importantly, although
>> I can't speak for them, N
Hi,
Andy Wingo writes:
> On Wed 28 Jan 2009 17:44, Mike Gran writes:
>
>> Since I need this functionality taken care of, and since I have some
>> time to play with it, what's the procedure here?
>
> The best thing IMO would be to hack on it on a Git branch, with small
> and correct patches. We
Hello,
Clinton Ebadi writes:
> The `scm_{to|from}_locale_string' functions provide enough abstraction
> to make this doable without breaking anything that doesn't use
> `scm_take_locale_string' (and even then Guile can detect when the locale
> is not UCS-4, revert to `scm_from_locale_string' and
Mike Gran writes:
> Hi,
>
> Let's say that one possible goal is to add wide strings
> * using Gnulib functions
> * with minimal changes to the public Guile API
> * where chars become 4-byte codepoints and strings are internally
> either UTF-32 or ISO-8859-1
>
> Since I need this functionalit
Hi,
On Wed 28 Jan 2009 17:44, Mike Gran writes:
> Since I need this functionality taken care of, and since I have some
> time to play with it, what's the procedure here?
The best thing IMO would be to hack on it on a Git branch, with small
and correct patches. We could get you commit access if
Hi,
Let's say that one possible goal is to add wide strings
* using Gnulib functions
* with minimal changes to the public Guile API
* where chars become 4-byte codepoints and strings are internally
either UTF-32 or ISO-8859-1
Since I need this functionality taken care of, and since I have so
Hi!
Mike Gran writes:
> Gnulib works for me. Bruno is the maintainer of those funcs, so I'm
> sure they work great.
Good!
> So really the first questions to answer are the encoding question and
> whether the R6RS string API is the goal.
SRFI-1[34] (i.e., status quo in terms of supported AP
On Tue 27 Jan 2009 06:52, Mike Gran writes:
> I said
>
>> (Though, such a scheme would force scm_take_locale_string to become
>> scm_take_iso88591_string.)
>
> which is incorrect. Under the proposed scheme, scm_take_locale_string
> would only be able to use that storage directly if it happened
I said
> (Though, such a scheme would force scm_take_locale_string to become
> scm_take_iso88591_string.)
which is incorrect. Under the proposed scheme, scm_take_locale_string
would only be able to use that storage directly if it happened to be
ASCII or 8859-1.
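As a rough illustration of that condition, the check could look something like the sketch below. The function name `sketch_can_adopt` and its `locale_is_latin1` parameter are hypothetical, and the ASCII branch assumes the locale encoding is ASCII-compatible (as UTF-8 is); this is not how Guile itself decides.

```c
#include <stddef.h>

/* Hypothetical helper: return 1 if a locale-encoded buffer could be used
   directly as narrow (ISO-8859-1) string storage, 0 if it would have to
   be converted first.
   Assumption: when the locale is not ISO-8859-1, its encoding is
   ASCII-compatible, so bytes below 0x80 mean the same thing in both. */
static int sketch_can_adopt(const unsigned char *buf, size_t len,
                            int locale_is_latin1)
{
    if (locale_is_latin1)
        return 1;               /* every byte is already a Latin-1 char */
    for (size_t i = 0; i < len; i++)
        if (buf[i] >= 0x80)
            return 0;           /* non-ASCII byte: conversion required */
    return 1;                   /* pure ASCII is valid Latin-1 as-is */
}
```

The scan is linear in the buffer length, so taking ownership of the storage is only "free" in terms of copying, not in terms of inspection.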
Hello,
> Ludo' sez
>> Mike Gran writes:
> BTW, Gnulib has a wealth of modules that could be helpful here:
> http://www.gnu.org/software/gnulib/MODULES.html#posix_ext_unicode
> I used a few of them in Guile-R6RS-Libs to implement `string->utf8'
> and such like.
The Gnulib routines seem perfe
Hello,
Mike Gran writes:
> There are 3 good, actively developed solutions of which I am aware.
>
> 1. Use GNU libc functionality. Encode wide strings as wchar_t.
That'd be POSIX functionality, actually.
> 2. Use GLib functionality. Encode wide strings as UTF-8. Possibly
> give up on O(1).
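To illustrate why a UTF-8 representation may give up O(1) character access: finding the Nth character means walking the bytes from the start. A minimal sketch, assuming the buffer holds valid UTF-8 (`sketch_utf8_offset` is a hypothetical name, not a GLib or Guile function):

```c
#include <stddef.h>

/* Return the byte offset of the idx-th code point in a valid UTF-8
   buffer, or (size_t) -1 if idx is out of range.  The loop visits every
   character before idx, so the cost is O(n) rather than O(1). */
static size_t sketch_utf8_offset(const unsigned char *s, size_t nbytes,
                                 size_t idx)
{
    size_t off = 0;
    while (off < nbytes) {
        if (idx == 0)
            return off;
        off++;                              /* skip the lead byte */
        while (off < nbytes && (s[off] & 0xC0) == 0x80)
            off++;                          /* skip 10xxxxxx continuation bytes */
        idx--;
    }
    return (size_t) -1;
}
```

With fixed-width storage (8-bit or 32-bit units) the same lookup is a single array index.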
Hello!
Neil Jerram writes:
> But what about the other possible debate, about the API? Are you
> thinking that we should accept R6RS's choice?
No, I think we have SRFI-1[34] to start with, both of which are well
defined in the context of Unicode.
> (I really haven't read up on all this enough
> > Ludo sez,
> Mike sez,
> > 1. IMO it'd be nice to have ASCII strings special-cased so that they
> > > are always encoded in ASCII. This would allow for memory savings
> > > since, e.g., most symbols are expected to contain only ASCII
> > > characters. It might also simplify interaction wit
> From: Ludovic Courtès l...@gnu.org
I believe that we should aim for R6RS strings.
I think the most important thing is to have humility in the face of an
impossible problem: how to encode all textual information. It is
important to "stand on the shoulders of giants" here. It becomes a
matter o
2009/1/25 Ludovic Courtès:
>
> I agree it would be really nice to have Unicode support, but I'm not
> aware of any "plan", so please go ahead! :-)
Indeed.
> A few considerations regarding the inevitable debate about the internal
> string representation:
[...]
But what about the other possible
Hello!
Mike Gran writes:
> Hi. I know there has been a lot of talk about wide characters and
> Unicode over the years. I'd like to see it happen because how they are
> implemented will determine the future of a couple of my side-projects.
> I could pitch in, if you needed some help.
Indeed, it