> From:Alex Shinn
> > Keep in mind that the UTF-8 forward iterator operation has conditional
> > branches. Merely the act of advancing from one character to another
> > could take one of four paths, or more if you include the possibility
> > of invalid UTF-8 sequences.
>
> No, technically you d
> (string-upcase "Straße") => "STRAßE" (should be "STRASSE")
> (string-downcase "ΧΑΟΣΣ") => "χαοσσ" (should be "χαoσς")
> (string-downcase "ΧΑΟΣ Σ") => "χαοσ σ" (should be "χαoς σ")
Well, yes and no. R6RS yes. SRFI-13 no.
Mike Gran writes:
>> The reason I am still arguing this point is because I have looked
>> seriously at what I would need to do to (A) fix our i18n problems and
>> (B) make the code efficient. I very much want to fix these things,
>> but the pain of trying to do this with our current scheme is too
On Wed, Mar 16, 2011 at 5:39 AM, Mike Gran wrote:
>> From:Mark H Weaver
>>
>> Mike Gran writes:
>> > We do, in a matter of speaking, have a single string representation:
>> > UTF-32. The 'narrow' encoding is UTF-32 with the initial 3 bytes of
>> > zero removed.
>>
>> Despite the similarity o
On Wed, Mar 16, 2011 at 12:46 AM, Mark H Weaver wrote:
> Alex Shinn wrote:
>> On Sun, Mar 13, 2011 at 1:05 PM, Mark H Weaver wrote:
>>> I just realized that it is possible to implement O(1) accessors for
>>> UTF-8 backed strings.
>>
>> It's possible with several approaches, but not necessarily w
> The reason I am still arguing this point is because I have looked
> seriously at what I would need to do to (A) fix our i18n problems and
> (B) make the code efficient. I very much want to fix these things,
> but the pain of trying to do this with our current scheme is too much
> for me to bear.
Mike Gran writes:
>> From:Mark H Weaver
>> Despite the similarity of these two representations, they are
>> sufficiently different that they cannot be handled by the same machine
>> code. That means you must either implement multiple inner loops, one
>> for each combination of string parameter r
> From:Mark H Weaver
>
> Mike Gran writes:
> > We do, in a matter of speaking, have a single string representation:
> > UTF-32. The 'narrow' encoding is UTF-32 with the initial 3 bytes of
> > zero removed.
>
> Despite the similarity of these two representations, they are
> sufficiently diff
Mike Gran writes:
> We do, in a matter of speaking, have a single string representation:
> UTF-32. The 'narrow' encoding is UTF-32 with the initial 3 bytes of
> zero removed.
Despite the similarity of these two representations, they are
sufficiently different that they cannot be handled by the s
Alex Shinn wrote:
> On Sun, Mar 13, 2011 at 1:05 PM, Mark H Weaver wrote:
>> I just realized that it is possible to implement O(1) accessors for
>> UTF-8 backed strings.
>
> It's possible with several approaches, but not necessarily worth it:
>
> http://trac.sacrideo.us/wg/wiki/StringRepresentati