Re: [dev] [sbase][RFC] Add a simplistic version of tr

Strake Thu, 28 Nov 2013 10:25:17 -0800

On 28/11/2013, Silvan Jegen <s.je...@gmail.com> wrote:
> On Thu, Nov 28, 2013 at 11:45:33AM -0500, Strake wrote:
>> > (either using UTF-8 or UTF-32 indices), right?
>>
>> I meant Unicodepoints; those are just Unicodecs.
>
> UTF-32 is an encoding that is identical to the Unicode code point as
> far as I know. So what I am thinking is that one would either use the
> UTF-8 representation of the code point as an index, or the code point
> itself. Since using UTF-8 would not require any conversion (on UTF-8
> locales) I think it would be preferable.


UTF-8 is variable-width, so one must find the length of each sequence
anyhow and shift its bytes into an integer. One may as well just use
fgetwc or the like and work with code points.
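
To make the two steps concrete, here is a rough sketch of what a
byte-indexed approach would have to do by hand for every character.
utf8_decode is a hypothetical helper written for this mail, not anything
in sbase, and it assumes the input is meant to be UTF-8; it does not
reject overlong forms or surrogates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from sbase: decode one UTF-8 sequence from s
 * (n bytes available) into *cp. Returns the number of bytes consumed,
 * or 0 on a truncated or malformed sequence. Overlong forms and
 * surrogates are not rejected, to keep the sketch short. */
size_t
utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
	size_t len, i;
	uint32_t c;

	if (n == 0)
		return 0;
	/* Step 1: the lead byte gives the length of the sequence. */
	if (s[0] < 0x80) {
		len = 1; c = s[0];
	} else if ((s[0] & 0xE0) == 0xC0) {
		len = 2; c = s[0] & 0x1F;
	} else if ((s[0] & 0xF0) == 0xE0) {
		len = 3; c = s[0] & 0x0F;
	} else if ((s[0] & 0xF8) == 0xF0) {
		len = 4; c = s[0] & 0x07;
	} else {
		return 0; /* stray continuation byte or invalid lead byte */
	}
	if (len > n)
		return 0;
	/* Step 2: shift in six payload bits per continuation byte. */
	for (i = 1; i < len; i++) {
		if ((s[i] & 0xC0) != 0x80)
			return 0;
		c = (c << 6) | (s[i] & 0x3F);
	}
	*cp = c;
	return len;
}
```

In a UTF-8 locale, fgetwc does this same work behind the scenes (after
setlocale(LC_CTYPE, "")) and hands back the decoded character as a
wint_t, so the table can be indexed by code point either way.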
