this just popped up when i was searching the archive.

On Mon 15 Mar 2010 18:44:41 EST 2010, quans...@quanstro.net wrote:
> On Mon Mar 15 17:46:11 EDT 2010, aim0s...@lav... wrote:
> > Yes, but why wc utility counts runes (wc(1) call them runes) manually
> > using huge table instead of using functions from rune(3) such as utflen? 
> 
> i didn't write wc, but i would imagine that it's for speed.

i took some time a few weeks ago to extend wc to handle runes
up to 0x10ffff which meant adding 3 states for 4-byte runes and
adding an additional table.  with that perspective ...

wc is a big state machine.  using the rune functions would hide
a good deal of the state machine, which would make the states
harder to understand and some of this work would need to be redone.
the tables are actually really easy to understand and generate.
wikipedia has a discussion of the bit patterns which can help.

- erik
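
for readers of the archive: a minimal sketch of the idea above, not the
actual wc source. it builds a 256-entry byte-class table from the utf-8
leading-bit patterns and runs a small state machine whose state is the number
of continuation bytes still owed (1, 2 or 3, the last being the extra state
needed for 4-byte runes). the names cls, owed, mkclass and countrunes are made
up for this example. as a simplifying assumption, a bad byte, a stray
continuation byte or a truncated sequence counts as one rune each, and overlong
forms, surrogates and values above 0x10ffff are not rejected.

```c
#include <stddef.h>

/* byte classes, decided by the leading bits of each byte */
enum { Ascii, Cont, Lead2, Lead3, Lead4, Bad };

/* continuation bytes owed after a byte of each class */
static const int owed[] = { 0, 0, 1, 2, 3, 0 };

static unsigned char cls[256];

/* generate the class table from the utf-8 bit patterns */
void
mkclass(void)
{
	int c;

	for(c = 0; c < 256; c++){
		if((c & 0x80) == 0x00)
			cls[c] = Ascii;		/* 0xxxxxxx */
		else if((c & 0xC0) == 0x80)
			cls[c] = Cont;		/* 10xxxxxx */
		else if((c & 0xE0) == 0xC0)
			cls[c] = Lead2;		/* 110xxxxx */
		else if((c & 0xF0) == 0xE0)
			cls[c] = Lead3;		/* 1110xxxx */
		else if((c & 0xF8) == 0xF0)
			cls[c] = Lead4;		/* 11110xxx */
		else
			cls[c] = Bad;		/* 11111xxx */
	}
}

/* count runes in p[0..n); need is the state of the machine */
long
countrunes(const char *p, size_t n)
{
	long nrune = 0;
	int c, need = 0;
	size_t i;

	if(cls[0x80] != Cont)
		mkclass();		/* table not built yet */
	for(i = 0; i < n; i++){
		c = cls[(unsigned char)p[i]];
		if(c == Cont && need > 0){
			if(--need == 0)
				nrune++;	/* sequence complete */
			continue;
		}
		if(need > 0)
			nrune++;		/* sequence cut short */
		need = owed[c];
		if(need == 0)
			nrune++;		/* ascii, bad byte or stray continuation */
	}
	if(need > 0)
		nrune++;			/* input ended mid-sequence */
	return nrune;
}
```

with this layout the whole machine is visible in one loop, and going from
3-byte to 4-byte runes is one more table class and one more value of need.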
