But my first reaction to the question is to ask whether Unicode character blocks are the most useful thing to map. They're rather arbitrary, with things like Latin script split up across numerous blocks largely due to historical accident.

Mapping the Script property to char classes would seem much more useful, IMO.

Yeah, there's some discussion in https://github.com/Pomax/ucharclasses/pull/12 about consolidating the blocks that come with "...extended..." variants. Even then, that might just be postponing the problem: Unicode seems to have picked up the pace, with new versions adding more and more languages (as well as non-languages), and therefore more and more blocks, many of which count as new scripts. So it might simply be a matter of a few years until we cross the charclass limit again, even if we switch to scripts rather than pure blocks.
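
For what it's worth, here is a rough sketch of what a script-based mapping could look like. It is not how ucharclasses works today. It assumes input lines in the format of the Unicode Character Database file Scripts.txt; the sample lines, the function name script_classes and the first_class parameter are made up for illustration.

```python
import re

# A few lines in the format of the UCD file Scripts.txt (trimmed sample,
# not the full file).
SAMPLE = """\
0041..005A    ; Latin # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
0061..007A    ; Latin
00AA          ; Latin
0370..0373    ; Greek
0400..0481    ; Cyrillic
1E00..1EFF    ; Latin
"""

# "START[..END] ; Script", with an optional trailing "# comment".
LINE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)")


def script_classes(text, first_class=1):
    """Give every script one class number and list (start, end, class) ranges.

    first_class is a hypothetical starting number; pick whatever the
    engine leaves free.
    """
    classes = {}  # script name -> class number
    ranges = []   # (first code point, last code point, class number)
    for line in text.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # skip blank lines and comment-only lines
        start = int(m.group(1), 16)
        end = int(m.group(2) or m.group(1), 16)
        # Reuse the class of a script we have already seen, otherwise
        # hand out the next free number.
        cls = classes.setdefault(m.group(3), first_class + len(classes))
        ranges.append((start, end, cls))
    return classes, ranges


classes, ranges = script_classes(SAMPLE)
```

All the Latin ranges end up in one class regardless of which block they sit in, and len(classes) is the number you would compare against the charclass limit.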

- Pomax


--------------------------------------------------
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex
