On Fri, Jun 15, 2001 at 11:50:49AM -0400, Dan Sugalski wrote:
> Unless I'm missing something (Simon? Hong?) Japanese (and potentially all
> the languages that use the Han characters) can interpret a particular
> character as either a number or not a number, depending on context.
Uh, don't think so, no. The numerals are, ooh, let's see: U+4E00, U+4E8C,
U+4E09, U+56DB, U+4E94, U+4E03, U+516B, U+5341, U+5343, U+4E07 and two more
I can't find. The rest aren't (usually) treated as numbers, no. It's
certainly not the case that a given character is both non-number and number.

> >module Locale::Hawaiian;
> >use re 'class (\w => [aeiouâêîôûhklmnpw`])';
> >...
>
> Sure. I expect Damian will write us something that lets you specify them
> upside-down in Klingon or something by the time this is done. :)

This is handy, but this means the regexp engine needs to be *VERY* dynamic
at runtime.

-- 
When your hammer is C++, everything begins to look like a thumb.
    -- Steve Haflich, comp.lang.c++
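
A rough way to check the numeral list above against the Unicode character database is sketched below. It uses Python's standard `unicodedata` module rather than anything Perl-specific. The helper name `numeric_value` and the `CODE_POINTS` list are illustrative only, and the result depends on the Unicode data shipped with your Python build.

```python
import unicodedata

# Code points copied from the message above (the two unnamed ones are omitted).
CODE_POINTS = [0x4E00, 0x4E8C, 0x4E09, 0x56DB, 0x4E94,
               0x4E03, 0x516B, 0x5341, 0x5343, 0x4E07]


def numeric_value(ch):
    """Return the Unicode numeric value of ch, or None if it has none."""
    return unicodedata.numeric(ch, None)


if __name__ == "__main__":
    for cp in CODE_POINTS:
        ch = chr(cp)
        # isdigit() stays False for these: they carry a numeric value
        # but are not decimal digits.
        print(f"U+{cp:04X} {ch} numeric={numeric_value(ch)} "
              f"isdigit={ch.isdigit()}")
```

A character with no numeric property, such as `a`, makes `numeric_value` return `None`. That matches the point in the message: the property is fixed per character and does not change with context.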