In message <[EMAIL PROTECTED]>
        James Mastros <[EMAIL PROTECTED]> wrote:

> Right.  Unfortunately, after starting on this, I realized that that's the easy
> part.  Unicode has a fairly well-defined way of figuring out if a character
> is a digit (see if its category is Nd (Number/digit)), and if so what its
> value is (the value of the "decimal" property).

Can it also tell you the base used for digit strings in that
character set? Actually I don't know if there are any modern
writing systems that don't use base ten, but certainly if you
were dealing with some ancient scripts that used sexagesimal
numbers that might be a problem ;-)
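
To make the category check concrete, here is a minimal sketch in Python,
whose standard unicodedata module exposes both properties mentioned above.
The helper name decimal_digit_value is made up for illustration and is not
part of any patch. Characters in category Nd only ever carry the values 0
to 9, so this particular property covers base ten and nothing else.

```python
import unicodedata

def decimal_digit_value(ch):
    """Return 0-9 if ch is a Unicode decimal digit (category Nd), else None.

    Hypothetical helper for illustration only.
    """
    if unicodedata.category(ch) != "Nd":
        return None
    return unicodedata.decimal(ch)

# ASCII, Arabic-Indic and Devanagari digits all map to a base-ten value;
# a letter, or a Roman numeral (category Nl), does not.
for ch in ("7", "\u0663", "\u096b", "x", "\u2160"):
    print(hex(ord(ch)), decimal_digit_value(ch))
```

Numerals from other systems (Roman numerals, for instance) sit in different
categories, so they would need separate handling.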

> However, there appears to be no good way of determining if something is a
> decimal point, a sign indicator, or an E/e (exponent signifier).

I suspected there wouldn't be.
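
A quick way to see the problem, again sketched in Python with the standard
unicodedata module (the helper name categories is made up for illustration):
the general category of these characters says nothing about their role in a
number.

```python
import unicodedata

def categories(chars):
    """Map each character to its Unicode general category (illustrative helper)."""
    return {ch: unicodedata.category(ch) for ch in chars}

# '.' and ',' are both plain "other punctuation", '-' is dash punctuation,
# '+' is a math symbol, and 'e'/'E' are ordinary letters.
for ch, cat in categories(".,-+eE").items():
    print(repr(ch), cat)
```

Nothing there separates a decimal point from a comma or an exponent marker
from any other letter, so presumably that knowledge has to come from
somewhere other than the character properties.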

> The attached patch will let the chartype layer decide if a character is a
> digit, and what its value is.

The patch seems to be missing though...

> Note also that is_digit should now return the value of the digit if it is a
> digit, or 42 if it isn't.  (I had to use something, and ~0 sometimes wanted
> to be (char)~0, and sometimes (INTVAL)~0, so I decided not to use ~0.  0, of
> course, can't be used for not-a-digit, since is_digit('0')==0.)

I was assuming there would be a separate digit_value() routine to avoid
that problem. Apart from anything else there will doubtless be many
other is_xxx() routines in due course which will be simple boolean
tests.
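
Roughly what I have in mind, sketched in Python rather than C purely to
show the shape of the interface. The bodies lean on Python's unicodedata
module and are an assumption on my part, not what the chartype layer would
actually contain.

```python
import unicodedata

def is_digit(ch):
    """Pure boolean test, like any other is_xxx() routine."""
    return unicodedata.category(ch) == "Nd"

def digit_value(ch):
    """Return the digit's value; only meaningful when is_digit(ch) is true."""
    if not is_digit(ch):
        raise ValueError("not a decimal digit: %r" % ch)
    return unicodedata.decimal(ch)

# '0' is a digit whose value is 0, so no sentinel such as 42 or ~0 is needed.
for ch in ("0", "9", "x"):
    if is_digit(ch):
        print(repr(ch), "digit, value", digit_value(ch))
    else:
        print(repr(ch), "not a digit")
```

With the two routines split like this, zero is an ordinary return value and
the not-a-digit case never has to be squeezed into the same integer.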

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu
