[issue6632] Include more fullwidth chars in the decimal codec

Marc-Andre Lemburg Tue, 22 Sep 2009 09:55:48 -0700

Marc-Andre Lemburg <[email protected]> added the comment:

Martin v. Löwis wrote:
> 
> Martin v. Löwis <[email protected]> added the comment:
> 
>> int()/float() use the decimal codec for numbers - this only supports
>> base-10 numbers. For hex numbers, we'd need a new hex codec (only
>> the encoder part, actually), otherwise, int('a') would start to return
>> 10.
> 
> That's not true. PyUnicode_EncodeDecimal could happily accept hexdigits,
> and int(u'a') would still be rejected. In fact, PyUnicode_EncodeDecimal
> *already* accepts arbitrary Latin-1 characters, whether they represent
> digits or not. I suppose this is to support non-decimal bases, so it
> would only be consequential to widen this to all characters that
> reasonably have the Hex_Digit property (although I'm unsure which ones
> are excluded at the moment).


The codec currently doesn't look at the base at all - and shouldn't
need to:

It simply converts input characters that have a decimal digit value
associated with them, to the usual ASCII digits in preparation
for parsing them using the standard number parsing tools we have in
Python.

This is to support number representations that use non-ASCII code
points for digits (e.g. Japanese or Sanskrit numbers):

http://sp.cis.iwate-u.ac.jp/sp/lessonj/doc/numbers.html
http://veda.wikidot.com/sanskrit-numbers

All other Latin-1 characters are passed through as-is, so you
can already use the codec to prepare other input for parsing,
e.g. hex values.
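
As a rough illustration of the conversion described above, here is a
pure-Python sketch. It is not the C implementation of
PyUnicode_EncodeDecimal: the helper name to_ascii_digits is made up for
this example, it relies on unicodedata.decimal() to look up the decimal
digit value, and it simply passes every other character through without
modelling how the C function treats non-Latin-1 input.

```python
import unicodedata


def to_ascii_digits(text):
    """Map characters with a decimal digit value to ASCII digits.

    Hypothetical helper for illustration only; all other characters
    are copied through unchanged.
    """
    out = []
    for ch in text:
        # Returns the decimal digit value, or None if the character has none.
        value = unicodedata.decimal(ch, None)
        out.append(str(value) if value is not None else ch)
    return "".join(out)


# Fullwidth digits are normalised before the usual parser sees them.
number = int(to_ascii_digits("\uff14\uff12"))

# Letters have no decimal digit value, so hex input passes through as-is
# and the base is only applied by the parser afterwards.
hex_number = int(to_ascii_digits("1f"), 16)
```

Note that the base only comes into play in the int() call, not in the
conversion step.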

Also note that we already have a hex codec in Python 2.x
which converts between the hex representations of a string
and its regular form. This was removed in 3.x for some reason
I don't understand (probably just an oversight).
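
For reference, the kind of conversion that codec performs can be
sketched with the binascii module, which is available in both 2.x and
3.x. This only illustrates the round trip between a byte string and its
hex representation; it is not the codec itself, and the variable names
are just for the example.

```python
import binascii

raw = b"abc"

# Byte string -> hex representation (two hex digits per byte).
hex_form = binascii.hexlify(raw)

# Hex representation -> original byte string.
restored = binascii.unhexlify(hex_form)
```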

----------

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue6632>
_______________________________________