Marc-Andre Lemburg <m...@egenix.com> added the comment:

Martin v. Löwis wrote:
>
> Martin v. Löwis <mar...@v.loewis.de> added the comment:
>
>> int()/float() use the decimal codec for numbers - this only supports
>> base-10 numbers. For hex numbers, we'd need a new hex codec (only
>> the encoder part, actually), otherwise, int('a') would start to return
>> 10.
>
> That's not true. PyUnicode_EncodeDecimal could happily accept hexdigits,
> and int(u'a') would still be rejected. In fact, PyUnicode_EncodeDecimal
> *already* accepts arbitrary Latin-1 characters, whether they represent
> digits or not. I suppose this is to support non-decimal bases, so it
> would only be consequential to widen this to all characters that
> reasonably have the Hex_Digit property (although I'm unsure which ones
> are excluded at the moment).

The codec currently doesn't look at the base at all, and it shouldn't
need to: it simply converts input characters that have a decimal digit
value associated with them to the usual ASCII digits, in preparation
for parsing them with the standard number parsing tools we have in
Python.

This is to support number representations that use non-ASCII code
points for digits (e.g. Japanese or Sanskrit numbers):

http://sp.cis.iwate-u.ac.jp/sp/lessonj/doc/numbers.html
http://veda.wikidot.com/sanskrit-numbers

All other Latin-1 characters are passed through as-is, so you can
already use the codec to prepare hex values for parsing, for example.

Also note that we already have a hex codec in Python 2.x, which
converts between the hex representation of a string and its regular
form. This was removed in 3.x for some reason I don't understand
(probably just an oversight).

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6632>
_______________________________________
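
To illustrate the conversion described in this comment, here is a rough
Python 3 sketch of the digit mapping. It is not the C implementation of
PyUnicode_EncodeDecimal; the helper name to_ascii_digits is hypothetical,
and the sketch only covers the "decimal digit to ASCII, everything else
passed through" behaviour, using unicodedata.decimal() to look up the
decimal digit value.

```python
import unicodedata


def to_ascii_digits(text):
    """Hypothetical helper: replace every character that has a Unicode
    decimal digit value with the matching ASCII digit and pass all
    other characters through unchanged."""
    out = []
    for ch in text:
        # unicodedata.decimal() returns the default (None here) for
        # characters that have no decimal digit value.
        value = unicodedata.decimal(ch, None)
        out.append(str(value) if value is not None else ch)
    return "".join(out)


devanagari = "\u0967\u0968\u0969"  # DEVANAGARI DIGIT ONE, TWO, THREE
fullwidth = "\uff14\uff12"         # FULLWIDTH DIGIT FOUR, TWO

print(to_ascii_digits(devanagari))      # ASCII form of the Devanagari digits
print(to_ascii_digits(fullwidth))       # ASCII form of the fullwidth digits
print(to_ascii_digits("ff"))            # hex letters pass through untouched
print(int(to_ascii_digits("ff"), 16))   # the base is applied only by int()
```

The mapping itself never looks at the base: "ff" comes out unchanged and
only becomes a number once int() is told to parse it as base 16. Without
an explicit base, int("a") is still rejected with a ValueError.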