Mark Dickinson <dicki...@gmail.com> added the comment:

> What do you think about adding number parsers that operate directly on
> Py_UNICODE* strings?

I think that might make some sense.  It's not without difficulties, though.  
One issue is that we'd still need the char* -> double operations, partly 
because PyOS_string_to_double is part of the public API, and partly to continue 
to support creation of a float from a bytes instance.
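
As a small illustration of the two input kinds mentioned above, the Python-level float() accepts both str and bytes. This shows only the visible behaviour, not the C internals, and the variable names are just for the example.

```python
# float() accepts both text and bytes, so both conversion paths
# have to keep working whatever the parser operates on internally.
from_text = float("1.5e3")
from_bytes = float(b"1.5e3")
print(from_text, from_bytes)
```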

The other issue is that for floats, it's difficult to separate the parser from 
the base conversion;  to be useful, we'd probably end up making the whole of 
dtoa.c Py_UNICODE aware.  (One of the return values from the dtoa.c parser is a 
pointer to the significant digits in the original input string;  so the 
base-conversion calculation itself needs access to portions of the original 
string.)

Ideally, for float(string), we'd have a zero-copy setup that operated directly 
on the unicode input (read-only);  but I think that achieving that right now is 
going to be messy, and involve dtoa.c knowing far more about Unicode than I'd 
be comfortable with.

N.B. If we didn't have to deal with alternative digits, it *really* would be 
much simpler.

Perhaps a compromise option is available: one that does a preliminary pass on 
the Unicode string and only makes a copy if non-European digits are discovered.
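
A rough sketch of that compromise, written in Python rather than C purely for illustration. The helper names has_non_european_digits and to_ascii_digits are hypothetical and do not exist in CPython. The sketch assumes that "non-European digits" means any non-ASCII character with a Unicode decimal value.

```python
import unicodedata


def has_non_european_digits(text):
    """Preliminary read-only pass: True if any non-ASCII decimal digit occurs."""
    for ch in text:
        if ord(ch) > 127 and unicodedata.decimal(ch, None) is not None:
            return True
    return False


def to_ascii_digits(text):
    """Return text itself when no digit needs translating, else a new copy."""
    if not has_non_european_digits(text):
        return text  # common case: the original string is used as-is
    pieces = []
    for ch in text:
        value = unicodedata.decimal(ch, None)
        # Replace any decimal digit with its ASCII form; keep other characters.
        pieces.append(str(value) if value is not None else ch)
    return "".join(pieces)


plain = "1.5e10"
arabic_indic = "\u0663.\u0661\u0664"  # ARABIC-INDIC DIGITS THREE, ONE, FOUR
print(to_ascii_digits(plain) is plain)
print(to_ascii_digits(arabic_indic))
```

With this arrangement only inputs that actually contain alternative digits pay for a copy. The parser proper would then see ASCII digits only.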

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10557>
_______________________________________