[Python-Dev] Virus on python-3.1.2.msi?
Issue 10500 claims that the 3.1.2 installer has the virus Palevo.DZ. Can
somebody with a virus scanner please confirm or contest that claim?

Thanks,
Martin

http://bugs.python.org/issue10500
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] constant/enum type in stdlib
On 28/11/2010 03:20, Terry Reedy wrote:
> On 11/27/2010 6:26 PM, Raymond Hettinger wrote:
>> Can I suggest that an enum-maker be offered as a third-party module
>
> Possibly with competing versions for trial and testing ;-)
>
>> rather than prematurely adding it into the standard library.
>
> I had same thought.

There are already *several* enum packages for Python available. The
implementation by Ben Finney, associated with the previous PEP, is on PyPI
and the most recent release has over 4000 downloads making it reasonably
popular:

http://pypi.python.org/pypi/enum/

Other contenders include flufl.enum and lazr.enum. The Twisted guys would
like a named constant type, and have a ticket for it, and PyQt has its own
implementation (subclassing int) providing this functionality.

In terms of assessing *general* usefulness in the wider community that step
has already been done.

This discussion came out of yet-another-set-of-integer-constants being added
to the Python standard library (since changed to strings). We have integer
constants, with the associated inscrutability when used from the interactive
interpreter or debugging, in *many* standard library modules.

The particular features and use cases being discussed have use *within* the
standard library in mind. Releasing
yet-another-enum-library-that-the-standard-library-can't-use would be a
particularly pointless outcome of this discussion.

The decision is whether or not to use named constants in the standard
library, otherwise we can just point people at one of the several existing
packages.

All the best,
Michael Foord
--
http://www.voidspace.org.uk/
READ CAREFULLY. By accepting and reading this email you agree, on behalf of
your employer, to release me from all obligations and waivers arising from
any and all NON-NEGOTIATED agreements, licenses, terms-of-service,
shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure,
non-compete and acceptable use policies (”BOGUS AGREEMENTS”) that I have
entered into with your employer, its partners, licensors, agents and
assigns, in perpetuity, without prejudice to my ongoing rights and
privileges. You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.
Re: [Python-Dev] Question about GDB bindings and 32/64 bits
On 26.11.2010 05:11, Jesus Cea wrote:
> I have installed GDB 7.2 32 bits and 32 bits buildslaves are green.
> Nevertheless 64 bits buildslaves are failing test_gdb.
>
> Is there any expectation that a 32 bits GDB be able to debug a 64 bits
> python? If not, gdb test should compare "platform.architecture()" (for
> python and gdb in the system) and run only when they are the same.

that would be too restrictive, as a 64bit gdb is able to handle 32bit
binaries too.

> If this should work, I would open a bug and maybe spend some time with it.
> But before thinking about investing time, I would like to know if this mix
> is actually expected or not to work. If not, I would consider to install a
> 64 bits GDB too and do some tricks (like using an "/usr/local/bin/gdb"
> script wrapper to choose 32/64 "real" gdb version) to actually execute
> "test_gdb" in both buildslaves (they are running in the same physical
> machine).

yes, and then you should be able to use this gdb for both 32 and 64bit
builds. No need for a wrapper (Such a gdb is available in the gdb64 package
on Debian/Ubuntu).

Matthias
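The comparison proposed in the quoted question can be sketched with the standard library alone. This is only an illustration of the idea, not what test_gdb does: the helper names `bits_of` and `gdb_matches_python` are hypothetical, and as the reply points out, strict equality would wrongly skip a 64-bit gdb that can debug 32-bit binaries.

```python
import platform
import shutil
import sys


def bits_of(executable):
    """Return '32bit' or '64bit' (possibly '' if unknown) for an executable."""
    return platform.architecture(executable)[0]


def gdb_matches_python(gdb_name="gdb"):
    """True if a gdb found on PATH reports the same word size as this Python."""
    gdb_path = shutil.which(gdb_name)
    if gdb_path is None:
        return False  # no gdb installed, nothing to compare
    return bits_of(gdb_path) == bits_of(sys.executable)


print("python:", bits_of(sys.executable))
print("gdb matches:", gdb_matches_python())
```

A less restrictive check would only skip the test when gdb is 32-bit and Python is 64-bit.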
Re: [Python-Dev] constant/enum type in stdlib
On 28/11/2010 02:38, Nick Coghlan wrote:
On Sun, Nov 28, 2010 at 9:26 AM, Raymond Hettinger wrote:
On Nov 27, 2010, at 12:56 PM, Glenn Linderman wrote:
On 11/27/2010 2:51 AM, Nick Coghlan wrote:
Not quite. I'm suggesting a factory function that works for any value,
and derives the parent class from the type of the supplied value.
Nick, thanks for the much better implementation than I achieved; you seem to
have the same goals as my implementation. I learned a bit making mine, and
more understanding yours to some degree. What I still don't understand about
your implementation, is that when adding one additional line to your file, it
fails:
w = named_value("ABC", z )
Now I can understand why it might not be a good thing to make a named value of
a named value (confusing, at least), but I was surprised, and still do not
understand, that it failed, reporting that __new__() takes exactly 3 arguments
(2 given).
Can I suggest that an enum-maker be offered as a third-party module rather than
prematurely adding it into the standard library.
Indeed. Glenn's failing example suggests to me that using a new
metaclass is probably going to be a cleaner option than trying to
dance around type's default behaviour within an ordinary class
definition (if nothing else, a separate metaclass makes it much easier
to detect when you're dealing with an instance of a named type).
Yep, for representing a group of names a single class with a metaclass
seems like a reasonable approach. See my note below about agreeing
minimal feature-set and minimal-api before we discuss implementation
though.
Regardless, I still see value in approaching this whole discussion as
a two-level design problem, with "named values" as the more
fundamental concept, and then higher level grouping APIs to get
enum-style behaviour.
It seems like using the term "enum" provokes a strong negative reaction
in some of the core-devs who are basically in favour of named constants
and not actively against grouping. I'm happy with NamedConstant and
GroupedNames (or similar) and dropping the use of the term enum.
There are also valid concerns about over-engineering (and not so valid
concerns...). Simplicity in creating them and no additional burden in
using them are fundamental, but in the APIs / implementations suggested
so far I think we are keeping that in mind.
Eventually attaining "One Obvious Way" for the
former seems achievable to me, while the diversity of use cases for
grouping APIs suggests to me that "one-size-fits-all" isn't going to
work unless that "one size" is a Frankenstein API with more options
than anyone could reasonably hope to keep in their head at once.
Well... yes - treating it as a two level design problem is fine.
I don't think there are *many* competing features, in fact as far as
feature requests on python-dev go I think this is a relatively
straightforward one with a lot of *agreement* on the basic functionality.
We have had various discussions about what the API should look like, or
what the implementation should look like, but I don't think there is a
lot of disagreement about basic features. There are some 'optional
features'. Many of these can be added later without backwards
compatibility issues, so those can profitably be omitted from an initial
implementation.
Features as I see them:
Named constant
--
* Nice repr
* Subclass of the type it represents
* Trivially easy to convert to either a string (name) or the value it
represents
* If an integer type, can be OR'd with other named constants and retains
a useful repr
Grouped constants
--
* Easy to create a group of named constants, accessible as attributes on
group object
* Capability to go from name or value to corresponding constants
Optional Features
---
* Ability to dynamically add new named values to a group. (Suggested by
Guido)
* Ability to test if a name or value is in a group
* Ability to list all names in a group
* ANDing as well as ORing
* Constants are unique
* OR'ing with an integer will look up the name (or calculate it if the
int itself represents flags that have already been OR'd) and return a
named value (with useful repr) instead of just an integer
* Named constants could be named values that can wrap *any* type and not
just immutable values. (Note that wrapping mutable types makes providing
"from_value" functionality harder *unless* we guarantee that named
values are unique. If they aren't unique, named values for a mutable type
can have different values and there is no single definition of what the
named value actually is.)
* Requiring that values only have one name - or alternatively that values
on a group could have multiple names (obviously incompatible features).
* Requiring all names in a group to be of the same type
* Allow names to be set automatically in a namespace, for example in a
class namespace or on a module
* Allow subclassing and adding of new
Re: [Python-Dev] constant/enum type in stdlib
On 28/11/2010 16:28, Michael Foord wrote:
[snip...]
I don't think there are *many* competing features, in fact as far as
feature requests on python-dev go I think this is a relatively
straightforward one with a lot of *agreement* on the basic functionality.
We have had various discussions about what the API should look like,
or what the implementation should look like, but I don't think there
is a lot of disagreement about basic features. There are some
'optional features'. Many of these can be added later without
backwards compatibility issues, so those can profitably be omitted
from an initial implementation.
Features as I see them:
Named constant
--
* Nice repr
* Subclass of the type it represents
* Trivially easy to convert to either a string (name) or the value it
represents
* If an integer type, can be OR'd with other named constants and
retains a useful repr
Note that having an OR repr is meaningless *unless* the constants are
intended to be flags, so support for OR'ing should be specified explicitly:
name = NamedValue('name', value, flags=True)
Where flags defaults to False. Typically you will use this through the
grouping API anyway - where it can either be a keyword argument
(slightly annoying because the suggestion is to create the named values
through keyword arguments) or we can have two group-factory functions:
Group = make_constants('Group', name1=value1, name2=value2)
Flags = make_flags('Flags', name1=value1, name2=value2)
It is sensible if flag values are only powers of 2; we could enforce
that or not... (Another one for the optional feature list.)
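A minimal sketch of what the `flags=True` idea could look like for integers. This is an assumption-laden illustration, not the proposed implementation: the class name `NamedInt`, the `<NAME=value>` repr format, and the decision to fall back to a plain `int` for non-flag operands are all choices made here for the example.

```python
class NamedInt(int):
    """Hypothetical int subclass that carries a name for its repr."""

    def __new__(cls, name, value, flags=False):
        self = super().__new__(cls, value)
        self._name = name
        self._flags = flags
        return self

    def __repr__(self):
        return "<%s=%d>" % (self._name, int(self))

    def __or__(self, other):
        result = int(self) | int(other)
        both_flags = self._flags and getattr(other, "_flags", False)
        if not both_flags:
            # OR has no useful named meaning unless both operands are flags.
            return result
        combined_name = "%s|%s" % (self._name, other._name)
        return NamedInt(combined_name, result, flags=True)


READ = NamedInt("READ", 1, flags=True)
WRITE = NamedInt("WRITE", 2, flags=True)
print(repr(READ | WRITE))
```

Because the result of OR'ing two flags is itself a `NamedInt`, it keeps a readable repr while still behaving as an ordinary integer everywhere else.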
I forgot auto-enumeration (specifying names only and having values
autogenerated) from the optional feature set by the way. I think Antoine
strongly disapproves of this feature because it reminds him of C enums.
Mark Dickinson thinks that the flags feature could be an optional
feature too. If we have ORing it makes sense to have ANDing, so I guess
they belong together. I think there is value in it though.
I realise that the optional feature list is now not small, and
implementing all of it would create the "franken-api" Nick is worried
about. The minimal feature list is nicely small though and provides
useful functionality.
All the best,
Michael
Grouped constants
--
* Easy to create a group of named constants, accessible as attributes
on group object
* Capability to go from name or value to corresponding constants
Optional Features
---
* Ability to dynamically add new named values to a group. (Suggested
by Guido)
* Ability to test if a name or value is in a group
* Ability to list all names in a group
* ANDing as well as ORing
* Constants are unique
* OR'ing with an integer will look up the name (or calculate it if the
int itself represents flags that have already been OR'd) and return a
named value (with useful repr) instead of just an integer
* Named constants could be named values that can wrap *any* type and not
just immutable values. (Note that wrapping mutable types makes
providing "from_value" functionality harder *unless* we guarantee that
named values are unique. If they aren't unique, named values for a
mutable type can have different values and there is no single
definition of what the named value actually is.)
* Requiring that values only have one name - or alternatively that
values on a group could have multiple names (obviously incompatible
features).
* Requiring all names in a group to be of the same type
* Allow names to be set automatically in a namespace, for example in a
class namespace or on a module
* Allow subclassing and adding of new values only present in subclass
I'd rather we agree a suitable (minimal) API and feature set and go to
implementation from that.
For wrapping mutable types I'm tempted to say YAGNI. For the standard
library wrapping integers meets almost all our use-cases except for
one float. (At work we have a decimal constant as it happens.) Perhaps
we could require immutable types for groups but allow arbitrary values
for individual named values?
For the named values api:
name = NamedValue('name', value)
For the grouping (tentatively accepted as reasonable by Antoine):
Group = make_constants('Group', name1=value1, name2=value2)
name1, name2 = Group.name1, Group.name2
flag = name1 | name2
value = int(Group.name1)
name = Group('name1')
# alternatively: value = Group.from_name('name1')
name = Group.from_value(value1)
# Group(value1) could work only if values aren't strings
# perhaps: name = Group(value=value1)
Group.new_name = value3 # create new value on the group
names = Group.all_names()
# further bikeshedding on spelling of all_names required
# correspondingly 'all_values' I guess, returning the constants themselves
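The grouping API listed above can be sketched roughly as follows. This is one possible reading under stated assumptions, not an agreed design: it covers only integer values, the nested `Constant` class and the repr format are invented for the example, and the `Group('name1')` call form and dynamic attribute assignment are left out.

```python
def make_constants(group_name, **names):
    """Build a hypothetical group class whose attributes are named ints."""

    class Constant(int):
        def __new__(cls, name, value):
            self = super().__new__(cls, value)
            self.name = name
            return self

        def __repr__(self):
            return "<%s.%s=%d>" % (group_name, self.name, int(self))

    members = {name: Constant(name, value) for name, value in names.items()}

    class Group:
        @classmethod
        def from_name(cls, name):
            return members[name]

        @classmethod
        def from_value(cls, value):
            for constant in members.values():
                if constant == value:
                    return constant
            raise ValueError("no constant with value %r" % (value,))

        @classmethod
        def all_names(cls):
            return list(members)

    # Expose each constant as an attribute on the group object.
    for name, constant in members.items():
        setattr(Group, name, constant)
    Group.__name__ = group_name
    return Group


Colour = make_constants("Colour", RED=1, GREEN=2)
print(repr(Colour.RED), Colour.from_value(2), Colour.all_names())
```

Note that `from_value` returns the first match, so this sketch silently tolerates two names sharing a value; requiring uniqueness is one of the optional features under discussion.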
Some of the optional features couldn't later be added without
backwards compatibility concerns (I think the type checking features
and requiring unique values for example). We should at le
Re: [Python-Dev] constant/enum type in stdlib
On 28/11/2010 17:05, Michael Foord wrote:
[snip...]
It is sensible if flag values are only powers of 2; we could enforce
that or not... (Another one for the optional feature list.)
Another 'optional' feature I omitted was Phillip J. Eby's suggestion /
requirement that named values be pickleable. Email is clunky for
handling this. Is there enough support (there is still some objection,
that's for sure) to revive the PEP or create a new one?
I also didn't include Nick's suggested API, which is slightly different
from the one I suggested:
silly = Namegroup.from_names("Silly", "FOO", "BAR", "BAZ")
>>> silly.FOO
Silly.FOO=0
>>> int(silly.FOO)
0
>>> silly(0)
Silly.FOO=0
x = named_value("FOO", 1)
y = named_value("BAR", "Hello World!")
z = named_value("BAZ", dict(a=1, b=2, c=3))
set_named_values(globals(), foo=x._raw(), bar=y._raw(), baz=z._raw())
Where a named value created from an integer is an int subclass, from a
dict a dict subclass and so on.
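The "subclass derived from the type of the supplied value" idea can be illustrated with a small factory. This is a simplified sketch and not Nick's posted implementation: the repr format and the `_name` attribute are assumptions, and types that cannot be subclassed (such as `bool`) would raise `TypeError` here.

```python
def named_value(name, value):
    """Return `value` re-created as an instance of a subclass that knows `name`."""
    base = type(value)

    class Named(base):
        def __repr__(self):
            # Prefix the normal repr of the underlying type with the name.
            return "%s=%s" % (self._name, base.__repr__(self))

    Named.__name__ = "Named" + base.__name__.capitalize()
    result = Named(value)
    result._name = name
    return result


x = named_value("FOO", 1)
y = named_value("BAR", "Hello World!")
z = named_value("BAZ", dict(a=1))
print(repr(x), repr(y), repr(z))
```

Each call creates a fresh class, which keeps the sketch short but would be wasteful in practice; caching one subclass per base type is an obvious refinement.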
Michael
Some of the optional features couldn't later be added without
backwards compatibility concerns (I think the type checking features
and requiring unique values for example). We should at least consider
these if we are to make adding them later difficult. I would be fine
with not having these features.
All the best,
Michael
Cheers,
Nick.
Re: [Python-Dev] constant/enum type in stdlib
Michael Foord wrote:
> Another 'optional' feature I omitted was Phillip J. Eby's suggestion /
> requirement that named values be pickleable. Email is clunky for handling
> this, is there enough support (there is still some objection that is sure)
> to revive the PEP or create a new one?

I think it definitely needs a PEP. I don't care whether you revive the old
PEP or write a new one.

--
Steven
Re: [Python-Dev] constant/enum type in stdlib
On 28/11/2010 18:05, Steven D'Aprano wrote:
> Michael Foord wrote:
>> Another 'optional' feature I omitted was Phillip J. Eby's suggestion /
>> requirement that named values be pickleable. Email is clunky for handling
>> this, is there enough support (there is still some objection that is sure)
>> to revive the PEP or create a new one?
>
> I think it definitely needs a PEP. I don't care whether you revive the old
> PEP or write a new one.

Well, "if it were to be accepted it would need a PEP" and "the next step
should be a PEP" are slightly different statements. :-)

As I agree with the former *anyway* at the worst starting a PEP will waste
time, so I guess I'll get that underway when I get a chance...

Thanks

Michael
--
http://www.voidspace.org.uk/
[Python-Dev] Python and the Unicode Character Database
Two recently reported issues brought to light the fact that the Python
language definition is closely tied to character properties maintained
by the Unicode Consortium. [1,2] For example, when Python switches to
Unicode 6.0.0 (planned for the upcoming 3.2 release), we will gain two
additional characters that Python can use in identifiers. [3]
With Python 3.1:
>>> exec('\u0CF1 = 1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    ೱ = 1
      ^
SyntaxError: invalid character in identifier
but with Python 3.2a4:
>>> exec('\u0CF1 = 1')
>>> eval('\u0CF1')
1
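To see which side of this change a given interpreter is on, the bundled Unicode database can be queried directly. A small sketch using only the standard `unicodedata` module; the variable name `CHAR` is arbitrary, and the printed values depend on the Unicode version the interpreter ships with.

```python
import unicodedata

CHAR = "\u0cf1"  # one of the two characters discussed above

print("Unicode database version:", unicodedata.unidata_version)
print("name:", unicodedata.name(CHAR, "<unnamed>"))
print("general category:", unicodedata.category(CHAR))
print("usable as identifier:", CHAR.isidentifier())
```

On an interpreter built against Unicode 6.0.0 or later the last line should report `True`, matching the 3.2a4 session shown above.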
Of course, the likelihood is low that this change will affect any
user, but the change in str.isspace() reported in [1] is likely to
cause some trouble:
Python 2.6.5:
>>> u'A\u200bB'.split()
[u'A', u'B']
Python 2.7:
>>> u'A\u200bB'.split()
[u'A\u200bB']
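The newer behaviour can be checked on a current Python 3 interpreter, where U+200B ZERO WIDTH SPACE is no longer treated as whitespace. The snippet below is only a sketch of how code that relied on the old splitting could be made explicit; passing the separator to `split()` is a suggestion made here, not something proposed in the thread.

```python
import unicodedata

ZWSP = "\u200b"  # ZERO WIDTH SPACE
text = "A" + ZWSP + "B"

print(unicodedata.category(ZWSP))  # general category from the UCD
print(ZWSP.isspace())              # whether str methods treat it as space
print(text.split())                # default split: whitespace only
print(text.split(ZWSP))            # explicit separator restores the split
```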
While we have little choice but to follow UCD in defining
str.isidentifier(), I think Python can promise users more stability in
what it treats as space or as a digit in its builtins. For example,
I don't think that supporting
>>> float('١٢٣٤.٥٦')
1234.56
is more important than to assure users that once their program
accepted some text as a number, they can assume that the text is
ASCII.
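A program that needs the ASCII guarantee today can enforce it itself before calling `float()`. A minimal sketch, assuming a hypothetical helper name `parse_ascii_float`; it uses `str.encode("ascii")` so that it does not depend on newer string methods.

```python
def parse_ascii_float(text):
    """Like float(), but reject any input containing non-ASCII characters."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        raise ValueError("non-ASCII numeric text: %r" % (text,))
    return float(text)


print(float("١٢٣٤.٥٦"))              # builtin accepts Unicode decimal digits
print(parse_ascii_float("1234.56"))  # wrapper accepts ASCII only
```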
[1] http://bugs.python.org/issue10567
[2] http://bugs.python.org/issue10557
[3] http://www.unicode.org/versions/Unicode6.0.0/#Database_Changes
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, 28 Nov 2010 15:24:37 -0500
Alexander Belopolsky wrote:
> While we have little choice but to follow UCD in defining
> str.isidentifier(), I think Python can promise users more stability in
> what it treats as space or as a digit in its builtins.
Well, if "unicode support" means "support the latest version of the
Unicode standard", I'm not sure we have a choice.
We can make exceptions, but that would only confuse users even more,
wouldn't it?
> For example,
> I don't think that supporting
>
> >>> float('١٢٣٤.٥٦')
> 1234.56
>
> is more important than to assure users that once their program
> accepted some text as a number, they can assume that the text is
> ASCII.
Why would they assume the text is ASCII?
Regards
Antoine.
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, Nov 28, 2010 at 3:43 PM, Antoine Pitrou wrote:
..
>> For example,
>> I don't think that supporting
>>
>> >>> float('١٢٣٤.٥٦')
>> 1234.56
>>
>> is more important than to assure users that once their program
>> accepted some text as a number, they can assume that the text is
>> ASCII.
>
> Why would they assume the text is ASCII?
def deposit(self, amountstr):
    self.balance += float(amountstr)
    audit_log("Deposited: " + amountstr)
Auditor:
$ cat numbered-account.log
Deposited: ?.??
...
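One way to keep such a log readable is to normalise the digits before writing them out. This is a sketch of that idea rather than anything proposed in the thread: the helper name `to_ascii_digits` is hypothetical, and it relies on `unicodedata.decimal()`, which returns the decimal value of a character or the supplied default.

```python
import unicodedata


def to_ascii_digits(text):
    """Replace every Unicode decimal digit with its ASCII equivalent."""
    out = []
    for ch in text:
        value = unicodedata.decimal(ch, None)
        # Non-digits (signs, the decimal point, letters) pass through unchanged.
        out.append(str(value) if value is not None else ch)
    return "".join(out)


print("Deposited: " + to_ascii_digits("١٢٣٤.٥٦"))
```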
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, 28 Nov 2010 15:58:33 -0500
Alexander Belopolsky wrote:
> On Sun, Nov 28, 2010 at 3:43 PM, Antoine Pitrou wrote:
> ..
> >> For example,
> >> I don't think that supporting
> >>
> >> >>> float('١٢٣٤.٥٦')
> >> 1234.56
> >>
> >> is more important than to assure users that once their program
> >> accepted some text as a number, they can assume that the text is
> >> ASCII.
> >
> > Why would they assume the text is ASCII?
>
> def deposit(self, amountstr):
>     self.balance += float(amountstr)
>     audit_log("Deposited: " + amountstr)
>
> Auditor:
>
> $ cat numbered-account.log
> Deposited: ?.??
I'm not sure that's how banking applications are written :)
Antoine.
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, Nov 28, 2010 at 7:04 PM, Antoine Pitrou wrote:
> On Sun, 28 Nov 2010 15:58:33 -0500
> Alexander Belopolsky wrote:
>
>> On Sun, Nov 28, 2010 at 3:43 PM, Antoine Pitrou wrote:
>> ..
>> >> For example,
>> >> I don't think that supporting
>> >>
>> >> >>> float('١٢٣٤.٥٦')
>> >> 1234.56
>> >>
>> >> is more important than to assure users that once their program
>> >> accepted some text as a number, they can assume that the text is
>> >> ASCII.
>> >
>> > Why would they assume the text is ASCII?
>>
>> def deposit(self, amountstr):
>>     self.balance += float(amountstr)
>>     audit_log("Deposited: " + amountstr)
>>
>> Auditor:
>>
>> $ cat numbered-account.log
>> Deposited: ?.??
>
>
> I'm not sure that's how banking applications are written :)
>
+1 for this being bogus - I see no reason why numbers inside Unicode
text should have to be "ASCII" now that we have overcome the technical
barriers that once required it. ASCII is an oversimplification of human
communication, needed for computing devices not complex enough to
represent it fully.
Let novice C programmers in English speaking countries deal with the
fact that 1 character is not 1 byte anymore. We are past this point.
js
-><-
> Antoine.
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, Nov 28, 2010 at 4:12 PM, Joao S. O. Bueno wrote:
..
> Let novice C programmers in English speaking countries deal with the
> fact that 1 character is not 1 byte anymore. We are past this point.

If you are, please contribute your expertise here:
http://bugs.python.org/issue2382
Re: [Python-Dev] constant/enum type in stdlib
Rob Cliffe wrote:
> But couldn't they be presented to the Python programmer as a single type,
> with the implementation details hidden "under the hood"?

Not in CPython, because tuple items are kept in the same block of memory as
the object header. Because CPython can't move objects, this means that the
size of the tuple must be known when the object is created.

--
Greg
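The layout described here can be observed indirectly from Python. The following is a CPython-specific sketch: it assumes that each tuple slot is one pointer stored inline after the header, so that `sys.getsizeof()` grows by exactly one pointer per item. Other implementations may report different numbers.

```python
import struct
import sys

POINTER_SIZE = struct.calcsize("P")  # size of a C pointer on this build

empty = sys.getsizeof(())
triple = sys.getsizeof((1, 2, 3))

print("empty tuple:", empty, "bytes")
print("3-item tuple:", triple, "bytes")
print("bytes per extra slot:", (triple - empty) // 3)
```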
Re: [Python-Dev] Python and the Unicode Character Database
>> >>> float('١٢٣٤.٥٦')
>> 1234.56
I think it's a bug that this works. The definition of the float builtin says
Convert a string or a number to floating point. If the argument is a
string, it must contain a possibly signed decimal or floating point
number, possibly embedded in whitespace. The argument may also be
'[+|-]nan' or '[+|-]inf'.
Now, one may wonder what precisely a "possibly signed floating point
number" is, but most likely, this refers to
floatnumber   ::= pointfloat | exponentfloat
pointfloat    ::= [intpart] fraction | intpart "."
exponentfloat ::= (intpart | pointfloat) exponent
intpart       ::= digit+
fraction      ::= "." digit+
exponent      ::= ("e" | "E") ["+" | "-"] digit+
digit         ::= "0"..."9"
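The grammar above can be turned into a validator that accepts ASCII digits only. This is a sketch under assumptions: the names `FLOATNUMBER` and `is_ascii_number` are made up, a bare integer part is also accepted (to cover the "decimal" case in the docstring), and the `nan`/`inf` spellings are deliberately left out.

```python
import re

# [0-9] is used instead of \d because \d also matches non-ASCII digits
# when the pattern is a str.
FLOATNUMBER = re.compile(
    r"""
    [+-]?                              # optional sign
    (?:[0-9]*\.[0-9]+ | [0-9]+\.?)     # pointfloat, or a bare intpart
    (?:[eE][+-]?[0-9]+)?               # optional exponent
    """,
    re.VERBOSE,
)


def is_ascii_number(text):
    """True if text (ignoring surrounding whitespace) fits the grammar."""
    return FLOATNUMBER.fullmatch(text.strip()) is not None


for sample in ["1234.56", " -1e10 ", ".5", "5.", "١٢٣٤.٥٦", "infinity"]:
    print(repr(sample), is_ascii_number(sample))
```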
Regards,
Martin
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, Nov 28, 2010 at 5:17 PM, "Martin v. Löwis" wrote:
>>> >>> float('١٢٣٤.٥٦')
>>> 1234.56
>
> I think it's a bug that this works. The definition of the float builtin says
>
> Convert a string or a number to floating point. If the argument is a
> string, it must contain a possibly signed decimal or floating point
> number, possibly embedded in whitespace. The argument may also be
> '[+|-]nan' or '[+|-]inf'.
>
This definition fails long before we get beyond the 127th code point:
>>> float('infinity')
inf
Re: [Python-Dev] Python and the Unicode Character Database
"Martin v. Löwis" wrote:
>>> >>> float('١٢٣٤.٥٦')
>>> 1234.56
>
> I think it's a bug that this works. The definition of the float builtin says
>
> Convert a string or a number to floating point. If the argument is a
> string, it must contain a possibly signed decimal or floating point
> number, possibly embedded in whitespace. The argument may also be
> '[+|-]nan' or '[+|-]inf'.
>
> Now, one may wonder what precisely a "possibly signed floating point
> number" is, but most likely, this refers to
>
> floatnumber   ::= pointfloat | exponentfloat
> pointfloat    ::= [intpart] fraction | intpart "."
> exponentfloat ::= (intpart | pointfloat) exponent
> intpart       ::= digit+
> fraction      ::= "." digit+
> exponent      ::= ("e" | "E") ["+" | "-"] digit+
> digit         ::= "0"..."9"
I don't see why the language spec should limit the wealth of number
formats supported by float().
It is not uncommon for Asians and other non-Latin script users to
use their own native script symbols for numbers. Just because these
digits may look strange to someone doesn't mean that they are
meaningless or should be discarded.
Please also remember that Python3 now allows Unicode names for
identifiers for much the same reasons.
Note that the support in float() (and the other numeric constructors)
to work with Unicode code points was explicitly added when Unicode
support was added to Python and has been available since Python 1.6.
It is not a bug by any definition of "bug", even though the feature
may bug someone occasionally to go read up a bit on what else
the world has to offer other than Arabic numerals :-)
http://en.wikipedia.org/wiki/Numeral_system
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Nov 28 2010)
>>> Python/Zope Consulting and Support ...http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/
::: Try our new mxODBC.Connect Python Database Interface for free !
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/
Re: [Python-Dev] Python and the Unicode Character Database
Alexander Belopolsky wrote:
> Two recently reported issues brought into light the fact that Python
> language definition is closely tied to character properties maintained
> by the Unicode Consortium. [1,2] For example, when Python switches to
> Unicode 6.0.0 (planned for the upcoming 3.2 release), we will gain two
> additional characters that Python can use in identifiers. [3]
>
> With Python 3.1:
>
> >>> exec('\u0CF1 = 1')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<string>", line 1
>     ೱ = 1
>     ^
> SyntaxError: invalid character in identifier
>
> but with Python 3.2a4:
>
> >>> exec('\u0CF1 = 1')
> >>> eval('\u0CF1')
> 1
Such changes are not new, but I agree that they should probably
be highlighted in the "What's new in Python x.x".
> Of course, the likelihood is low that this change will affect any
> user, but the change in str.isspace() reported in [1] is likely to
> cause some trouble:
>
> Python 2.6.5:
> >>> u'A\u200bB'.split()
> [u'A', u'B']
>
> Python 2.7:
> >>> u'A\u200bB'.split()
> [u'A\u200bB']
That's a classical bug fix.
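The ZERO WIDTH SPACE case can be checked directly. This is a small sketch, assuming a current CPython 3 whose bundled UCD classifies U+200B as a format character (Cf); the names are illustrative.

```python
import unicodedata

ZWSP = "\u200b"  # ZERO WIDTH SPACE

# General category as recorded in the interpreter's bundled UCD.
category = unicodedata.category(ZWSP)

# str.split() with no arguments splits on characters str.isspace() accepts.
parts = ("A" + ZWSP + "B").split()

print(category, ZWSP.isspace(), parts)
```

Because the character is no longer treated as whitespace, the string stays in one piece, which matches the Python 2.7 output quoted above.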
> While we have little choice but to follow UCD in defining
> str.isidentifier(), I think Python can promise users more stability in
> what it treats as space or as a digit in its builtins.
Why should we divert from the work done by the Unicode Consortium ?
After all, most of their changes are in fact bug fixes as well.
> For example,
> I don't think that supporting
>
> >>> float('١٢٣٤.٥٦')
> 1234.56
>
> is more important than to assure users that once their program
> accepted some text as a number, they can assume that the text is
> ASCII.
Sorry, but I don't agree.
If ASCII numerals are an important aspect of an application, the
application should make sure that only those numerals are used
(e.g. by using a regular expression for checking).
In a Unicode world, not accepting non-Arabic numerals would be
a limitation, not a feature. Besides Python has had this support
since Python 1.6.
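The regular-expression check suggested above could look roughly like the following sketch. The helper name ascii_float and the exact pattern are assumptions made for illustration; the pattern follows the floatnumber grammar quoted earlier in the thread and deliberately does not cover 'inf' or 'nan'.

```python
import re

# ASCII-only grammar. [0-9] is spelled out because \d in a str pattern
# matches any Unicode decimal digit unless re.ASCII is given.
_ASCII_FLOAT = re.compile(
    r"[+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[eE][+-]?[0-9]+)?"
)


def ascii_float(text):
    """Hypothetical helper: convert text to float only if it is ASCII digits."""
    stripped = text.strip()
    if not _ASCII_FLOAT.fullmatch(stripped):
        raise ValueError("not an ASCII decimal number: %r" % text)
    return float(stripped)


print(ascii_float(" 1234.56 "))
```

An application that needs the ASCII guarantee calls the helper instead of float(); everything else keeps the permissive builtin.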
> [1] http://bugs.python.org/issue10567
> [2] http://bugs.python.org/issue10557
> [3] http://www.unicode.org/versions/Unicode6.0.0/#Database_Changes
--
Marc-Andre Lemburg
eGenix.com
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, Nov 28, 2010 at 5:42 PM, M.-A. Lemburg wrote:
..
> I don't see why the language spec should limit the wealth of number
> formats supported by float().
>
The Language Spec (whatever it is) should not, but hopefully the
Library Reference should. If you follow
http://docs.python.org/dev/py3k/library/functions.html#float link and
the references therein, you'll end up with
digit ::= "0"..."9"
http://docs.python.org/dev/py3k/reference/lexical_analysis.html#grammar-token-digit
Re: [Python-Dev] Python and the Unicode Character Database
On 28.11.2010 23:31, Alexander Belopolsky wrote:
> On Sun, Nov 28, 2010 at 5:17 PM, "Martin v. Löwis" wrote:
>>> float('١٢٣٤.٥٦')
1234.56
>>
>> I think it's a bug that this works. The definition of the float builtin says
>>
>> Convert a string or a number to floating point. If the argument is a
>> string, it must contain a possibly signed decimal or floating point
>> number, possibly embedded in whitespace. The argument may also be
>> '[+|-]nan' or '[+|-]inf'.
>>
>
> This definition fails long before we get beyond 127-th code point:
>
> >>> float('infinity')
> inf
What do you infer from that? That the definition is wrong, or the code is wrong?
Regards,
Martin
Re: [Python-Dev] Python and the Unicode Character Database
On 11/28/2010 3:58 PM, Alexander Belopolsky wrote:
On Sun, Nov 28, 2010 at 3:43 PM, Antoine Pitrou wrote:
..
For example,
I don't think that supporting
float('١٢٣٤.٥٦')
1234.56
Even if this is somehow an accident or something that someone snuck in,
I think it is a good idea that *users* be able to input amounts with their
native digits. That is different from requiring *programmers* to write
literals with euro-ascii-digits.
is more important than to assure users that once their program
accepted some text as a number, they can assume that the text is
ASCII.
Why would they assume the text is ASCII?
def deposit(self, amountstr):
    self.balance += float(amountstr)
    audit_log("Deposited: " + amountstr)
If the programmer wants to ensure ASCII, he can produce a string,
possibly formatted, from the amount:
depform = "Deposited: ${:14.2f}".format
def deposit(self, amountstr):
    amount = float(amountstr)
    self.balance += amount
    # audit_log("Deposited: " + str(amount))  # simple version
    audit_log(depform(amount))
Given that amountstr could be something like '182.33', I
think the programmer should plan to format it.
--
Terry Jan Reedy
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, Nov 28, 2010 at 5:56 PM, "Martin v. Löwis" wrote:
..
>> This definition fails long before we get beyond 127-th code point:
>>
> float('infinity')
>> inf
>
> What do you infer from that? That the definition is wrong, or the code is wrong?
The development version of the reference manual is more detailed, but
as far as I can tell, it still defines digit as 0-9.
http://docs.python.org/dev/py3k/library/functions.html#float
Re: [Python-Dev] Python and the Unicode Character Database
>> Now, one may wonder what precisely a "possibly signed floating point
>> number" is, but most likely, this refers to
>>
>> floatnumber ::= pointfloat | exponentfloat
>> pointfloat ::= [intpart] fraction | intpart "."
>> exponentfloat ::= (intpart | pointfloat) exponent
>> intpart ::= digit+
>> fraction ::= "." digit+
>> exponent ::= ("e" | "E") ["+" | "-"] digit+
>> digit ::= "0"..."9"
>
> I don't see why the language spec should limit the wealth of number
> formats supported by float().
If it doesn't, there should be some other specification of what
is correct and what is not. It must not be unspecified.
> It is not uncommon for Asians and other non-Latin script users to
> use their own native script symbols for numbers. Just because these
> digits may look strange to someone doesn't mean that they are
> meaningless or should be discarded.
Then these users should speak up and indicate their need, or somebody
should speak up and confirm that there are users who actually want
'١٢٣٤.٥٦' to denote 1234.56. To my knowledge, there is no writing
system in which '١٢٣٤.٥٦e4' means 12345600.0.
> Please also remember that Python3 now allows Unicode names for
> identifiers for much the same reasons.
No no no. Addition of Unicode identifiers has a well-designed,
deliberate specification, with a PEP and all. The support for
non-ASCII digits in float appears to be ad-hoc, and not founded
on actual needs of actual users.
> Note that the support in float() (and the other numeric constructors)
> to work with Unicode code points was explicitly added when Unicode
> support was added to Python and has been available since Python 1.6.
That doesn't necessarily make it useful. Alexander's complaint is that
it makes Python unstable (i.e. changing as the UCD changes).
> It is not a bug by any definition of "bug"
Most certainly it is: the documentation is either underspecified,
or deviates from the implementation (when taking the most plausible
interpretation). This is the very definition of "bug".
Regards,
Martin
Re: [Python-Dev] Python and the Unicode Character Database
On 11/28/2010 5:51 PM, Alexander Belopolsky wrote:
> The Language Spec (whatever it is) should not, but hopefully the
> Library Reference should. If you follow
> http://docs.python.org/dev/py3k/library/functions.html#float link and
> the references therein, you'll end up with
> digit ::= "0"..."9"
> http://docs.python.org/dev/py3k/reference/lexical_analysis.html#grammar-token-digit
So fix the doc for builtin float() and perhaps int().
--
Terry Jan Reedy
Re: [Python-Dev] Python and the Unicode Character Database
+1 on all points below.
On Sun, Nov 28, 2010 at 6:03 PM, "Martin v. Löwis" wrote:
>>> Now, one may wonder what precisely a "possibly signed floating point
>>> number" is, but most likely, this refers to
>>>
>>> floatnumber ::= pointfloat | exponentfloat
>>> pointfloat ::= [intpart] fraction | intpart "."
>>> exponentfloat ::= (intpart | pointfloat) exponent
>>> intpart ::= digit+
>>> fraction ::= "." digit+
>>> exponent ::= ("e" | "E") ["+" | "-"] digit+
>>> digit ::= "0"..."9"
>>
>> I don't see why the language spec should limit the wealth of number
>> formats supported by float().
>
> If it doesn't, there should be some other specification of what
> is correct and what is not. It must not be unspecified.
>
>> It is not uncommon for Asians and other non-Latin script users to
>> use their own native script symbols for numbers. Just because these
>> digits may look strange to someone doesn't mean that they are
>> meaningless or should be discarded.
>
> Then these users should speak up and indicate their need, or somebody
> should speak up and confirm that there are users who actually want
> '١٢٣٤.٥٦' to denote 1234.56. To my knowledge, there is no writing
> system in which '١٢٣٤.٥٦e4' means 12345600.0.
>
>> Please also remember that Python3 now allows Unicode names for
>> identifiers for much the same reasons.
>
> No no no. Addition of Unicode identifiers has a well-designed,
> deliberate specification, with a PEP and all. The support for
> non-ASCII digits in float appears to be ad-hoc, and not founded
> on actual needs of actual users.
>
>> Note that the support in float() (and the other numeric constructors)
>> to work with Unicode code points was explicitly added when Unicode
>> support was added to Python and has been available since Python 1.6.
>
> That doesn't necessarily make it useful. Alexander's complaint is that
> it makes Python unstable (i.e. changing as the UCD changes).
>
>> It is not a bug by any definition of "bug"
>
> Most certainly it is: the documentation is either underspecified,
> or deviates from the implementation (when taking the most plausible
> interpretation). This is the very definition of "bug".
>
> Regards,
> Martin
>
Re: [Python-Dev] Python and the Unicode Character Database
On 29.11.2010 00:01, Alexander Belopolsky wrote:
> On Sun, Nov 28, 2010 at 5:56 PM, "Martin v. Löwis" wrote:
> ..
>>> This definition fails long before we get beyond 127-th code point:
>>>
>> float('infinity')
>>> inf
>>
>> What do you infer from that? That the definition is wrong, or the code is wrong?
>
> The development version of the reference manual is more detailed, but
> as far as I can tell, it still defines digit as 0-9.
>
> http://docs.python.org/dev/py3k/library/functions.html#float
>
I wasn't asking about 0..9, but about "infinity". According to the
spec, it shouldn't accept that (and neither should it accept
'infinitY'). However, whether that's a spec bug or an implementation
bug - it seems like a minor issue to me (i.e. easily fixed).
Regards,
Martin
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, Nov 28, 2010 at 6:03 PM, "Martin v. Löwis" wrote:
..
>> Note that the support in float() (and the other numeric constructors)
>> to work with Unicode code points was explicitly added when Unicode
>> support was added to Python and has been available since Python 1.6.
>
> That doesn't necessarily make it useful. Alexander's complaint is that
> it makes Python unstable (i.e. changing as the UCD changes).
>
What makes it worse, is that while superficially, Unicode versions
follow the same X.Y.Z format as Python versions, the stability
promises are completely different. For example, it appears that the
general category for the ZERO WIDTH SPACE was changed in Unicode
4.0.1. I don't think a change affecting str.split(), int(), float()
and probably numerous other library functions would be acceptable in a
Python micro release.
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, Nov 28, 2010 at 6:08 PM, "Martin v. Löwis" wrote:
> On 29.11.2010 00:01, Alexander Belopolsky wrote:
>> On Sun, Nov 28, 2010 at 5:56 PM, "Martin v. Löwis"
>> wrote:
>> ..
>>>> This definition fails long before we get beyond 127-th code point:
>>>>
>>>> >>> float('infinity')
>>>> inf
>>>
>>> What do you infer from that? That the definition is wrong, or the code is wrong?
>>
>> The development version of the reference manual is more detailed, but
>> as far as I can tell, it still defines digit as 0-9.
>>
>> http://docs.python.org/dev/py3k/library/functions.html#float
>>
>
> I wasn't asking about 0..9, but about "infinity". According to the
> spec, it shouldn't accept that (and neither should it accept
> 'infinitY').
According to the link that I mentioned,
infinity ::= "Infinity" | "inf"
and "Case is not significant, so, for example, “inf”, “Inf”,
“INFINITY” and “iNfINity” are all acceptable spellings for positive
infinity."
I completely agree with your arguments, and the reference manual has
been improved a lot in recent years.
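The case-insensitive spellings quoted from the development documentation can be exercised with a short sketch, assuming a current CPython 3; the list contents are taken from the quotation plus one signed variant added for illustration.

```python
import math

# Spellings named in the quoted documentation, plus a signed variant.
spellings = ["inf", "Inf", "INFINITY", "iNfINity", "-infinity"]

values = [float(s) for s in spellings]

print(values)
```

All of them parse, so the older one-line definition ('[+|-]nan' or '[+|-]inf') was narrower than the implementation.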
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, Nov 28, 2010 at 6:19 PM, "Martin v. Löwis" wrote:
..
> You can see the Unicode Consortium's stability policy at
>
> http://unicode.org/policies/stability_policy.html
>
From the link above:
"""
As more experience is gathered in implementing the characters,
adjustments in the properties may become necessary. Examples of such
properties include, but are not limited to, the following:
General_Category
...
"""
> In a sense, this is stronger than Python's backwards compatibility
> promises (which allow for certain incompatible changes to occur
> over time, whereas Unicode makes promises about all future versions).
I would say it is *different* and should be taken into account when
tying language features to Unicode specifications. This was done in
PEP 3131. Note that one of the stated objections was "Unicode is young;
its problems are not yet well understood and solved;" (It is still
true.)
Re: [Python-Dev] Python and the Unicode Character Database
>>> float('١٢٣٤.٥٦')
1234.56
>
> Even if this is somehow an accident or something that someone snuck in,
> I think it a good idea that *users* be able to input amounts with their
> native digits. That is different from requiring *programmers* to write
> literals with euro-ascii-digits
So one question is what kind of data float() is aimed at. I claim that
it is about "programmer" data, not "user" data. If it supported "user"
data, it probably would have to support "1,000" to denote 1e3 in the
U.S., and denote 1e0 in Germany. Our users are generally confused
on whether they should use the full stop or the comma as the decimal
separator.
As not even the locale-dependent issues are considered in float(),
it is clear to me that entering local numbers cannot possibly be
the objective of the function.
Instead, following a wide-spread Python convention, it is meant to be
the reverse of repr().
Can you name a single person who actually wants to write '١٢٣٤.٥٦'
as a number? I'm fairly skeptical that users of Arabic-Indic digits do.
Instead,
http://en.wikipedia.org/wiki/Decimal_separator
suggests that they would rather use U+066B, i.e. '١٢٣٤٫٥٦', which isn't
supported by Python.
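Both points in this message can be checked with a brief sketch, assuming a current CPython 3: float() inverts repr() for a float, and a string using U+066B ARABIC DECIMAL SEPARATOR is rejected. The variable names are illustrative.

```python
x = 1234.56

# float() as the reverse of repr(): the round trip reproduces the value.
round_trip = float(repr(x))

# Same digits as in the thread, but with U+066B instead of the full stop.
with_arabic_separator = "\u0661\u0662\u0663\u0664\u066b\u0665\u0666"
try:
    float(with_arabic_separator)
    accepted = True
except ValueError:
    accepted = False

print(round_trip, accepted)
```

So the builtin accepts the script's digits but not the script's own decimal separator.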
Regards,
Martin
Re: [Python-Dev] Python and the Unicode Character Database
> What makes it worse, is that while superficially, Unicode versions
> follow the same X.Y.Z format as Python versions, the stability
> promises are completely different. For example, it appears that the
> general category for the ZERO WIDTH SPACE was changed in Unicode
> 4.0.1. I don't think a change affecting str.split(), int(), float()
> and probably numerous other library functions would be acceptable in a
> Python micro release.
Well, we managed to completely break Unicode normalization between
2.6.5 and 2.6.6, due to a bug.
You can see the Unicode Consortium's stability policy at
http://unicode.org/policies/stability_policy.html
In a sense, this is stronger than Python's backwards compatibility
promises (which allow for certain incompatible changes to occur
over time, whereas Unicode makes promises about all future versions).
Regards,
Martin
Re: [Python-Dev] Python and the Unicode Character Database
2010/11/28 M.-A. Lemburg :
>
>
> "Martin v. Löwis" wrote:
>>> float('١٢٣٤.٥٦')
1234.56
>>
>> I think it's a bug that this works. The definition of the float builtin says
>>
>> Convert a string or a number to floating point. If the argument is a
>> string, it must contain a possibly signed decimal or floating point
>> number, possibly embedded in whitespace. The argument may also be
>> '[+|-]nan' or '[+|-]inf'.
>>
>> Now, one may wonder what precisely a "possibly signed floating point
>> number" is, but most likely, this refers to
>>
>> floatnumber ::= pointfloat | exponentfloat
>> pointfloat ::= [intpart] fraction | intpart "."
>> exponentfloat ::= (intpart | pointfloat) exponent
>> intpart ::= digit+
>> fraction ::= "." digit+
>> exponent ::= ("e" | "E") ["+" | "-"] digit+
>> digit ::= "0"..."9"
>
> I don't see why the language spec should limit the wealth of number
> formats supported by float().
>
> It is not uncommon for Asians and other non-Latin script users to
> use their own native script symbols for numbers. Just because these
> digits may look strange to someone doesn't mean that they are
> meaningless or should be discarded.
That's different. Python doesn't assign any semantic meaning to the
characters in identifiers. The non-latin support for numerals, though,
could change the meaning of a program dramatically and needs to be
well-specified. Whether int() should do this is debatable. I, for one,
think this kind of support belongs in the locale module.
--
Regards,
Benjamin
[Python-Dev] PEP 384 final review
I have now completed
http://www.python.org/dev/peps/pep-0384/
Benjamin has volunteered to rule on this PEP. Please comment with
any changes you want to see, or speak in favor or against this PEP.
Regards,
Martin
Re: [Python-Dev] Python and the Unicode Character Database
On 28/11/2010 23:33, "Martin v. Löwis" wrote:
float('١٢٣٤.٥٦')
1234.56
Even if this is somehow an accident or something that someone snuck in,
I think it a good idea that *users* be able to input amounts with their
native digits. That is different from requiring *programmers* to write
literals with euro-ascii-digits
So one question is what kind of data float() is aimed at. I claim that
it is about "programmer" data, not "user" data. If it supported "user"
data, it probably would have to support "1,000" to denote 1e3 in the
U.S., and denote 1e0 in Germany. Our users are generally confused
on whether they should use the full stop or the comma as the decimal
separator.
FWIW the C# equivalent is locale aware *unless* you pass in a specific
culture.
(System.Double.Parse):
http://msdn.microsoft.com/en-us/library/fd84bdyt.aspx
If you're not aware that your code may be run on non-US computers this
is a trap for the unwary. If you *are* aware then it is very useful.
An alternative overload allows you to specify the culture used to do the
conversion:
http://msdn.microsoft.com/en-us/library/t9ebt447.aspx
Michael
As not even the locale-dependent issues are considered in float(),
it is clear to me that entering local numbers cannot possibly be
the objective of the function.
Instead, following a wide-spread Python convention, it is meant to be
the reverse of repr().
Can you name a single person who actually wants to write '١٢٣٤.٥٦'
as a number? I'm fairly skeptical that users of Arabic-Indic digits do.
Instead,
http://en.wikipedia.org/wiki/Decimal_separator
suggests that they would rather use U+066B, i.e. '١٢٣٤٫٥٦', which isn't
supported by Python.
Regards,
Martin
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, Nov 28, 2010 at 6:03 PM, "Martin v. Löwis" wrote:
..
> No no no. Addition of Unicode identifiers has a well-designed,
> deliberate specification, with a PEP and all. The support for
> non-ASCII digits in float appears to be ad-hoc, and not founded
> on actual needs of actual users.
>
I wonder how carefully right-to-left scripts were considered when PEP
3131 was discussed.
Try the following on the python prompt:
>>> ڦ= int('١٢٣')
>>> ڦ
123
In my OSX Terminal window, entering ڦ flips the >>> prompt and the
session looks like this:
('???')int = ? <<<
Re: [Python-Dev] Python and the Unicode Character Database
> FWIW the C# equivalent is locale aware *unless* you pass in a specific
> culture.
> (System.Double.Parse):
That's not quite the equivalent of float(), I would say: this one
apparently is locale-aware, so it is more the equivalent of locale.atof.
The next question then is if it supports indo-arabic digits in any
locale (or more specifically in an arabic locale).
Regards,
Martin
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, 28 Nov 2010 17:23:01 -0600
Benjamin Peterson wrote:
> 2010/11/28 M.-A. Lemburg :
> >
> >
> > "Martin v. Löwis" wrote:
> >>> float('١٢٣٤.٥٦')
> 1234.56
> >>
> >> I think it's a bug that this works. The definition of the float builtin
> >> says
> >>
> >> Convert a string or a number to floating point. If the argument is a
> >> string, it must contain a possibly signed decimal or floating point
> >> number, possibly embedded in whitespace. The argument may also be
> >> '[+|-]nan' or '[+|-]inf'.
> >>
> >> Now, one may wonder what precisely a "possibly signed floating point
> >> number" is, but most likely, this refers to
> >>
> >> floatnumber ::= pointfloat | exponentfloat
> >> pointfloat ::= [intpart] fraction | intpart "."
> >> exponentfloat ::= (intpart | pointfloat) exponent
> >> intpart ::= digit+
> >> fraction ::= "." digit+
> >> exponent ::= ("e" | "E") ["+" | "-"] digit+
> >> digit ::= "0"..."9"
> >
> > I don't see why the language spec should limit the wealth of number
> > formats supported by float().
> >
> > It is not uncommon for Asians and other non-Latin script users to
> > use their own native script symbols for numbers. Just because these
> > digits may look strange to someone doesn't mean that they are
> > meaningless or should be discarded.
>
> That's different. Python doesn't assign any semantic meaning to the
> characters in identifiers. The non-latin support for numerals, though,
> could change the meaning of a program dramatically and needs to be
> well-specified. Whether int() should do this is debatable.
Perhaps int(), float(), Decimal() and friends could take an optional
parameter indicating whether non-ascii digits are considered. It would
then satisfy all parties.
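One way to picture the optional parameter proposed here is a thin wrapper. This is only a sketch: the function name parse_float and the keyword allow_non_ascii_digits are invented for illustration and are not part of any Python API, and str.isascii() assumes Python 3.7 or later.

```python
def parse_float(text, allow_non_ascii_digits=False):
    """Hypothetical wrapper around float() with an opt-in for non-ASCII input."""
    # Reject anything outside ASCII unless the caller explicitly opts in.
    if not allow_non_ascii_digits and not text.isascii():
        raise ValueError("non-ASCII input rejected: %r" % text)
    return float(text)


arabic_indic = "\u0661\u0662\u0663\u0664.\u0665\u0666"
print(parse_float("1234.56"), parse_float(arabic_indic, allow_non_ascii_digits=True))
```

Defaulting the flag to False would change existing behaviour, which is the deprecation concern raised later in the thread; defaulting it to True would keep compatibility while letting strict callers opt out.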
Antoine.
Re: [Python-Dev] Python and the Unicode Character Database
On 29.11.2010 00:56, Alexander Belopolsky wrote:
> On Sun, Nov 28, 2010 at 6:03 PM, "Martin v. Löwis" wrote:
> ..
>> No no no. Addition of Unicode identifiers has a well-designed,
>> deliberate specification, with a PEP and all. The support for
>> non-ASCII digits in float appears to be ad-hoc, and not founded
>> on actual needs of actual users.
>>
>
> I wonder how carefully right-to-left scripts were considered when PEP
> 3131 was discussed.
IIRC, some Hebrew users have spoken in favor of the PEP, despite the
obvious difficulties it would create. I may misremember, but I think
someone pointed out that they had these difficulties all the time, and
that it wasn't really a burden.
Unicode specifies that one should always use "logical order" in memory,
and that's what the PEP does. Rendering is then a tool issue.
Regards,
Martin
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, Nov 28, 2010 at 6:59 PM, "Martin v. Löwis" wrote:
..
> The next question then is if it supports indo-arabic digits in any
> locale (or more specifically in an arabic locale).
And once you answered that question, does it support Devanagari or
Bengali digits? And if so, an arbitrary mix of those and indo-arabic
digits?
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, Nov 28, 2010 at 7:01 PM, Antoine Pitrou wrote:
..
>> That's different. Python doesn't assign any semantic meaning to the
>> characters in identifiers. The non-latin support for numerals, though,
>> could change the meaning of a program dramatically and needs to be
>> well-specified. Whether int() should do this is debatable.
>
> Perhaps int(), float(), Decimal() and friends could take an optional
> parameter indicating whether non-ascii digits are considered. It would
> then satisfy all parties.
What parties? I don't think anyone has claimed to actually have used
non-ASCII digits with float(). Of course it is fun that Python can
process Bengali numerals, but so would be allowing Roman numerals.
There is a reason why after careful consideration, PEP 313 was
ultimately rejected.
BTW, it is common in Russia to specify months using roman numerals.
Maybe we should consider allowing datetime.date() to accept '1.IV.2011'.
Re: [Python-Dev] Python and the Unicode Character Database
On 28/11/2010 23:59, "Martin v. Löwis" wrote:
>> FWIW the C# equivalent is locale aware *unless* you pass in a specific
>> culture.
>> (System.Double.Parse):
> That's not quite the equivalent of float(), I would say: this one
> apparently is locale-aware, so it is more the equivalent of locale.atof.
Right. It is *the* standard way of getting a float from a string though,
whereas in Python we have two depending on whether or not you want to be
locale aware. The standard way in C# is locale aware. To be non-locale
aware you pass in a specific culture or number format.
> The next question then is if it supports indo-arabic digits in any
> locale (or more specifically in an arabic locale).
I don't think so actually. The float parse formatting rules are defined
like this:
[ws][$][sign][integral-digits[,]]integral-digits[.[fractional-digits]][E[sign]exponential-digits][ws]
(From http://msdn.microsoft.com/en-us/library/7yd1h1be.aspx )
integral-digits, fractional-digits and exponential-digits are all
defined as "A series of digits ranging from 0 to 9".
Arguably this is not conclusive. In fact the NumberFormatInfo class
seems to hint that it may be otherwise:
http://msdn.microsoft.com/en-us/library/system.globalization.numberformatinfo.aspx
See DigitSubstitution on that page. I would have to try it to be sure and
I don't have a Windows VM in convenient reach right now.
All the best,
Michael
> Regards,
> Martin
Re: [Python-Dev] Python and the Unicode Character Database
On 29/11/2010 00:04, Alexander Belopolsky wrote:
> On Sun, Nov 28, 2010 at 6:59 PM, "Martin v. Löwis" wrote:
> ..
>> The next question then is if it supports indo-arabic digits in any
>> locale (or more specifically in an arabic locale).
> And once you answered that question, does it support Devanagari or
> Bengali digits? And if so, an arbitrary mix of those and indo-arabic
> digits?
Haha. Go and try it yourself. :-)
Michael
Re: [Python-Dev] Python and the Unicode Character Database
> > Perhaps int(), float(), Decimal() and friends could take an optional
> > parameter indicating whether non-ascii digits are considered. It would
> > then satisfy all parties.
>
> What parties? I don't think anyone has claimed to actually have used
> non-ASCII digits with float().
Have you done a poll of all Python 3 users?
> Of course it is fun that Python can
> process Bengali numerals, but so would be allowing Roman numerals.
> There is a reason why after careful consideration, PEP 313 was
> ultimately rejected.
That's mostly irrelevant. This feature exists and someone, somewhere, may be
using it. We normally don't remove stuff without deprecation.
Antoine.
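The optional-parameter idea quoted above can be sketched in pure Python. This is only an illustration under stated assumptions: `parse_number` and its `allow_non_ascii_digits` flag are hypothetical names, not an existing or proposed stdlib API, and `str.isascii()` needs Python 3.7 or later. The strict mode here also rejects non-ASCII whitespace, which is a stricter rule than the thread discusses.

```python
def parse_number(text, kind=float, allow_non_ascii_digits=False):
    """Hypothetical helper: callers opt in to non-ASCII digits explicitly."""
    if not allow_non_ascii_digits and not text.isascii():
        # Strict mode: refuse anything outside ASCII before the constructor sees it.
        raise ValueError(f"non-ASCII characters in numeric string: {text!r}")
    # Lenient mode simply defers to the builtin constructor's own rules.
    return kind(text)


arabic_indic = "\u0661\u0662\u0663\u0664.\u0665\u0666"  # Arabic-Indic 1234.56

strict_ok = parse_number("1234.56")
lenient_ok = parse_number(arabic_indic, allow_non_ascii_digits=True)

try:
    parse_number(arabic_indic)
    strict_rejects = False
except ValueError:
    strict_rejects = True
```

A wrapper like this leaves the builtin behaviour untouched, so it sidesteps the deprecation question rather than answering it.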
Re: [Python-Dev] constant/enum type in stdlib
On Mon, Nov 29, 2010 at 2:28 AM, Michael Foord wrote:
> For wrapping mutable types I'm tempted to say YAGNI. For the standard
> library wrapping integers meets almost all our use-cases except for one
> float. (At work we have a decimal constant as it happens.) Perhaps we could
> require immutable types for groups but allow arbitrary values for individual
> named values?
Whereas my opinion is that "immutable vs mutable" is such a blurry
distinction that we shouldn't try to make it at the lowest level. Would it be
possible to name frozenset instances? Tuples? How about objects that are
conceptually immutable, but don't close all the loopholes allowing you to
mutate them? (e.g. Decimal, Fraction)
Better to design a named value API that doesn't care about mutability, and
then leave questions of reverse mappings from values back to names to the
grouping API level. At that level, it would be trivial (and natural) to limit
names to referencing Hashable values so that a reverse lookup table would be
easy to implement.
For standard library purposes, we could even reasonably provide an int-only
grouping API, since the main use case is almost certainly to be in managing
translation of OS-level integer constants to named values.
Cheers,
Nick.
--
Nick Coghlan | [email protected] | Brisbane, Australia
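The split described above, a named value that ignores mutability plus a grouping level that owns the reverse lookup, might look roughly like this for the int-only case. `NamedInt` and `IntGroup` are hypothetical names invented for this sketch; nothing here is the API that was eventually adopted.

```python
class NamedInt(int):
    """Hypothetical int subclass that carries a name for display purposes."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self._name = name
        return self

    def __repr__(self):
        return f"{self._name}={int(self)}"


class IntGroup:
    """Hypothetical int-only group with a value -> name reverse table."""

    def __init__(self, **names):
        self._by_name = {n: NamedInt(n, v) for n, v in names.items()}
        # ints are hashable, so the reverse mapping is a plain dict.
        self._by_value = {int(c): c for c in self._by_name.values()}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        by_name = self.__dict__.get("_by_name", {})
        try:
            return by_name[name]
        except KeyError:
            raise AttributeError(name) from None

    def lookup(self, value):
        """Translate a raw integer back to its named constant."""
        return self._by_value[value]


seek = IntGroup(SEEK_SET=0, SEEK_CUR=1, SEEK_END=2)
shown = repr(seek.SEEK_CUR)   # 'SEEK_CUR=1'
found = seek.lookup(2)        # the SEEK_END constant
```

Because `NamedInt` is still an `int`, existing code that does arithmetic or comparisons on the constants keeps working; only the repr changes.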
Re: [Python-Dev] Python and the Unicode Character Database
Alexander Belopolsky writes:
> On Sun, Nov 28, 2010 at 7:01 PM, Antoine Pitrou wrote:
> > Perhaps int(), float(), Decimal() and friends could take an optional
> > parameter indicating whether non-ascii digits are considered. It
> > would then satisfy all parties.
>
> What parties? I don't think anyone has claimed to actually have used
> non-ASCII digits with float().
Rather, it has been pointed out that there is an unknown amount of existing
code which does that. You're not going to know how much or how little from
this forum.
> Of course it is fun that Python can process Bengali numerals, but so
> would be allowing Roman numerals. There is a reason why after careful
> consideration, PEP 313 was ultimately rejected.
Rejecting a proposed *new* capability is a different matter from disabling an
*existing* capability which works in existing Python releases.
--
 \     “Following fashion and the status quo is easy. Thinking about |
  `\    your users' lives and creating something practical is much |
_o__)   harder.” —Ryan Singer, 2008-07-09 |
Ben Finney
Re: [Python-Dev] constant/enum type in stdlib
On 29/11/2010 00:48, Nick Coghlan wrote:
> On Mon, Nov 29, 2010 at 2:28 AM, Michael Foord wrote:
>> For wrapping mutable types I'm tempted to say YAGNI. For the standard
>> library wrapping integers meets almost all our use-cases except for one
>> float. (At work we have a decimal constant as it happens.) Perhaps we
>> could require immutable types for groups but allow arbitrary values for
>> individual named values?
> Whereas my opinion is that "immutable vs mutable" is such a blurry
> distinction that we shouldn't try to make it at the lowest level. Would it
> be possible to name frozenset instances? Tuples? How about objects that are
> conceptually immutable, but don't close all the loopholes allowing you to
> mutate them? (e.g. Decimal, Fraction)
> Better to design a named value API that doesn't care about mutability, and
> then leave questions of reverse mappings from values back to names to the
> grouping API level. At that level, it would be trivial (and natural) to
> limit names to referencing Hashable values so that a reverse lookup table
> would be easy to implement.
> For standard library purposes, we could even reasonably provide an int-only
> grouping API, since the main use case is almost certainly to be in managing
> translation of OS-level integer constants to named values.
Sounds reasonable to me.
Michael
> Cheers,
> Nick.
Re: [Python-Dev] PEP 384 final review
On 11/28/2010 6:40 PM, "Martin v. Löwis" wrote:
> I have now completed
> http://www.python.org/dev/peps/pep-0384/
The current text contains several error messages like:
"System Message: WARNING/2 (pep-0384.txt, line 194) Bullet list ends without a blank line; unexpected unindent."
Terry Jan Reedy
Re: [Python-Dev] Python and the Unicode Character Database
Martin v. Löwis wrote:
> float('١٢٣٤.٥٦')
> 1234.56
> I think it's a bug that this works. The definition of the float builtin says
> [...]
I think that's a documentation bug rather than a coding bug. If Python
wishes to limit the digits allowed in numeric *literals* to ASCII 0...9,
that's one thing, but I think that the digits allowed in numeric
*strings* should allow the full range of digits supported by the Unicode
standard.
The former ensures that literals in code are always readable; the latter
allows users to enter numbers in their own number system. How could that
be a bad thing?
--
Steven
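The literal-versus-string distinction drawn above can be checked directly in current Python 3. This is a small sketch, with the digits written as escapes so the snippet itself stays ASCII; the variable names are illustrative only.

```python
# Arabic-Indic digits for 1234.56, with an ASCII decimal point.
text = "\u0661\u0662\u0663\u0664.\u0665\u0666"

# Numeric *strings*: the float() constructor maps Unicode decimal digits.
value = float(text)

# Numeric *literals*: the same digits as source code do not compile.
try:
    compile("\u0661\u0662\u0663", "<example>", "eval")
    literal_ok = True
except SyntaxError:
    literal_ok = False
```

So the language grammar and the constructors already follow different rules; the disagreement in the thread is about whether the constructor behaviour is intended.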
Re: [Python-Dev] constant/enum type in stdlib
On 28/11/2010 21:23, Greg Ewing wrote:
> Rob Cliffe wrote:
>> But couldn't they be presented to the Python programmer as a single
>> type, with the implementation details hidden "under the hood"?
> Not in CPython, because tuple items are kept in the same block of memory
> as the object header. Because CPython can't move objects, this means that
> the size of the tuple must be known when the object is created.
But when a frozen list a.k.a. tuple would be created - either directly, or by
setting a list's mutable flag to False which would really turn it into a
tuple - the size *would* be known. And since the object would now be
immutable, there would be no requirement for its size to change. (My idea
doesn't require additional functionality, just a different API.)
Rob Cliffe
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, Nov 28, 2010 at 7:55 PM, Ben Finney wrote:
..
>> Of course it is fun that Python can process Bengali numerals, but so
>> would be allowing Roman numerals. There is a reason why after careful
>> consideration, PEP 313 was ultimately rejected.
>
> Rejecting a proposed *new* capability is a different matter from
> disabling an *existing* capability which works in existing Python
> releases.
Was this capability ever documented? It does not feel like a
deliberate feature. If it was, '\N{ARABIC DECIMAL SEPARATOR}' would
be accepted in arabic-indic notation. It feels more like a CPython
implementation detail similar to say:
>>> int('10') is 10
True
>>> int('1000') is 1000
False
Note that the underlying PyUnicode_EncodeDecimal() function is
described in the unicodeobject.h header file as follows:
/* --- Decimal Encoder */
/* Takes a Unicode string holding a decimal value and writes it into
an output buffer using standard ASCII digit codes.
..
The encoder converts whitespace to ' ', decimal characters to their
corresponding ASCII digit and all other Latin-1 characters except
\0 as-is. Characters outside this range (Unicode ordinals 1-256)
are treated as errors. This includes embedded NULL bytes.
*/
So the support for non-ASCII digits is accidental and should be
treated as a bug.
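The asymmetry pointed out above (digits are mapped, but the script's own decimal separator is not) is easy to reproduce in current Python 3. A minimal sketch, with illustrative variable names:

```python
# Arabic-Indic "12.5" written two ways.
with_ascii_point = "\u0661\u0662.\u0665"        # ASCII '.' as separator
with_arabic_point = "\u0661\u0662\u066b\u0665"  # U+066B ARABIC DECIMAL SEPARATOR

accepted = float(with_ascii_point)

try:
    float(with_arabic_point)
    separator_accepted = True
except ValueError:
    # Only decimal digits and whitespace are translated; U+066B is neither.
    separator_accepted = False
```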
Re: [Python-Dev] Python and the Unicode Character Database
Steven D'Aprano writes:
> If Python wishes to limit the digits allowed in numeric *literals* to
> ASCII 0...9, that's one thing, but I think that the digits allowed in
> numeric *strings* should allow the full range of digits supported by
> the Unicode standard.
I assume you specifically mean that the numeric class constructors, like
‘int’ and ‘float’, should parse their input string such that any character
Unicode defines as a numeric digit is mapped to the corresponding digit.
That sounds attractive, but it raises questions about mixed notations, mixing
digits from different writing systems, and probably other questions I haven't
thought of. It's not something to make a simple yes-or-no-decision on now,
IMO.
This sounds best suited to a PEP, which someone who cares enough can champion
in ‘python-ideas’.
--
 \     “The manager has personally passed all the water served here.” |
  `\    —hotel, Acapulco |
_o__)   |
Ben Finney
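The mixed-notation question is not hypothetical: current Python 3 maps each decimal digit independently, so digits from different scripts can be combined in one string. A short sketch, using escapes for the non-ASCII digits:

```python
import unicodedata

# ASCII 1, Arabic-Indic 2, Devanagari 3 in a single string.
mixed = "1\u0662\u0969"

result = int(mixed)

# Official Unicode names, to make the three scripts visible.
names = [unicodedata.name(ch) for ch in mixed]
```

Whether such a string should be accepted is exactly the kind of policy decision a PEP would have to settle.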
Re: [Python-Dev] Python and the Unicode Character Database
Alexander Belopolsky wrote:
> Two recently reported issues brought into light the fact that Python
> language definition is closely tied to character properties maintained
> by the Unicode Consortium. [1,2] For example, when Python switches to
> Unicode 6.0.0 (planned for the upcoming 3.2 release), we will gain two
> additional characters that Python can use in identifiers. [3]
[...]
Why do you consider this a problem? It would be a problem if previously
valid identifiers *stopped* being valid, but not the other way around.
> Of course, the likelihood is low that this change will affect any
> user, but the change in str.isspace() reported in [1] is likely to
> cause some trouble:
Looking at the thread here:
http://bugs.python.org/issue10567
I interpret it as indicating that Python's isspace() has been buggy for
many years, and is only now being fixed. It's always unfortunate when
people rely on bugs, but I'm not sure we should be promising to support
bug-for-bug compatibility from one version to the next :)
> While we have little choice but to follow UCD in defining
> str.isidentifier(), I think Python can promise users more stability in
> what it treats as space or as a digit in its builtins. For example,
> I don't think that supporting
> float('١٢٣٤.٥٦')
> 1234.56
> is more important than to assure users that once their program
> accepted some text as a number, they can assume that the text is
> ASCII.
Seems like a pretty foolish assumption, if you ask me, pretty much akin
to assuming that if string.isalpha() returns true that string is ASCII.
Support for non-Arabic numerals in number strings goes back to at least
Python 2.4:
[st...@sylar ~]$ python2.4
Python 2.4.6 (#1, Mar 30 2009, 10:08:01)
[GCC 4.1.2 20070925 (Red Hat 4.1.2-27)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> float(u'١٢٣٤.٥٦')
1234.55999
The fact that this is (apparently) only being raised now means that it
isn't actually a problem in real life. I'd even say that it's a feature,
and that if Python didn't support non-Arabic numerals, it should.
--
Steven
Re: [Python-Dev] Python and the Unicode Character Database
On Sun, Nov 28, 2010 at 6:43 PM, Steven D'Aprano wrote:
..
>> is more important than to assure users that once their program
>> accepted some text as a number, they can assume that the text is
>> ASCII.
>
> Seems like a pretty foolish assumption, if you ask me, pretty much akin to
> assuming that if string.isalpha() returns true that string is ASCII.
>
It is not to 99.9% of Python users whose code is written for 2.x.
Their strings are byte strings and string.isdigit() does imply ASCII
even if string.isalpha() does not in many locales.
..
> The fact that this is (apparently) only being raised now means that it isn't
> actually a problem in real life. I'd even say that it's a feature, and that
> if Python didn't support non-Arabic numerals, it should.
>
I raised this problem because I found a bug that is related to this
feature. The bug is also a regression from 2.x.
In 2.7:
>>> float(u'1234\xa1')
..
ValueError: invalid literal for float(): 1234?
The last character is lost, but the error message is still meaningful.
In 3.x, however:
>>> float('1234\xa1')
..
ValueError
See http://bugs.python.org/issue10557
While investigating this issue I found that by the time the string
gets to the number parser (_Py_dg_strtod), all non-ascii characters
are dropped by PyUnicode_EncodeDecimal() so it cannot produce
meaningful diagnostic.
Of course, PyUnicode_EncodeDecimal(), can be fixed by making it pass
non-ascii chars through as UTF-8 bytes, but I was wondering if
preserving the ability to parse exotic numerals was worth the effort.
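The pass-through fix mentioned above can be illustrated at the Python level. This is a sketch of the idea only, not the C code of PyUnicode_EncodeDecimal(): `to_ascii_decimal` and `parse_float` are hypothetical names, and the transform keeps unrecognised characters instead of dropping them so that the error message can show the original input.

```python
import unicodedata


def to_ascii_decimal(text):
    """Map whitespace to ' ' and decimal digits to ASCII; keep everything else."""
    out = []
    for ch in text:
        if ch.isspace():
            out.append(" ")
            continue
        digit = unicodedata.decimal(ch, None)  # None if ch is not a decimal digit
        out.append(str(digit) if digit is not None else ch)
    return "".join(out)


def parse_float(text):
    """Hypothetical wrapper that reports the original text on failure."""
    try:
        return float(to_ascii_decimal(text))
    except ValueError:
        raise ValueError(f"invalid literal for float(): {text!r}") from None


normalized = to_ascii_decimal("\u0661\u0662\u0663\u0664.\u0665\u0666")
untouched = to_ascii_decimal("1234\xa1")

try:
    parse_float("1234\xa1")
    message = ""
except ValueError as exc:
    message = str(exc)
```

Keeping the offending character in the normalized string is what lets the parser, or a wrapper around it, produce a diagnostic that names it.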
Re: [Python-Dev] constant/enum type in stdlib
On 11/27/2010 04:51 AM, Nick Coghlan wrote:
x = named_value("FOO", 1)
y = named_value("BAR", "Hello World!")
z = named_value("BAZ", dict(a=1, b=2, c=3))
print(x, y, z, sep="\n")
print("\n".join(map(repr, (x, y, z))))
print("\n".join(map(str, map(type, (x, y, z)))))
set_named_values(globals(), foo=x._raw(), bar=y._raw(), baz=z._raw())
print("\n".join(map(repr, (foo, bar, baz))))
print(type(x) is type(foo), type(y) is type(bar), type(z) is type(baz))
==
# Session output for the last 6 lines
>>> print(x, y, z, sep="\n")
1
Hello World!
{'a': 1, 'c': 3, 'b': 2}
>>> print("\n".join(map(repr, (x, y, z))))
FOO=1
BAR='Hello World!'
BAZ={'a': 1, 'c': 3, 'b': 2}
This reminds me of python annotations. Which seem like an already
forgotten new feature. Maybe they can help with this?
It does associate additional info to names and creates a nice dictionary to
reference.
>>> def foo(FOO: 1,
...         BAR: "Hello World!",
...         BAZ: dict(a=1, b=2, c=3)):
...     return FOO, BAR, BAZ
...
>>> foo(1,2,3)
(1, 2, 3)
>>> foo.__annotations__
{'BAR': 'Hello World!', 'FOO': 1, 'BAZ': {'a': 1, 'c': 3, 'b': 2}}
Cheers,
Ron
Re: [Python-Dev] Python and the Unicode Character Database
M.-A. Lemburg writes:
> It is not uncommon for Asians and other non-Latin script users to
> use their own native script symbols for numbers.
Japanese don't, in computational or scientific work where float() would be
used. Japanese numerals are used for dates and for certain felicitous ages
(and even there so-called "Arabic" numerals are perfectly acceptable).
Otherwise, it's all ASCII (although it might be "full-width" compatibility
variants).
> Please also remember that Python3 now allows Unicode names for
> identifiers for much the same reasons.
I don't think it's the same reason, not for Japanese, anyway.
I agree that Python should make it easy for the programmer to get numerical
values of native numeric strings, but it's not at all clear to me that there
is any point to having float() recognize them by default.
Re: [Python-Dev] Python and the Unicode Character Database
On Mon, Nov 29, 2010 at 1:39 PM, Stephen J. Turnbull wrote:
> I agree that Python should make it easy for the programmer to get
> numerical values of native numeric strings, but it's not at all clear
> to me that there is any point to having float() recognize them by
> default.
Indeed, as someone else suggested earlier in the thread, supporting non-ASCII
digits sounds more like a job for the locale module than for the builtin
types.
Deprecating non-ASCII support in the latter, while ensuring it is properly
supported in the former sounds like a better way forward than maintaining the
status quo (starting in 3.3 though, with the first beta just around the
corner, we don't want to be monkeying with this in 3.2)
Cheers,
Nick.
--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] Python and the Unicode Character Database
> Perhaps int(), float(), Decimal() and friends could take an optional
> parameter indicating whether non-ascii digits are considered. It would
> then satisfy all parties.
Not really. I still would want to see what the actual requirement is: i.e. do
any users actually have the desire to have these digits accepted, yet the
alternative decimal points rejected?
Regards,
Martin
Re: [Python-Dev] Python and the Unicode Character Database
> The former ensures that literals in code are always readable; the latter
> allows users to enter numbers in their own number system. How could that
> be a bad thing?
It's YAGNI, feature bloat. It gives the illusion of supporting something that
actually isn't supported very well (namely, parsing local number strings). I
claim that there is no meaningful application of this feature.
Regards,
Martin
Re: [Python-Dev] Python and the Unicode Character Database
> That's mostly irrelevant. This feature exists and someone, somewhere,
> may be using it. We normally don't remove stuff without deprecation.
Sure: it should be deprecated before being removed.
Regards,
Martin
Re: [Python-Dev] PEP 384 final review
2010/11/29 "Martin v. Löwis"
> I have now completed
>
> http://www.python.org/dev/peps/pep-0384/
Was structseq.h considered? IMO it could be made PEP384-compliant with two
additions that would replace two non-compliant functions:
- A new function to create types, since PyStructSequence_InitType is supposed
  to work on an uninitialized static variable:
  PyTypeObject *PyStructSequence_NewType(PyStructSequence_Desc *desc);
- PyStructSequence_SetItem(), similar to the macro PyStructSequence_SET_ITEM;
  the PyStructSequence structure should be hidden.
--
Amaury Forgeot d'Arc
