Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration

2013-11-06 Thread Steven D'Aprano
On Tue, Nov 05, 2013 at 08:38:09PM -0800, Ethan Furman wrote:

> http://bugs.python.org/issue19332

Duplicate of this: http://bugs.python.org/issue6017

The conclusion on that also was that it is not worth guarding against 
such an unusual circumstance.


-- 
Steven
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration

2013-11-06 Thread R. David Murray
On Wed, 06 Nov 2013 23:34:22 +1100, Steven D'Aprano  wrote:
> On Tue, Nov 05, 2013 at 08:38:09PM -0800, Ethan Furman wrote:
> 
> > http://bugs.python.org/issue19332
> 
> Duplicate of this: http://bugs.python.org/issue6017
> 
> The conclusion on that also was that it is not worth guarding against 
> such an unusual circumstance.

If I remember correctly (and I may not) the size-change guards were
added because without them there were certain cases that could
either segfault or result in an infinite loop.

--David
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration

2013-11-06 Thread Victor Stinner
2013/11/6 R. David Murray :
> On Wed, 06 Nov 2013 23:34:22 +1100, Steven D'Aprano  
> wrote:
>> On Tue, Nov 05, 2013 at 08:38:09PM -0800, Ethan Furman wrote:
>>
>> > http://bugs.python.org/issue19332
>>
>> Duplicate of this: http://bugs.python.org/issue6017
>>
>> The conclusion on that also was that it is not worth guarding against
>> such an unusual circumstance.
>
> If I remember correctly (and I may not) the size-change guards were
> added because without them there were certain cases that could
> either segfault or result in an infinite loop.
>
> --David

This exception is quite old:
---
changeset:   17597:32e7d0898eab
branch:  legacy-trunk
user:Guido van Rossum 
date:Fri Apr 20 19:13:02 2001 +
files:   Include/Python.h Include/abstract.h Include/object.h
Include/opcode.h Include/pyerrors.h Lib/dis.py Makefile.pre.in
Objects/abstra
description:
Iterators phase 1.  This comprises:

new slot tp_iter in type object, plus new flag Py_TPFLAGS_HAVE_ITER
new C API PyObject_GetIter(), calls tp_iter
new builtin iter(), with two forms: iter(obj), and iter(function, sentinel)
new internal object types iterobject and calliterobject
new exception StopIteration
new opcodes for "for" loops, GET_ITER and FOR_ITER (also supported by dis.py)
new magic number for .pyc files
new special method for instances: __iter__() returns an iterator
iteration over dictionaries: "for x in dict" iterates over the keys
iteration over files: "for x in file" iterates over lines

TODO:

documentation
test suite
decide whether to use a different way to spell iter(function, sentinal)
decide whether "for key in dict" is a good idea
use iterators in map/filter/reduce, min/max, and elsewhere (in/not in?)
speed tuning (make next() a slot tp_next???)
---

More recently, I added another exception if a dictionary is modified
during a lookup.

When I proposed a new frozendict type to secure my pysandbox project,
Armin Rigo wrote that CPython segfaults must be fixed. So I fixed a
corner case on dictionaries.
http://bugs.python.org/issue14205

Thread in python-dev:
https://mail.python.org/pipermail/python-dev/2012-March/117290.html

The discussion in http://bugs.python.org/issue14205 is interesting.

Victor
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration

2013-11-06 Thread R. David Murray
On Wed, 06 Nov 2013 16:10:16 +0100, Victor Stinner  
wrote:
> More recently, I added another exception if a dictionary is modified
> during a lookup.
> 
> When I proposed a new frozendict type to secure my pysandbox project,
> Armin Rigo wrote that CPython segfaults must be fixed. So I fixed a
> corner case on dictionaries.
> http://bugs.python.org/issue14205

Ah, that may be what I was (mis)remembering.

--David
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration

2013-11-06 Thread Antoine Pitrou

Le 06/11/2013 06:41, Nick Coghlan a écrit :


The behaviour of mutating builtin containers while iterating over them
is formally undefined beyond "it won't segfault" (one of the few such
undefined behaviours in Python). The associated exceptions are thus
strictly "best effort given other constraints".


Not sure what you mean with "formally undefined". For example, you can 
perfectly well change a list's contents while iterating over it, and I 
bet there's a lot of code that relies on that, as in::


for i, value in enumerate(mylist):
if some_condition(value):
mylist[i] = some_function(value)

If you change "builtin containers" to "builtin unordered containers", 
then you probably are closer to the truth :)



 >or perhaps some performance measurements that would show there is no
performance based reason to not have the patch added.  (I can try to do
the performance part myself if necessary, I'm just not sure what all the
steps are yet.)

If the benchmark suite indicates there's no measurable speed penalty
then such a patch may be worth reconsidering. I'd be astonished if that
was actually the case, though - the lowest impact approach I can think
of is to check for live iterators when setting a dict entry, and that
still has non-trivial size and speed implications.


I think Serhiy's patch is much simpler than that (is it Serhiy's? I 
think so).


Regards

Antoine.


___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration

2013-11-06 Thread Serhiy Storchaka

06.11.13 07:41, Nick Coghlan написав(ла):

If the benchmark suite indicates there's no measurable speed penalty
then such a patch may be worth reconsidering.


I don't see any significant speed difference even in artificial 
presumably worst case (a lot of items assignment in tight loop).

If you have tests which demonstrate a difference, please show them.


I'd be astonished if that
was actually the case, though - the lowest impact approach I can think
of is to check for live iterators when setting a dict entry, and that
still has non-trivial size and speed implications.


Actually we should guard not against changing dict during iteration, but 
against iterating modifying dict (and only such modifications which 
change dict's keys). For this we need only keys modification counter in 
a dict and it's copy in an iterator (this doesn't increase memory 
requirements however). I suppose Java use same technique in HashMap.


___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration

2013-11-06 Thread Eric Snow
On Wed, Nov 6, 2013 at 11:02 AM, Serhiy Storchaka  wrote:
> Actually we should guard not against changing dict during iteration, but
> against iterating modifying dict (and only such modifications which change
> dict's keys).

s/iterating modifying/iteration modifying/ ?

Just to clarify, do you mean we should only guard against
modifications to the dict's keys during its iterator's __next__()?
Changes to the values during __next__() would be okay, as would
changes to the keys if they happen outside __next__()?

Presumably the above restriction also applies to the iterators of the
dict's views.



OrderedDict would also need to be changed, meaning this counter would
have to be accessible to and incrementable by subclasses.  OrderedDict
makes use of the MappingView classes in collections.abc, so those
would also have be adjusted to check this counter.

Would MutableMapping also need to be adjusted to accommodate the
counter?  Given that the MappingView classes would rely on the
counter, I'd expect MutableMapping to need some method or variable
that the views can rely on.  How would we make sure custom methods,
particularly __setitem__() and __delitem__(), increment the counter?



> For this we need only keys modification counter in a dict

A strictly monotonic counter, right?  Every mutation method of dict
would have to increment this counter.  So would that put a limit
(albeit a high one) on the number of mutations that can be made to a
dict?  Would there be conditions under which we'd reset the counter?
If so, how would existing iterators cope?

> and
> it's copy in an iterator (this doesn't increase memory requirements
> however).

Because the iterators (and views) already have a pointer to the dict, right?

-eric
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] A small patch.

2013-11-06 Thread Sreenivas Reddy T
diff -r bec6df56c053 Lib/datetime.py
--- a/Lib/datetime.pyTue Nov 05 22:01:46 2013 +0200
+++ b/Lib/datetime.pyThu Nov 07 00:46:34 2013 +0530
@@ -43,7 +43,7 @@

 def _days_in_month(year, month):
 "year, month -> number of days in that month in that year."
-assert 1 <= month <= 12, month
+assert 1 <= month <= 12, 'month must be in 1..12'
 if month == 2 and _is_leap(year):
 return 29
 return _DAYS_IN_MONTH[month]

Best Regards,
Srinivas Reddy Thatiparthy
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] A small patch.

2013-11-06 Thread Brett Cannon
Please put all patches on bugs.python.org, else we will lose track of the
change.


On Wed, Nov 6, 2013 at 2:17 PM, Sreenivas Reddy T <
[email protected]> wrote:

> diff -r bec6df56c053 Lib/datetime.py
> --- a/Lib/datetime.pyTue Nov 05 22:01:46 2013 +0200
> +++ b/Lib/datetime.pyThu Nov 07 00:46:34 2013 +0530
> @@ -43,7 +43,7 @@
>
>  def _days_in_month(year, month):
>  "year, month -> number of days in that month in that year."
> -assert 1 <= month <= 12, month
> +assert 1 <= month <= 12, 'month must be in 1..12'
>  if month == 2 and _is_leap(year):
>  return 29
>  return _DAYS_IN_MONTH[month]
>
> Best Regards,
> Srinivas Reddy Thatiparthy
>
>
> ___
> Python-Dev mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/brett%40python.org
>
>
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration

2013-11-06 Thread Serhiy Storchaka

06.11.13 21:12, Eric Snow написав(ла):

Just to clarify, do you mean we should only guard against
modifications to the dict's keys during its iterator's __next__()?
Changes to the values during __next__() would be okay, as would
changes to the keys if they happen outside __next__()?


Dict iteration order can be changed after adding or deleting a key so 
continue iterating after this is not well defined. __next__() is invalid 
if dict keys was modified after iterators creation. This is common 
assertion for hashtables in many programming languages. The raising an 
exception in such situation just helps a programmer to find his mistake. 
Java raises an exception, in C++ this is an undefined behavior.



Presumably the above restriction also applies to the iterators of the
dict's views.


Yes. The proposed patch have tests for these cases.


OrderedDict would also need to be changed, meaning this counter would
have to be accessible to and incrementable by subclasses.  OrderedDict
makes use of the MappingView classes in collections.abc, so those
would also have be adjusted to check this counter.


Perhaps issue19414 is related. OrderedDict not needs such behavior 
because it's iteration order is well defined and we can implement 
another reasonable behavior (for example deleted keys are skipped and 
newly added keys are always iterated, because they added at the end of 
iteration order).



Would MutableMapping also need to be adjusted to accommodate the
counter?  Given that the MappingView classes would rely on the
counter, I'd expect MutableMapping to need some method or variable
that the views can rely on.  How would we make sure custom methods,
particularly __setitem__() and __delitem__(), increment the counter?


MutableMapping doesn't implement neither __iter__() nor __setitem__(). 
Concrete implementation is responsible of this, either provide reliable 
__iter__() which have predictable behavior after __setitem__(), or 
detect such modification and raise an exception, or just ignore the problem.


The MappingView classes would get this for free because their iterators 
iterate over mapping itself.



A strictly monotonic counter, right?  Every mutation method of dict
would have to increment this counter.  So would that put a limit
(albeit a high one) on the number of mutations that can be made to a
dict?  Would there be conditions under which we'd reset the counter?
If so, how would existing iterators cope?


It is very unlikely that unintentional causes exact 2**32 or 2**64 
mutations between dict iterations. If this will happen the programmer 
should use other methods to find his mistakes.



Because the iterators (and views) already have a pointer to the dict, right?


Currently the PyDictObject object contains 3 words and adding yet one 
word unlikely change actually allocated size. Dict iterators already 
contain a field for dict's size, it will be replaced by a field for 
dict's counter.


___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] A small patch.

2013-11-06 Thread Skip Montanaro
> -assert 1 <= month <= 12, month
> +assert 1 <= month <= 12, 'month must be in 1..12'

In addition to Brett's comment, you might as well include the
offending value in your AssertionError message. For example, a value
of 0 probably tells you something different about your underlying bug
than a value of 2013. Just knowing it's out of range isn't really
enough.

Skip
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] A small patch.

2013-11-06 Thread Antoine Pitrou

Le 06/11/2013 21:39, Skip Montanaro a écrit :

-assert 1 <= month <= 12, month
+assert 1 <= month <= 12, 'month must be in 1..12'


In addition to Brett's comment, you might as well include the
offending value in your AssertionError message. For example, a value
of 0 probably tells you something different about your underlying bug
than a value of 2013. Just knowing it's out of range isn't really
enough.


Besides, if it's an assertion it's only an internal helper to check 
implementation correctness. If it's an error that can be caused by 
erroneous user data, it should be replaced with the proper exception 
class (perhaps ValueError).


Regards

Antoine.


___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] A small patch.

2013-11-06 Thread Alexander Belopolsky
On Wed, Nov 6, 2013 at 3:43 PM, Antoine Pitrou  wrote:

> Besides, if it's an assertion it's only an internal helper to check
> implementation correctness. If it's an error that can be caused by
> erroneous user data, it should be replaced with the proper exception class
> (perhaps ValueError).


As far as I can tell, the only way to trigger this assertion is to call
internal _days_in_month() function directly - something that users should
not do.  For a public function with similar functionality look at
calendar.monthrange(
).
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration

2013-11-06 Thread Ethan Furman

Thank you, Raymond, and everyone else who responded both here and on the 
tracker.

--
~Ethan~
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Avoid formatting an error message on attribute error

2013-11-06 Thread Victor Stinner
Hi,

I'm trying to avoid unnecessary temporary Unicode strings when
possible (see issue #19512). While working on this, I saw that Python
likes preparing an user friendly message to explain why getting an
attribute failed. The problem is that in most cases, the caller
doesn't care of the message: the exception is simply deleted. For
example, hasattr() deletes immediatly the AttributeError.

It would be nice to only format the message on demand. The
AttributeError would keep a reference to the type. Keeping a strong
reference to the type might change the behaviour of some applications
in some corner cases. (Holding a reference to the object would be
worse, and the type looks to be preferred over the type to format the
error message.)

Pseudo-code for modified AssertionError:

class AttributeError(Exception):
def __init__(self, type, attr):
self.type = type
self.attr = attr
self._message = None

def __str__(self):
if self._message is None:
self._message = ("type object %s has no attribute %r"
 % (self.type.__name__, self.attr))
return self._message

AttributeError.args would be (type, attr) instead of (message,).
ImportError was also modified to add a new "name "attribute".

If AttributeError cannot be modified (because of backward
compatibility), would it be possible to add a new exception inheriting
from AttributeError?


I have a similar project for OSError (generate the message on demand),
but it's more tricky because os.strerror(errno) depends on the current
locale...

***

Example of C code raising an AttributeError:

PyErr_Format(PyExc_AttributeError,
 "'%.50s' object has no attribute '%U'",
 tp->tp_name, name);

Example of C code ignoring the AttributeError:

value = _PyObject_GetAttrId(mod, &PyId___initializing__);
if (value == NULL)
PyErr_Clear();
else {
...
}

Example of Python code ignoring the AttributeError:

try:
get_subactions = action._get_subactions
except AttributeError:
pass
else:
...

Another example in Python:

try:
retattr = getattr(self.socket, attr)
except AttributeError:
# the original error is not used
raise AttributeError("%s instance has no attribute '%s'"
 %(self.__class__.__name__, attr))

Victor
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Avoid formatting an error message on attribute error

2013-11-06 Thread Steven D'Aprano
On Wed, Nov 06, 2013 at 11:32:33PM +0100, Victor Stinner wrote:
> Hi,
> 
> I'm trying to avoid unnecessary temporary Unicode strings when
> possible (see issue #19512). While working on this, I saw that Python
> likes preparing an user friendly message to explain why getting an
> attribute failed. The problem is that in most cases, the caller
> doesn't care of the message: the exception is simply deleted. For
> example, hasattr() deletes immediatly the AttributeError.

My initial instinct here was to say that sounded like premature 
optimization, but to my surprise the overhead of generating the error 
message is actually significant -- at least from pure Python 3.3 code.

That is, the time it takes to generate the message:

"'%s' object has no attribute %r" % (obj.__class__.__name__, attrname)

is about the same as the time to raise and catch an exception:

try:
raise AttributeError
except AttributeError:
pass

(measured with timeit).


Given that, I wonder whether it would be worth coming up with a more 
general solution to the question of lazily generating error messages 
rather than changing AttributeError specifically. Just thinking aloud -- 
perhaps this should go to python-ideas? -- maybe something like this:

class _LazyMsg:
def __init__(self, template, obj, attrname):
self.template = template
self.typename = type(obj).__name__
self.attrname = attrname
def __str__(self):
return self.template % (self.typename, self.attrname)

Then:

py> err = AttributeError(
... _LazyMsg("'%s' object has no attribute '%s'", None, "spam"))
py> raise err
Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'NoneType' object has no attribute 'spam'


Giving the same effect from C code, I leave as an exercise for the 
reader :-)


> It would be nice to only format the message on demand. The
> AttributeError would keep a reference to the type. 

Since only the type name is used, why keep a reference to the type 
instead of just type.__name__?


> AttributeError.args would be (type, attr) instead of (message,).
> ImportError was also modified to add a new "name "attribute".

I don't like changing the signature of AttributeError. I've got code 
that raises AttributeError explicitly.



-- 
Steven
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com