Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration
On Tue, Nov 05, 2013 at 08:38:09PM -0800, Ethan Furman wrote: > http://bugs.python.org/issue19332 Duplicate of this: http://bugs.python.org/issue6017 The conclusion on that also was that it is not worth guarding against such an unusual circumstance. -- Steven ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration
On Wed, 06 Nov 2013 23:34:22 +1100, Steven D'Aprano wrote: > On Tue, Nov 05, 2013 at 08:38:09PM -0800, Ethan Furman wrote: > > > http://bugs.python.org/issue19332 > > Duplicate of this: http://bugs.python.org/issue6017 > > The conclusion on that also was that it is not worth guarding against > such an unusual circumstance. If I remember correctly (and I may not) the size-change guards were added because without them there were certain cases that could either segfault or result in an infinite loop. --David ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration
2013/11/6 R. David Murray : > On Wed, 06 Nov 2013 23:34:22 +1100, Steven D'Aprano > wrote: >> On Tue, Nov 05, 2013 at 08:38:09PM -0800, Ethan Furman wrote: >> >> > http://bugs.python.org/issue19332 >> >> Duplicate of this: http://bugs.python.org/issue6017 >> >> The conclusion on that also was that it is not worth guarding against >> such an unusual circumstance. > > If I remember correctly (and I may not) the size-change guards were > added because without them there were certain cases that could > either segfault or result in an infinite loop. > > --David This exception is quite old: --- changeset: 17597:32e7d0898eab branch: legacy-trunk user:Guido van Rossum date:Fri Apr 20 19:13:02 2001 + files: Include/Python.h Include/abstract.h Include/object.h Include/opcode.h Include/pyerrors.h Lib/dis.py Makefile.pre.in Objects/abstra description: Iterators phase 1. This comprises: new slot tp_iter in type object, plus new flag Py_TPFLAGS_HAVE_ITER new C API PyObject_GetIter(), calls tp_iter new builtin iter(), with two forms: iter(obj), and iter(function, sentinel) new internal object types iterobject and calliterobject new exception StopIteration new opcodes for "for" loops, GET_ITER and FOR_ITER (also supported by dis.py) new magic number for .pyc files new special method for instances: __iter__() returns an iterator iteration over dictionaries: "for x in dict" iterates over the keys iteration over files: "for x in file" iterates over lines TODO: documentation test suite decide whether to use a different way to spell iter(function, sentinal) decide whether "for key in dict" is a good idea use iterators in map/filter/reduce, min/max, and elsewhere (in/not in?) speed tuning (make next() a slot tp_next???) --- More recently, I added another exception if a dictionary is modified during a lookup. When I proposed a new frozendict type to secure my pysandbox project, Armin Rigo wrote that CPython segfaults must be fixed. So I fixed a corner case on dictionaries. http://bugs.python.org/issue14205 Thread in python-dev: https://mail.python.org/pipermail/python-dev/2012-March/117290.html The discussion in http://bugs.python.org/issue14205 is interesting. Victor ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration
On Wed, 06 Nov 2013 16:10:16 +0100, Victor Stinner wrote: > More recently, I added another exception if a dictionary is modified > during a lookup. > > When I proposed a new frozendict type to secure my pysandbox project, > Armin Rigo wrote that CPython segfaults must be fixed. So I fixed a > corner case on dictionaries. > http://bugs.python.org/issue14205 Ah, that may be what I was (mis)remembering. --David ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration
Le 06/11/2013 06:41, Nick Coghlan a écrit : The behaviour of mutating builtin containers while iterating over them is formally undefined beyond "it won't segfault" (one of the few such undefined behaviours in Python). The associated exceptions are thus strictly "best effort given other constraints". Not sure what you mean with "formally undefined". For example, you can perfectly well change a list's contents while iterating over it, and I bet there's a lot of code that relies on that, as in:: for i, value in enumerate(mylist): if some_condition(value): mylist[i] = some_function(value) If you change "builtin containers" to "builtin unordered containers", then you probably are closer to the truth :) >or perhaps some performance measurements that would show there is no performance based reason to not have the patch added. (I can try to do the performance part myself if necessary, I'm just not sure what all the steps are yet.) If the benchmark suite indicates there's no measurable speed penalty then such a patch may be worth reconsidering. I'd be astonished if that was actually the case, though - the lowest impact approach I can think of is to check for live iterators when setting a dict entry, and that still has non-trivial size and speed implications. I think Serhiy's patch is much simpler than that (is it Serhiy's? I think so). Regards Antoine. ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration
06.11.13 07:41, Nick Coghlan написав(ла): If the benchmark suite indicates there's no measurable speed penalty then such a patch may be worth reconsidering. I don't see any significant speed difference even in artificial presumably worst case (a lot of items assignment in tight loop). If you have tests which demonstrate a difference, please show them. I'd be astonished if that was actually the case, though - the lowest impact approach I can think of is to check for live iterators when setting a dict entry, and that still has non-trivial size and speed implications. Actually we should guard not against changing dict during iteration, but against iterating modifying dict (and only such modifications which change dict's keys). For this we need only keys modification counter in a dict and it's copy in an iterator (this doesn't increase memory requirements however). I suppose Java use same technique in HashMap. ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration
On Wed, Nov 6, 2013 at 11:02 AM, Serhiy Storchaka wrote: > Actually we should guard not against changing dict during iteration, but > against iterating modifying dict (and only such modifications which change > dict's keys). s/iterating modifying/iteration modifying/ ? Just to clarify, do you mean we should only guard against modifications to the dict's keys during its iterator's __next__()? Changes to the values during __next__() would be okay, as would changes to the keys if they happen outside __next__()? Presumably the above restriction also applies to the iterators of the dict's views. OrderedDict would also need to be changed, meaning this counter would have to be accessible to and incrementable by subclasses. OrderedDict makes use of the MappingView classes in collections.abc, so those would also have be adjusted to check this counter. Would MutableMapping also need to be adjusted to accommodate the counter? Given that the MappingView classes would rely on the counter, I'd expect MutableMapping to need some method or variable that the views can rely on. How would we make sure custom methods, particularly __setitem__() and __delitem__(), increment the counter? > For this we need only keys modification counter in a dict A strictly monotonic counter, right? Every mutation method of dict would have to increment this counter. So would that put a limit (albeit a high one) on the number of mutations that can be made to a dict? Would there be conditions under which we'd reset the counter? If so, how would existing iterators cope? > and > it's copy in an iterator (this doesn't increase memory requirements > however). Because the iterators (and views) already have a pointer to the dict, right? -eric ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] A small patch.
diff -r bec6df56c053 Lib/datetime.py --- a/Lib/datetime.pyTue Nov 05 22:01:46 2013 +0200 +++ b/Lib/datetime.pyThu Nov 07 00:46:34 2013 +0530 @@ -43,7 +43,7 @@ def _days_in_month(year, month): "year, month -> number of days in that month in that year." -assert 1 <= month <= 12, month +assert 1 <= month <= 12, 'month must be in 1..12' if month == 2 and _is_leap(year): return 29 return _DAYS_IN_MONTH[month] Best Regards, Srinivas Reddy Thatiparthy ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] A small patch.
Please put all patches on bugs.python.org, else we will lose track of the change. On Wed, Nov 6, 2013 at 2:17 PM, Sreenivas Reddy T < [email protected]> wrote: > diff -r bec6df56c053 Lib/datetime.py > --- a/Lib/datetime.pyTue Nov 05 22:01:46 2013 +0200 > +++ b/Lib/datetime.pyThu Nov 07 00:46:34 2013 +0530 > @@ -43,7 +43,7 @@ > > def _days_in_month(year, month): > "year, month -> number of days in that month in that year." > -assert 1 <= month <= 12, month > +assert 1 <= month <= 12, 'month must be in 1..12' > if month == 2 and _is_leap(year): > return 29 > return _DAYS_IN_MONTH[month] > > Best Regards, > Srinivas Reddy Thatiparthy > > > ___ > Python-Dev mailing list > [email protected] > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > > ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration
06.11.13 21:12, Eric Snow написав(ла): Just to clarify, do you mean we should only guard against modifications to the dict's keys during its iterator's __next__()? Changes to the values during __next__() would be okay, as would changes to the keys if they happen outside __next__()? Dict iteration order can be changed after adding or deleting a key so continue iterating after this is not well defined. __next__() is invalid if dict keys was modified after iterators creation. This is common assertion for hashtables in many programming languages. The raising an exception in such situation just helps a programmer to find his mistake. Java raises an exception, in C++ this is an undefined behavior. Presumably the above restriction also applies to the iterators of the dict's views. Yes. The proposed patch have tests for these cases. OrderedDict would also need to be changed, meaning this counter would have to be accessible to and incrementable by subclasses. OrderedDict makes use of the MappingView classes in collections.abc, so those would also have be adjusted to check this counter. Perhaps issue19414 is related. OrderedDict not needs such behavior because it's iteration order is well defined and we can implement another reasonable behavior (for example deleted keys are skipped and newly added keys are always iterated, because they added at the end of iteration order). Would MutableMapping also need to be adjusted to accommodate the counter? Given that the MappingView classes would rely on the counter, I'd expect MutableMapping to need some method or variable that the views can rely on. How would we make sure custom methods, particularly __setitem__() and __delitem__(), increment the counter? MutableMapping doesn't implement neither __iter__() nor __setitem__(). Concrete implementation is responsible of this, either provide reliable __iter__() which have predictable behavior after __setitem__(), or detect such modification and raise an exception, or just ignore the problem. The MappingView classes would get this for free because their iterators iterate over mapping itself. A strictly monotonic counter, right? Every mutation method of dict would have to increment this counter. So would that put a limit (albeit a high one) on the number of mutations that can be made to a dict? Would there be conditions under which we'd reset the counter? If so, how would existing iterators cope? It is very unlikely that unintentional causes exact 2**32 or 2**64 mutations between dict iterations. If this will happen the programmer should use other methods to find his mistakes. Because the iterators (and views) already have a pointer to the dict, right? Currently the PyDictObject object contains 3 words and adding yet one word unlikely change actually allocated size. Dict iterators already contain a field for dict's size, it will be replaced by a field for dict's counter. ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] A small patch.
> -assert 1 <= month <= 12, month > +assert 1 <= month <= 12, 'month must be in 1..12' In addition to Brett's comment, you might as well include the offending value in your AssertionError message. For example, a value of 0 probably tells you something different about your underlying bug than a value of 2013. Just knowing it's out of range isn't really enough. Skip ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] A small patch.
Le 06/11/2013 21:39, Skip Montanaro a écrit : -assert 1 <= month <= 12, month +assert 1 <= month <= 12, 'month must be in 1..12' In addition to Brett's comment, you might as well include the offending value in your AssertionError message. For example, a value of 0 probably tells you something different about your underlying bug than a value of 2013. Just knowing it's out of range isn't really enough. Besides, if it's an assertion it's only an internal helper to check implementation correctness. If it's an error that can be caused by erroneous user data, it should be replaced with the proper exception class (perhaps ValueError). Regards Antoine. ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] A small patch.
On Wed, Nov 6, 2013 at 3:43 PM, Antoine Pitrou wrote: > Besides, if it's an assertion it's only an internal helper to check > implementation correctness. If it's an error that can be caused by > erroneous user data, it should be replaced with the proper exception class > (perhaps ValueError). As far as I can tell, the only way to trigger this assertion is to call internal _days_in_month() function directly - something that users should not do. For a public function with similar functionality look at calendar.monthrange( ). ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Issue 19332: Guard against changing dict during iteration
Thank you, Raymond, and everyone else who responded both here and on the tracker. -- ~Ethan~ ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Avoid formatting an error message on attribute error
Hi,
I'm trying to avoid unnecessary temporary Unicode strings when
possible (see issue #19512). While working on this, I saw that Python
likes preparing an user friendly message to explain why getting an
attribute failed. The problem is that in most cases, the caller
doesn't care of the message: the exception is simply deleted. For
example, hasattr() deletes immediatly the AttributeError.
It would be nice to only format the message on demand. The
AttributeError would keep a reference to the type. Keeping a strong
reference to the type might change the behaviour of some applications
in some corner cases. (Holding a reference to the object would be
worse, and the type looks to be preferred over the type to format the
error message.)
Pseudo-code for modified AssertionError:
class AttributeError(Exception):
def __init__(self, type, attr):
self.type = type
self.attr = attr
self._message = None
def __str__(self):
if self._message is None:
self._message = ("type object %s has no attribute %r"
% (self.type.__name__, self.attr))
return self._message
AttributeError.args would be (type, attr) instead of (message,).
ImportError was also modified to add a new "name "attribute".
If AttributeError cannot be modified (because of backward
compatibility), would it be possible to add a new exception inheriting
from AttributeError?
I have a similar project for OSError (generate the message on demand),
but it's more tricky because os.strerror(errno) depends on the current
locale...
***
Example of C code raising an AttributeError:
PyErr_Format(PyExc_AttributeError,
"'%.50s' object has no attribute '%U'",
tp->tp_name, name);
Example of C code ignoring the AttributeError:
value = _PyObject_GetAttrId(mod, &PyId___initializing__);
if (value == NULL)
PyErr_Clear();
else {
...
}
Example of Python code ignoring the AttributeError:
try:
get_subactions = action._get_subactions
except AttributeError:
pass
else:
...
Another example in Python:
try:
retattr = getattr(self.socket, attr)
except AttributeError:
# the original error is not used
raise AttributeError("%s instance has no attribute '%s'"
%(self.__class__.__name__, attr))
Victor
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Avoid formatting an error message on attribute error
On Wed, Nov 06, 2013 at 11:32:33PM +0100, Victor Stinner wrote:
> Hi,
>
> I'm trying to avoid unnecessary temporary Unicode strings when
> possible (see issue #19512). While working on this, I saw that Python
> likes preparing an user friendly message to explain why getting an
> attribute failed. The problem is that in most cases, the caller
> doesn't care of the message: the exception is simply deleted. For
> example, hasattr() deletes immediatly the AttributeError.
My initial instinct here was to say that sounded like premature
optimization, but to my surprise the overhead of generating the error
message is actually significant -- at least from pure Python 3.3 code.
That is, the time it takes to generate the message:
"'%s' object has no attribute %r" % (obj.__class__.__name__, attrname)
is about the same as the time to raise and catch an exception:
try:
raise AttributeError
except AttributeError:
pass
(measured with timeit).
Given that, I wonder whether it would be worth coming up with a more
general solution to the question of lazily generating error messages
rather than changing AttributeError specifically. Just thinking aloud --
perhaps this should go to python-ideas? -- maybe something like this:
class _LazyMsg:
def __init__(self, template, obj, attrname):
self.template = template
self.typename = type(obj).__name__
self.attrname = attrname
def __str__(self):
return self.template % (self.typename, self.attrname)
Then:
py> err = AttributeError(
... _LazyMsg("'%s' object has no attribute '%s'", None, "spam"))
py> raise err
Traceback (most recent call last):
File "", line 1, in
AttributeError: 'NoneType' object has no attribute 'spam'
Giving the same effect from C code, I leave as an exercise for the
reader :-)
> It would be nice to only format the message on demand. The
> AttributeError would keep a reference to the type.
Since only the type name is used, why keep a reference to the type
instead of just type.__name__?
> AttributeError.args would be (type, attr) instead of (message,).
> ImportError was also modified to add a new "name "attribute".
I don't like changing the signature of AttributeError. I've got code
that raises AttributeError explicitly.
--
Steven
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
