[Python-Dev] Is there any remaining reason why weakref callbacks shouldn't be able to access the referenced object?
Hi all,

It's an old feature of the weakref API that you can define an arbitrary callback to be invoked when the referenced object dies, and that when this callback is invoked, it gets handed the weakref wrapper object -- BUT only after it's been cleared, so that the callback can't access the originally referenced object. (I.e., this callback will never raise: def callback(ref): assert ref() is None.)

AFAICT the original motivation seems to have been that if the weakref callback could get at the object, then the weakref callback would effectively be another finalizer like __del__, and finalizers and reference cycles don't mix, so weakref callbacks can't be finalizers. There's a long document from the 2.4 days about all the terrible things that could happen if arbitrary code like callbacks could get unfettered access to cyclic isolates at weakref cleanup time [1].

But that was 2.4. In the meantime, of course, PEP 442 fixed it so that finalizers and weakrefs mix just fine. In fact, weakref callbacks are now run *before* __del__ methods [2], so clearly it's now okay for arbitrary code to touch the objects during that phase of the GC -- at least in principle.

So what I'm wondering is: would anything terrible happen if we started passing still-live weakrefs into weakref callbacks, and then clearing them afterwards? (I.e., making step 1 of the PEP 442 cleanup order be "run callbacks and then clear weakrefs", instead of the current "clear weakrefs and then run callbacks".) I skimmed through the PEP 442 discussion, and AFAICT the rationale for keeping the old weakref behavior was just that no one could be bothered to mess with it [3].
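The cleared-before-callback behavior described above is easy to see for yourself. A minimal sketch (the `Resource` class and the variable names are just illustrative; it relies on CPython's immediate refcount-driven collection):

```python
import weakref

observed = []

def callback(ref):
    # By the time the callback runs, the wrapper has already been
    # cleared, so dereferencing it yields None, not the dead object.
    observed.append(ref())

class Resource:
    pass

r = Resource()
wr = weakref.ref(r, callback)
del r  # on CPython the refcount hits zero and the callback fires here
print(observed)  # → [None]
```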
[The motivation for my question is partly curiosity, and partly that in the discussion about how to handle GC for async objects, it occurred to me that it might be very nice if arbitrary classes that needed access to the event loop during cleanup could do something like:

    def __init__(self, ...):
        loop = asyncio.get_event_loop()
        loop.gc_register(self)

    # automatically called by the loop when I am GC'ed;
    # async equivalent of __del__
    async def aclose(self):
        ...

Right now something *sort of* like this is possible, but it requires a much more cumbersome API, where every class would have to implement logic to fetch a cleanup callback from the loop, store it, and then call it from its __del__ method -- like how PEP 525 does it. Delaying weakref clearing would make this simpler API possible.]

-n

[1] https://github.com/python/cpython/blob/master/Modules/gc_weakref.txt
[2] https://www.python.org/dev/peps/pep-0442/#id7
[3] https://mail.python.org/pipermail/python-dev/2013-May/126592.html

--
Nathaniel J. Smith -- https://vorpus.org

___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Have I got my hg dependencies correct?
On Thu, 20 Oct 2016 at 04:48 Skip Montanaro wrote:

> I've recently run into a problem building the math and cmath modules
> for 2.7. (I don't rebuild very often, so this problem might have been
> around for awhile.) My hg repos look like this:
>
> * My cpython repo pulls from https://hg.python.org/cpython
>
> * My 2.7 repo (and other non-tip repos) pulls from my cpython repo
>
> I think this setup was recommended way back in the day when hg was new
> to the Python toolchain to avoid unnecessary network bandwidth.
>
> So, if I execute
>
> hg pull
> hg update
>
> in first cpython, then 2.7 repos I should be up-to-date, correct?

Nope, you need to execute the same steps in your 2.7 checkout if you're keeping it in a separate directory from the cpython repo you're referring to (you can also use `hg pull -u` to do the two steps above in a single command).
[Python-Dev] Summary of Python tracker Issues
ACTIVITY SUMMARY (2016-10-14 - 2016-10-21)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    5524 ( -3)
  closed 34728 (+55)
  total  40252 (+52)

Open issues with patches: 2398

Issues opened (28)
==================

#28404: Logging SyslogHandler not appending '\n' to the end
  http://bugs.python.org/issue28404  reopened by elelement

#28437: Documentation for handling of non-type metaclass hints is uncl
  http://bugs.python.org/issue28437  reopened by ncoghlan

#28445: Wrong documentation for GzipFile.peek
  http://bugs.python.org/issue28445  opened by abacabadabacaba

#28446: pyvenv generates malformed hashbangs for scripts
  http://bugs.python.org/issue28446  opened by alexreg

#28449: tarfile.open(mode = 'r:*', ignore_zeros = True) has 50% chance
  http://bugs.python.org/issue28449  opened by Silver Fox

#28450: Misleading/inaccurate documentation about unknown escape seque
  http://bugs.python.org/issue28450  opened by lelit

#28451: pydoc.safeimport() raises ErrorDuringImport() if __builtin__._
  http://bugs.python.org/issue28451  opened by segfault87

#28453: SSLObject.selected_alpn_protocol() not documented
  http://bugs.python.org/issue28453  opened by alex.gronholm

#28457: Make public the current private known hash functions in the C-
  http://bugs.python.org/issue28457  opened by rhettinger

#28459: _pyio module broken on Cygwin / setmode not usable
  http://bugs.python.org/issue28459  opened by erik.bray

#28460: Minidom, order of attributes, datachars
  http://bugs.python.org/issue28460  opened by Petr Pulc

#28462: subprocess pipe can't see EOF from a child in case of a few ch
  http://bugs.python.org/issue28462  opened by Vyacheslav Grigoryev

#28463: Email long headers parsing/serialization
  http://bugs.python.org/issue28463  opened by Константин Волков

#28464: BaseEventLoop.close should shutdown executor before marking it
  http://bugs.python.org/issue28464  opened by cmeyer

#28465: python 3.5 magic number
  http://bugs.python.org/issue28465  opened by æ¹å¿

#28469: timeit: use powers of 2 in autorange(), instead of powers of 1
  http://bugs.python.org/issue28469  opened by haypo

#28470: configure.ac -g debug compiler option when not Py_DEBUG
  http://bugs.python.org/issue28470  opened by Chris Byers

#28474: WinError(): Python int too large to convert to C long
  http://bugs.python.org/issue28474  opened by Kelvin You

#28475: Misleading error on random.sample when k < 0
  http://bugs.python.org/issue28475  opened by franciscouzo

#28477: Add optional user argument to pathlib.Path.home()
  http://bugs.python.org/issue28477  opened by josh.r

#28478: Built-in module 'time' does not enable functions if -Werror sp
  http://bugs.python.org/issue28478  opened by toast12

#28482: test_typing fails if asyncio unavailable
  http://bugs.python.org/issue28482  opened by martin.panter

#28485: compileall.compile_dir(workers=) does not raise Valu
  http://bugs.python.org/issue28485  opened by martin.panter

#28488: shutil.make_archive (xxx, zip, root_dir) is adding './' entry
  http://bugs.python.org/issue28488  opened by bialix

#28489: Fix comment in tokenizer.c
  http://bugs.python.org/issue28489  opened by Ryan.Gonzalez

#28491: Remove bundled libffi for OSX
  http://bugs.python.org/issue28491  opened by zach.ware

#28494: is_zipfile false positives
  http://bugs.python.org/issue28494  opened by Thomas.Waldmann

#28496: Mark up constants 0, 1, -1 in C API docs
  http://bugs.python.org/issue28496  opened by serhiy.storchaka

Most recent 15 issues with no replies (15)
==========================================

#28485: compileall.compile_dir(workers=) does not raise Valu
  http://bugs.python.org/issue28485

#28470: configure.ac -g debug compiler option when not Py_DEBUG
  http://bugs.python.org/issue28470

#28464: BaseEventLoop.close should shutdown executor before marking it
  http://bugs.python.org/issue28464

#28460: Minidom, order of attributes, datachars
  http://bugs.python.org/issue28460

#28457: Make public the current private known hash functions in the C-
  http://bugs.python.org/issue28457

#28446: pyvenv generates malformed hashbangs for scripts
  http://bugs.python.org/issue28446

#28439: Remove redundant checks in PyUnicode_EncodeLocale and PyUnicod
  http://bugs.python.org/issue28439

#28429: ctypes fails to import with grsecurity's TPE
  http://bugs.python.org/issue28429

#28422: multiprocessing Manager mutable type member access failure
  http://bugs.python.org/issue28422

#28416: defining persistent_id in _pickle.Pickler subclass causes refe
  http://bugs.python.org/issue28416

#28412: os.path.splitdrive documentation out of date
  http://bugs.python.org/issue28412

#28408: Fix redundant code and memory leak in _PyUnicodeWriter_Finish
  http://bugs.python.org/issue28408

#28407: Improve coverage of email.utils.make_msgid()
  http://bugs.python.org/issue28407

#28401: Don't support the PEP384 stable
Re: [Python-Dev] Have I got my hg dependencies correct?
On Fri, Oct 21, 2016 at 1:12 PM, Brett Cannon wrote:

>> in first cpython, then 2.7 repos I should be up-to-date, correct?
>
> Nope, you need to execute the same steps in your 2.7 checkout

"repos" == "checkout" in my message. So the hg up -C solved my problem, but I'm still a bit confused (nothing new, in addition to which I only use hg for my Python repositories)... Why didn't a plain "hg up" tell me it couldn't update some files because of changes? Or, like git (I think), attempt to incorporate the upstream changes, then leave conflict markers if that failed?

Skip
Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)
On Thu, Oct 20, 2016 at 11:48 PM, Nick Coghlan wrote:

> > >>> len(get_builtin_methods())
> > 230
> >
> > So what? No one looks in all the methods of builtins at once.
>
> Yes, Python implementation developers do, which is why it's a useful
> part of defining the overall "size" of Python and how that is growing
> over time.

sure -- but of course, the trick is that adding *one* new method is never a big deal by itself. I'm confused though -- IIUC, you are proposing adding an `iobuffers` module to the std lib -- how is that not growing the "size" of Python?

I'm still confused about the "io" in "iobuffers" -- I've used buffers a lot -- for passing data around between various C libs -- numpy, image processing, etc... I never really thought of it as IO though -- which is why a simple frombuffer() seems to make a lot of sense to me, without any other stuff. (to be honest, I reach for Cython these days for that sort of thing though)

> and we make it easier for educators to decide whether or not they should be
> introducing their students to the new capabilities.
> advanced domain specific use cases (see
> http://learning-python.com/books/python-changes-2014-plus.html for one
> generalist author's perspective on the vast gulf that can arise
> between "What professional programmers want" and "What's relevant to
> new programmers")

thanks for the link -- I'll need to read the whole thing through -- though from a glance, I have a slightly different perspective, as an educator as well: Python 3, in general, is harder to learn and less suited to scripting, while potentially more suited to building larger systems. I came to this conclusion last year when I converted my introductory class to py3. Some of it is the redundancy and whatnot talked about in that link -- yes, those are issues for me. But more of it is real, maybe important change.
Interestingly, the biggest issue with the transition, Unicode, is one thing that has made life much easier for newbies :-) But the big ones are things like:

The move to be iterable focused rather than sequence focused -- iterables really are harder to wrap one's head around when you are first learning. And I was surprised at how often I had to wrap list() around stuff when converting my examples and exercise solutions.

I've decided to teach the format() method for string formatting -- but it is harder to wrap your head around as a newbie.

Even the extra parens in print() make it a bit harder to script() well.

Use with: -- now I have to explain context managers before they can even read a file.. (or gloss over it and just say "copy this code to open a file")

Anyway, I've been meaning to write a blog post about this, that would be better formed, but you get the idea. In short, I really appreciate the issues here -- though I really don't see how adding one method to a fairly obscure builtin really applies -- this is nothing like having three(!) ways to format strings.

> Which is more comprehensible and discoverable, dict.setdefault(), or
> collections.defaultdict()?

Well, setdefault is definitely more discoverable! not sure what your point is. As it happens, the homework for my intro class this week can greatly benefit from setdefault() (or defaultdict()) -- and in the last few years, far fewer newbies have discovered defaultdict() for their solutions. Empirical evidence for discoverability.

As for comprehensible -- I give a slight nod to .setdefault() -- my solution to the HW uses that. I can't say I have a strong argument as to why -- but having (what looks like) a whole new class for this one extra feature seems a bit odd, and makes one look carefully to see what else might be different about it...

> Micro-optimisations like dict.setdefault() typically don't make sense
> in isolation - they only make sense in the context of a particular
> pattern of thought.
> Now, one approach to such patterns is to say "We
> just need to do a better job of teaching people to recognise and use
> the pattern!". This approach tends not to work very well - you're
> often better off extracting the entire pattern out to a higher level
> construct, giving that construct a name, and teaching that, and
> letting people worry about how it works internally later.

hmm -- maybe -- but to me, that example isn't really a pattern of thought -- I actually remember my history of learning about setdefault(). I found myself writing a bunch of code something like:

    if key not in a_dict:
        a_dict[key] = something
    a_dict[key].something_or_other()

Once I had written that code a few times, I thought: "There has got to be a cleaner way to do this", looked at the dict methods and eventually found setdefault() (took an embarrassingly long time). I did think -- "this has got to be a common enough pattern to be somehow supported" but I will say that it never, ever dawned on me to think: "this has got to be a common enough pattern that someone would have made a special kind of dict
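For anyone following along, the three spellings of the grouping pattern discussed in this thread can be put side by side. A small sketch (the `pairs` data is made up purely for illustration):

```python
from collections import defaultdict

pairs = [("alice", "red"), ("bob", "blue"), ("carol", "red")]

# The verbose pattern that prompts the search for something cleaner:
groups = {}
for name, team in pairs:
    if team not in groups:
        groups[team] = []
    groups[team].append(name)

# The same thing with dict.setdefault() -- one lookup expression:
groups2 = {}
for name, team in pairs:
    groups2.setdefault(team, []).append(name)

# And with collections.defaultdict() -- the default moves to the type:
groups3 = defaultdict(list)
for name, team in pairs:
    groups3[team].append(name)

print(groups == groups2 == dict(groups3))  # → True
```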
Re: [Python-Dev] Have I got my hg dependencies correct?
On 10/21/2016 2:12 PM, Brett Cannon wrote:
> On Thu, 20 Oct 2016 at 04:48 Skip Montanaro wrote:
>> I've recently run into a problem building the math and cmath modules
>> for 2.7. (I don't rebuild very often, so this problem might have been
>> around for awhile.) My hg repos look like this:
>> * My cpython repo pulls from https://hg.python.org/cpython
>> * My 2.7 repo (and other non-tip repos) pulls from my cpython repo
>> I think this setup was recommended way back in the day when hg was new
>> to the Python toolchain to avoid unnecessary network bandwidth.
>> So, if I execute
>> hg pull
>> hg update
>> in first cpython, then 2.7 repos I should be up-to-date, correct?
> Nope, you need to execute the same steps in your 2.7 checkout if you're
> keeping it in a separate directory from your cpython repo that you're
> referring to

If the 2.7 repository shares the default repository, as described in the devguide, then only update is needed. This has worked for me for at least two years.

--
Terry Jan Reedy
Re: [Python-Dev] Benchmarking Python and micro-optimizations
On 20 October 2016 at 20:56, Victor Stinner wrote:
> Hi,
>
> Last months, I worked a lot on benchmarks. I ran benchmarks, analyzed
> results in depth (up to the hardware and kernel drivers!), I wrote new
> tools and enhanced existing tools.

Thanks Victor, very cool work!

Cheers,
Nick.

--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] Is there any remaining reason why weakref callbacks shouldn't be able to access the referenced object?
On 21 October 2016 at 17:09, Nathaniel Smith wrote:
> But that was 2.4. In the mean time, of course, PEP 442 fixed it so
> that finalizers and weakrefs mix just fine. In fact, weakref callbacks
> are now run *before* __del__ methods [2], so clearly it's now okay for
> arbitrary code to touch the objects during that phase of the GC -- at
> least in principle.
>
> So what I'm wondering is, would anything terrible happen if we started
> passing still-live weakrefs into weakref callbacks, and then clearing
> them afterwards?
The weakref-before-__del__ ordering change in
https://www.python.org/dev/peps/pep-0442/#disposal-of-cyclic-isolates
only applies to cyclic garbage collection, so for normal refcount
driven object cleanup in CPython, the __del__ still happens first:
>>> class C:
... def __del__(self):
... print("__del__ called")
...
>>> c = C()
>>> import weakref
>>> def cb():
... print("weakref callback called")
...
>>> weakref.finalize(c, cb)
>>> del c
__del__ called
weakref callback called
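For contrast, the cyclic-collection case really does run the callback before __del__, as PEP 442 describes. A small sketch of that case (assumes CPython; the `Node` class and variable names are just illustrative):

```python
import gc
import weakref

events = []

class Node:
    def __del__(self):
        events.append("__del__")

a = Node()
b = Node()
a.other = b
b.other = a            # reference cycle: neither dies by refcount alone
wr = weakref.ref(a, lambda ref: events.append("callback"))
del a, b
gc.collect()           # cyclic collection: callback fires before __del__
print(events)
```

Note that the weakref itself (`wr`) lives outside the cycle, which is why its callback runs at all; the callback still receives a cleared reference.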
This means the main problem with a strong reference being reachable
from the weakref callback object remains: if the callback itself is
reachable, then the original object is reachable, and you don't have a
collectible cycle anymore.
>>> c = C()
>>> def cb2(obj):
... print("weakref callback called with object reference")
...
>>> weakref.finalize(c, cb2, c)
>>> del c
>>>
Changing that to support resurrecting the object so it can be passed
into the callback without the callback itself holding a strong
reference means losing the main "reasoning about software" benefit
that weakref callbacks offer: they currently can't resurrect the
object they relate to (since they never receive a strong reference to
it), so it nominally doesn't matter if the interpreter calls them
before or after that object has been entirely cleaned up.
Cheers,
Nick.
--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] Is there any remaining reason why weakref callbacks shouldn't be able to access the referenced object?
On Fri, Oct 21, 2016 at 8:32 PM, Nick Coghlan wrote:
> On 21 October 2016 at 17:09, Nathaniel Smith wrote:
>> But that was 2.4. In the mean time, of course, PEP 442 fixed it so
>> that finalizers and weakrefs mix just fine. In fact, weakref callbacks
>> are now run *before* __del__ methods [2], so clearly it's now okay for
>> arbitrary code to touch the objects during that phase of the GC -- at
>> least in principle.
>>
>> So what I'm wondering is, would anything terrible happen if we started
>> passing still-live weakrefs into weakref callbacks, and then clearing
>> them afterwards?
>
> The weakref-before-__del__ ordering change in
> https://www.python.org/dev/peps/pep-0442/#disposal-of-cyclic-isolates
> only applies to cyclic garbage collection, so for normal refcount
> driven object cleanup in CPython, the __del__ still happens first:
>
> >>> class C:
> ... def __del__(self):
> ... print("__del__ called")
> ...
> >>> c = C()
> >>> import weakref
> >>> def cb():
> ... print("weakref callback called")
> ...
> >>> weakref.finalize(c, cb)
>
> >>> del c
> __del__ called
> weakref callback called
Ah, interesting! And in the old days this was of course the right way
to do it, because until __del__ has completed it's possible that the
object will get resurrected, and you don't want to clear the weakref
until you're certain that it's dead.
But PEP 442 already broke all that :-). Now weakref callbacks can
happen before __del__, and they can happen on objects that are about
to be resurrected. So if we wanted to pursue this then it seems like
it would make sense to standardize on the following sequence for
object teardown:
0) object becomes collectible (either refcount == 0 or it's part of a
cyclic isolate)
1) weakref callbacks fire
2) weakrefs are cleared (unconditionally, so we keep the rule that any
given weakref fires at most once, even if the object is resurrected)
3) if _PyGC_REFS_MASK_FINALIZED isn't set, __del__ fires, and then
_PyGC_REFS_MASK_FINALIZED is set
4) check for resurrection
5) deallocate the object
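None of this exists anywhere yet, but the proposed ordering is easy to state as a toy simulation (purely illustrative: `FakeRef` and `proposed_teardown` are invented names, and real weakref semantics live in C inside the interpreter):

```python
class FakeRef:
    """Stand-in for a weakref wrapper that can be cleared explicitly."""
    def __init__(self, obj):
        self._obj = obj
    def __call__(self):
        return self._obj
    def clear(self):
        self._obj = None

def proposed_teardown(obj, callbacks, finalized=False):
    events = []
    refs = [FakeRef(obj) for _ in callbacks]
    for cb, ref in zip(callbacks, refs):
        cb(ref)                  # step 1: callbacks see a still-live ref
        events.append("callback")
    for ref in refs:
        ref.clear()              # step 2: clear unconditionally, so each
                                 # weakref still fires at most once
    if not finalized:
        events.append("__del__")  # step 3: __del__ fires (once; the
                                  # finalized flag would then be set)
    return events, refs

seen = []
events, refs = proposed_teardown(
    object(), [lambda r: seen.append(r() is not None)])
print(events, seen, refs[0]() is None)
# → ['callback', '__del__'] [True] True
```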
On further thought, this does still introduce one new edge case, which
is that even if we keep the guarantee that no individual weakref can
fire more than once, it's possible for *new* weakrefs to be registered
after resurrection, so it becomes possible for an object to be
resurrected multiple times. (Currently, resurrection can only happen
once, because __del__ is disabled on resurrected objects and weakrefs
can't resurrect at all.) I'm not actually sure that this is even a
problem, but in any case it's easy to fix by making a rule that you
can't take a weakref to an object whose _PyGC_REFS_MASK_FINALIZED flag
is already set, plus adjust the teardown sequence to be:
0) object becomes collectible (either refcount == 0 or it's part of a
cyclic isolate)
1) if _PyGC_REFS_MASK_FINALIZED is set, then go to step 7. Otherwise:
2) set _PyGC_REFS_MASK_FINALIZED
3) weakref callbacks fire
4) weakrefs are cleared (unconditionally)
5) __del__ fires
6) check for resurrection
7) deallocate the object
There remains one obscure corner case where multiple resurrection is
possible, because the resurrection-prevention flag doesn't exist on
non-GC objects, so you'd still be able to take new weakrefs to those.
But in that case __del__ can already do multiple resurrections, and
some fellow named Nick Coghlan seemed to think that was okay back in
2013 [1], so probably it's not too bad ;-).
[1] https://mail.python.org/pipermail/python-dev/2013-June/126850.html
> This means the main problem with a strong reference being reachable
> from the weakref callback object remains: if the callback itself is
> reachable, then the original object is reachable, and you don't have a
> collectible cycle anymore.
>
> >>> c = C()
> >>> def cb2(obj):
> ... print("weakref callback called with object reference")
> ...
> >>> weakref.finalize(c, cb2, c)
>
> >>> del c
> >>>
>
> Changing that to support resurrecting the object so it can be passed
> into the callback without the callback itself holding a strong
> reference means losing the main "reasoning about software" benefit
> that weakref callbacks offer: they currently can't resurrect the
> object they relate to (since they never receive a strong reference to
> it), so it nominally doesn't matter if the interpreter calls them
> before or after that object has been entirely cleaned up.
I guess I'm missing the importance of this -- does the interpreter
gain some particular benefit from having flexibility about when to
fire weakref callbacks? Obviously it has to pick one in practice.
(The async use case that got me thinking about this is, of course,
exactly one where we would want a weakref callback to resurrect the
object it refers to. Only once, though.)
-n
--
Nathaniel J. Smith -- https://vorpus.org
Re: [Python-Dev] Benchmarking Python and micro-optimizations
Hi,

I removed all old benchmark results and started to run benchmarks manually. The timeline view is interesting for investigating performance regressions: https://speed.python.org/timeline/#/?exe=3&ben=grid&env=1&revs=50&equid=off&quarts=on&extr=on

For example, it seems like call_method became slower between Oct 9 and Oct 20: 35.9 ms => 59.9 ms: https://speed.python.org/timeline/#/?exe=3&ben=call_method&env=1&revs=50&equid=off&quarts=on&extr=on

I don't know the hardware of the benchmark runner well, so maybe it's an issue with the server running benchmarks?

Victor
