Re: [Python-Dev] [poll] New name for __builtins__
Guido van Rossum wrote:
> On Dec 2, 2007 7:40 AM, Nick Coghlan <[EMAIL PROTECTED]> wrote:
>> Just for the record, I also like the idea of __builtins__ being a magic
>> alias for the boringly-but-practically named builtins module.
>
> [Imagine me jumping up and down and screaming at the top of my lungs
> out of frustration:]
>
> BUT THAT'S NOT WHAT IT IS! IT'S A HOOK FOR SANDBOXING! YOU SHOULD
> NEVER BE USING __builtins__ DIRECTLY EXCEPT WHEN CONTROLLING THE SET
> OF BUILTINS AVAILABLE TO UNTRUSTED CODE!

I never mess with the builtin definitions under either name, but I agree that my description was highly inaccurate. It's not a topic I've spent much time considering :)

Cheers,
Nick.

--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
---
http://www.boredomandlaziness.org
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
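For readers unfamiliar with the hook Guido is describing, here is a minimal sketch of how the `__builtins__` key in a globals dict controls which builtins executed code can see. The variable names are made up for illustration, the example assumes CPython 3 semantics, and restricting builtins this way is not a complete security boundary by itself.

```python
# Sketch only: restricting builtins limits name lookup, nothing more.
# "restricted_globals" and "blocked" are illustrative names.
restricted_globals = {"__builtins__": {"len": len}}

# Code run with these globals can call len() ...
exec("n = len('abc')", restricted_globals)

# ... but any other builtin, such as open(), is simply not found.
try:
    exec("open('secret.txt')", restricted_globals)
except NameError as exc:
    blocked = str(exc)
else:
    blocked = None
```

The lookup of `open` fails with a NameError because the executed code consults the supplied `__builtins__` mapping rather than the real builtins module.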
Re: [Python-Dev] blocking a non-blocking socket
Thanks, Audun. If you look at the code, you'll see that both a connect method and a do_handshake method already exist, and work pretty much as you describe. The issue is what to do when the user doesn't use them -- specifies do_handshake_on_connect=True.

> Another way of doing it could be to expose a connect() method on the ssl
> objects. It changes the socket.ssl api, but I'd say it is in the same
> spirit as the do_handshake_on_connect parameter since no existing code
> will break. The caller then calls connect() until it does not return

Bill
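As a rough illustration of the explicit path mentioned above (the caller drives do_handshake itself instead of relying on do_handshake_on_connect=True), here is a sketch of a retry loop for a non-blocking socket. It assumes the exception names from the present-day ssl module (`ssl.SSLWantReadError`, `ssl.SSLWantWriteError`), which may differ from the API under discussion in this thread; `finish_handshake` and its `wait` parameter are hypothetical helpers, not part of the ssl module.

```python
import select
import ssl


def finish_handshake(sock, wait=select.select, timeout=None):
    """Retry sock.do_handshake() until it completes.

    `sock` is assumed to be a non-blocking SSL socket created with
    do_handshake_on_connect=False. `wait` defaults to select.select and
    is a parameter only so the loop can be exercised without a network.
    Returns the number of do_handshake() attempts that were needed.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            sock.do_handshake()
            return attempts
        except ssl.SSLWantReadError:
            # The handshake needs more data from the peer.
            wait([sock], [], [], timeout)
        except ssl.SSLWantWriteError:
            # The handshake needs to send data but the socket is not ready.
            wait([], [sock], [], timeout)
```

The open question in the thread is what should happen when the caller does not write such a loop and asks for the handshake to be done implicitly on a non-blocking socket.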
Re: [Python-Dev] Non-string keys in namespace dicts
On Dec 2, 2007 12:49 PM, Neil Toronto <[EMAIL PROTECTED]> wrote:
> It turned out not *that* hard to code around for attribute caching, and
> the extra cruft only gets invoked on a cache miss. The biggest problem
> isn't speed - it's that it's possible (though extremely unlikely), while
> testing keys for equality, that a rich compare alters the underlying
> dict. This causes the caching lookup to have to try to get an entry
> pointer again, which could invoke the rich compare, which might alter
> the underlying dict..

How about subclasses of str? These have all the same issues...

> I'm working on making it as fast as the original when the MRO is short.
> Question for Guido: should I roll this into the fastglobals patch?

No, please keep them separate.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] Non-string keys in namespace dicts
On Dec 2, 2007 6:28 PM, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> I don't see a problem with requiring dictionary key comparisons to be
> side-effect-free - even in the general case of dictionaries, not just
> namespace ones.

Me neither -- but the problem is enforcement.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] Non-string keys in namespace dicts
Guido van Rossum wrote:
> On Dec 2, 2007 12:49 PM, Neil Toronto <[EMAIL PROTECTED]> wrote:
>> It turned out not *that* hard to code around for attribute caching, and
>> the extra cruft only gets invoked on a cache miss. The biggest problem
>> isn't speed - it's that it's possible (though extremely unlikely), while
>> testing keys for equality, that a rich compare alters the underlying
>> dict. This causes the caching lookup to have to try to get an entry
>> pointer again, which could invoke the rich compare, which might alter
>> the underlying dict..
>
> How about subclasses of str? These have all the same issues...

Yeah. I ended up having it, per class, permanently revert to uncached lookups when it detects that a class dict in the MRO has non-string keys. That's flagged by lookdict_string, which uses PyString_CheckExact.

Neil
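To make the str-subclass concern concrete, here is a small Python-level sketch. It shows that a key which is a subclass of str can run arbitrary code during a dict lookup, and that an exact-type check (the Python analogue of what PyString_CheckExact does in C) rejects such keys. `NoisyStr`, `compare_calls` and `has_only_exact_str_keys` are names invented for this example, not anything from the patch.

```python
compare_calls = []  # records every equality test triggered by a lookup


class NoisyStr(str):
    """A str subclass whose __eq__ has a visible side effect."""

    def __hash__(self):
        # Defining __eq__ would otherwise disable hashing.
        return str.__hash__(self)

    def __eq__(self, other):
        compare_calls.append(other)  # stand-in for "alters the dict"
        return str.__eq__(self, other)


def has_only_exact_str_keys(namespace):
    """True only if every key is exactly str, not a subclass."""
    return all(type(key) is str for key in namespace)


plain = {"x": 1}
tricky = {NoisyStr("x"): 1}

# Same hash, different object: the lookup has to call NoisyStr.__eq__.
value = tricky["x"]
```

A cache that falls back to uncached lookups whenever `has_only_exact_str_keys` is false never has to reason about what such an `__eq__` might do.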
Re: [Python-Dev] Non-string keys in namespace dicts
At 12:27 PM 12/3/2007 -0700, Neil Toronto wrote:
>Guido van Rossum wrote:
> > On Dec 2, 2007 12:49 PM, Neil Toronto <[EMAIL PROTECTED]> wrote:
> >> It turned out not *that* hard to code around for attribute caching, and
> >> the extra cruft only gets invoked on a cache miss. The biggest problem
> >> isn't speed - it's that it's possible (though extremely unlikely), while
> >> testing keys for equality, that a rich compare alters the underlying
> >> dict. This causes the caching lookup to have to try to get an entry
> >> pointer again, which could invoke the rich compare, which might alter
> >> the underlying dict..
> >
> > How about subclasses of str? These have all the same issues...
>
>Yeah. I ended up having it, per class, permanently revert to uncached
>lookups when it detects that a class dict in the MRO has non-string
>keys. That's flagged by lookdict_string, which uses PyString_CheckExact.

I'm a bit confused here. Isn't the simplest way to cache attribute lookups to just have a cache dictionary in the type, and update that dictionary whenever a change is made to a superclass? That's essentially how __slotted__ attribute changes on base classes work now, isn't it? Why do we need to mess around with the dictionary entries themselves in order to do that?
Re: [Python-Dev] Non-string keys in namespace dicts
Phillip J. Eby wrote:
> At 12:27 PM 12/3/2007 -0700, Neil Toronto wrote:
>> Guido van Rossum wrote:
>> > How about subclasses of str? These have all the same issues...
>>
>> Yeah. I ended up having it, per class, permanently revert to uncached
>> lookups when it detects that a class dict in the MRO has non-string
>> keys. That's flagged by lookdict_string, which uses PyString_CheckExact.
>
> I'm a bit confused here. Isn't the simplest way to cache attribute
> lookups to just have a cache dictionary in the type, and update that
> dictionary whenever a change is made to a superclass? That's
> essentially how __slotted__ attribute changes on base classes work now,
> isn't it? Why do we need to mess around with the dictionary entries
> themselves in order to do that?

The nice thing about caching pointers to dict entries is that they don't change as often as values do. There are fewer ways to invalidate an entry pointer: inserting set, resize, clear, and delete. If you cache values, non-inserting set could invalidate as well.

Because inserting into namespace dicts should be very rare, caching entries rather than values should reduce the number of times cache entries are invalidated to near zero. Updating is expensive, so that's good for performance.

Rare updating also means it's okay to invalidate the entire cache rather than single entries, so the footprint of the caching mechanism in the dict can be very small. For example, I've got a single 64-bit counter in each dict that gets incremented after every potentially invalidating operation. That comes down to 8 bytes of storage and two extra machine instructions (currently) per invalidating operation. The cache checks it against its own counter, and updating ensures that it's synced.

Some version of the non-string keys problem would exist with any caching mechanism, though. An evil rich compare can always monkey about with class dicts in the MRO. If a caching scheme caches values and doesn't account for that, it could return stale values. If it caches entries and doesn't account for that, it could segfault. I suppose you could argue that returning stale values is fitting punishment for using an evil rich compare, though the punishee isn't always the same person as the punisher.

Neil
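The counter scheme described above can be modelled in pure Python. This is only a sketch under stated assumptions: real entry pointers live in C, so each entry is modelled here as a one-element list (a "cell") whose identity survives a non-inserting set. `VersionedNamespace` and `EntryCache` are hypothetical names, and resize is not modelled because Python-level cells never move.

```python
class VersionedNamespace:
    """Toy namespace: the counter is bumped only by invalidating operations."""

    def __init__(self):
        self._cells = {}   # key -> [value]; the list stands in for an entry
        self.version = 0

    def set(self, key, value):
        cell = self._cells.get(key)
        if cell is None:
            self._cells[key] = [value]
            self.version += 1      # inserting set: entries may be invalid
        else:
            cell[0] = value        # non-inserting set: entry stays valid

    def delete(self, key):
        del self._cells[key]
        self.version += 1

    def clear(self):
        self._cells.clear()
        self.version += 1

    def entry(self, key):
        return self._cells.get(key)


class EntryCache:
    """Caches entries and drops them all when the counters disagree."""

    def __init__(self, namespace):
        self.namespace = namespace
        self.seen_version = namespace.version
        self.entries = {}
        self.misses = 0

    def lookup(self, key):
        if self.seen_version != self.namespace.version:
            self.entries.clear()                  # whole-cache invalidation
            self.seen_version = self.namespace.version
        cell = self.entries.get(key)
        if cell is None:
            self.misses += 1
            cell = self.namespace.entry(key)
            if cell is None:
                raise KeyError(key)
            self.entries[key] = cell
        return cell[0]


ns = VersionedNamespace()
ns.set("x", 1)
cache = EntryCache(ns)
first = cache.lookup("x")         # miss, entry is now cached
ns.set("x", 2)                    # non-inserting set, counter unchanged
second = cache.lookup("x")        # hit, sees the new value through the entry
ns.set("y", 3)                    # inserting set, counter bumped
third = cache.lookup("x")         # counters differ, so this is a miss again
```

Rebinding an existing name costs the cache nothing, which is the property that makes caching entries attractive compared with caching values.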
Re: [Python-Dev] Non-string keys in namespace dicts
On Dec 3, 2007 3:48 PM, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> Actually, you're missing the part where such evil code *can't* muck
> things up for class dictionaries. Type dicts aren't reachable via
> ordinary Python code; you *have* to modify them via setattr. (The
> __dict__ of types returns a read-only proxy object, so the most evil
> rich compare you can imagine still can't touch it.)

What's to prevent that evil comparison to call setattr on the class?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
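Both halves of this exchange can be seen in a few lines of Python. The sketch below assumes CPython 3 behaviour: a class's `__dict__` is a read-only proxy, yet a comparison method can still change the class through setattr. `Target`, `EvilKey` and the other names are invented for the example.

```python
import types


class Target:
    attr = "original"


class EvilKey:
    """All instances collide, so a dict lookup must call __eq__."""

    def __hash__(self):
        return 1

    def __eq__(self, other):
        # The side effect Guido is asking about.
        setattr(Target, "attr", "changed by __eq__")
        return isinstance(other, EvilKey)


# Phillip's point: writing through the proxy is refused.
try:
    Target.__dict__["attr"] = "direct write"
except TypeError:
    proxy_is_read_only = True
else:
    proxy_is_read_only = False

# Guido's point: the comparison can still go through setattr.
table = {EvilKey(): "first"}
found = table.get(EvilKey())
```

So the proxy rules out direct writes, but any invalidation scheme still has to cope with setattr being called in the middle of a comparison.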
Re: [Python-Dev] Non-string keys in namespace dicts
At 03:26 PM 12/3/2007 -0700, Neil Toronto wrote:
>Phillip J. Eby wrote:
> > At 12:27 PM 12/3/2007 -0700, Neil Toronto wrote:
> >> Guido van Rossum wrote:
> >> > How about subclasses of str? These have all the same issues...
> >>
> >> Yeah. I ended up having it, per class, permanently revert to uncached
> >> lookups when it detects that a class dict in the MRO has non-string
> >> keys. That's flagged by lookdict_string, which uses PyString_CheckExact.
> >
> > I'm a bit confused here. Isn't the simplest way to cache attribute
> > lookups to just have a cache dictionary in the type, and update that
> > dictionary whenever a change is made to a superclass? That's
> > essentially how __slotted__ attribute changes on base classes work now,
> > isn't it? Why do we need to mess around with the dictionary entries
> > themselves in order to do that?
>
>The nice thing about caching pointers to dict entries is that they don't
>change as often as values do. There are fewer ways to invalidate an
>entry pointer: inserting set, resize, clear, and delete. If you cache
>values, non-inserting set could invalidate as well.
>
>Because inserting into namespace dicts should be very rare, caching
>entries rather than values should reduce the number of times cache
>entries are invalidated to near zero. Updating is expensive, so that's
>good for performance.
>
>Rare updating also means it's okay to invalidate the entire cache rather
>than single entries, so the footprint of the caching mechanism in the
>dict can be very small. For example, I've got a single 64-bit counter in
>each dict that gets incremented after every potentially invalidating
>operation. That comes down to 8 bytes of storage and two extra machine
>instructions (currently) per invalidating operation. The cache checks it
>against its own counter, and updating ensures that it's synced.
>
>Some version of the non-string keys problem would exist with any caching
>mechanism, though. An evil rich compare can always monkey about with
>class dicts in the MRO. If a caching scheme caches values and doesn't
>account for that, it could return stale values. If it caches entries and
>doesn't account for that, it could segfault. I suppose you could argue
>that returning stale values is fitting punishment for using an evil rich
>compare, though the punishee isn't always the same person as the punisher.

Actually, you're missing the part where such evil code *can't* muck things up for class dictionaries. Type dicts aren't reachable via ordinary Python code; you *have* to modify them via setattr. (The __dict__ of types returns a read-only proxy object, so the most evil rich compare you can imagine still can't touch it.)

This means that MRO cache invalidation can already be detected using "type"'s tp_setattro implementation. And setting attributes on types is already extremely rare. It doesn't seem to me that there's any need to use the same namespace speedup mechanism here: capturing setattr operations on a type should be sufficient to implement invalidation, without mucking about with dictionary entries. An ordinary dict should suffice.

Of course, I suppose there are use cases where somebody uses a class attribute as a "global" of sorts, and those use cases would be slowed down. However, if you want to use the entry caching approach, you wouldn't need to worry about the segfault case. (Since somebody would have to use C to get at the "real" dictionary.)
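The setattr-based alternative can be approximated at the Python level with a metaclass. This is a sketch only: the real proposal would hook type's tp_setattro in C, whereas `CachedLookupMeta`, its shared `_cache` dict and `cached_getattr` are hypothetical names, and the invalidation here is deliberately coarse (the whole cache is flushed on any class attribute write).

```python
_MISSING = object()


class CachedLookupMeta(type):
    """Value cache kept in an ordinary dict and flushed on class setattr."""

    _cache = {}   # (class, name) -> value; shared by all classes for brevity

    def __setattr__(cls, name, value):
        super().__setattr__(name, value)
        CachedLookupMeta._cache.clear()      # coarse invalidation

    def cached_getattr(cls, name):
        key = (cls, name)
        value = CachedLookupMeta._cache.get(key, _MISSING)
        if value is not _MISSING:
            return value
        # Cache miss: walk the MRO the slow way and remember the result.
        for klass in cls.__mro__:
            if name in vars(klass):
                value = vars(klass)[name]
                CachedLookupMeta._cache[key] = value
                return value
        raise AttributeError(name)


class Base(metaclass=CachedLookupMeta):
    x = 1


class Child(Base):
    pass


before = Child.cached_getattr("x")   # walks the MRO, then caches 1
Base.x = 2                           # setattr on the base flushes the cache
after = Child.cached_getattr("x")    # recomputed, so no stale value
```

Because every write goes through `__setattr__`, the cache never needs to inspect dictionary entries; the cost is that a class attribute used as a frequently rebound "global" flushes the cache on every assignment.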
Re: [Python-Dev] Non-string keys in namespace dicts
Phillip J. Eby wrote:
> At 03:26 PM 12/3/2007 -0700, Neil Toronto wrote:
>> Phillip J. Eby wrote:
>> > At 12:27 PM 12/3/2007 -0700, Neil Toronto wrote:
>> Some version of the non-string keys problem would exist with any caching
>> mechanism, though. An evil rich compare can always monkey about with
>> class dicts in the MRO. If a caching scheme caches values and doesn't
>> account for that, it could return stale values. If it caches entries and
>> doesn't account for that, it could segfault. I suppose you could argue
>> that returning stale values is fitting punishment for using an evil rich
>> compare, though the punishee isn't always the same person as the
>> punisher.
>
> Actually, you're missing the part where such evil code *can't* muck
> things up for class dictionaries. Type dicts aren't reachable via
> ordinary Python code; you *have* to modify them via setattr. (The
> __dict__ of types returns a read-only proxy object, so the most evil
> rich compare you can imagine still can't touch it.)

Interesting. But I'm going to have to say it probably wouldn't work as well, since C code can and does alter tp_dict directly. Those places in the core would have to be altered to invalidate the cache. There's also the issue of extensions, which so far have been able to alter any tp_dict without problems. It'd also be really annoying for a class to have to notify all of its subclasses when one of its attributes changed.

In other words, I can see the footprint being rather large and difficult to manage. By hooking right into dicts and letting them track when things change, every other piece of code in the system can happily continue doing whatever it likes without needing to worry that it might invalidate some cache entry somewhere. I'm confident that's the right design choice whether it's best to cache entries or not.

I hope you don't feel that I'm just trying to be contradictory. I'm actually enjoying the discussion a lot. I'd rather have my grand ideas tested now than discover I was wrong later.

Neil
Re: [Python-Dev] Non-string keys in namespace dicts
At 03:51 PM 12/3/2007 -0800, Guido van Rossum wrote:
>On Dec 3, 2007 3:48 PM, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> > Actually, you're missing the part where such evil code *can't* muck
> > things up for class dictionaries. Type dicts aren't reachable via
> > ordinary Python code; you *have* to modify them via setattr. (The
> > __dict__ of types returns a read-only proxy object, so the most evil
> > rich compare you can imagine still can't touch it.)
>
>What's to prevent that evil comparison to call setattr on the class?

If you're caching values, it should be sufficient to have setattr trigger the invalidation. For entries, I have to admit I don't understand the approach well enough to make a specific proposal.
Re: [Python-Dev] Non-string keys in namespace dicts
Phillip J. Eby wrote:
> At 03:51 PM 12/3/2007 -0800, Guido van Rossum wrote:
>> On Dec 3, 2007 3:48 PM, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
>> > Actually, you're missing the part where such evil code *can't* muck
>> > things up for class dictionaries. Type dicts aren't reachable via
>> > ordinary Python code; you *have* to modify them via setattr. (The
>> > __dict__ of types returns a read-only proxy object, so the most evil
>> > rich compare you can imagine still can't touch it.)
>>
>> What's to prevent that evil comparison to call setattr on the class?
>
> If you're caching values, it should be sufficient to have setattr
> trigger the invalidation. For entries, I have to admit I don't
> understand the approach well enough to make a specific proposal.

As long as you could determine whether PyDict_SetItem inserted a new key, it would make sense. (If it only updates a value, the cache doesn't need to change because the pointer to the entry is still valid and the entry points to the new value.) The PyDict_SetItem API would have to change, or the dict would have to somehow pass the information out-of-band. Neither option sounds great to me, so I'd go with caching values from setattr.

Neil
Re: [Python-Dev] Non-string keys in namespace dicts
I apologize - I had forgotten what you were telling me by the time I replied. Here's a better answer.

> Phillip J. Eby wrote:
>> At 03:26 PM 12/3/2007 -0700, Neil Toronto wrote:
>> Actually, you're missing the part where such evil code *can't* muck
>> things up for class dictionaries. Type dicts aren't reachable via
>> ordinary Python code; you *have* to modify them via setattr. (The
>> __dict__ of types returns a read-only proxy object, so the most evil
>> rich compare you can imagine still can't touch it.)

C code can and does alter tp_dict directly already. If caching were implemented within type's setattr, all these places would have to be changed to use setattr only. That doesn't seem so bad at first. It's a change in convention, certainly: a new informal rule that says "no monkeying with a PyTypeObject's tp_dict, period". Lack of observance could be difficult to debug, as a PyDict_SetItem would appear to have worked just fine to C code but not show up to Python code.

Neil
Re: [Python-Dev] Non-string keys in namespace dicts
At 10:17 PM 12/3/2007 -0700, Neil Toronto wrote:
>Phillip J. Eby wrote:
> > Actually, you're missing the part where such evil code *can't* muck
> > things up for class dictionaries. Type dicts aren't reachable via
> > ordinary Python code; you *have* to modify them via setattr. (The
> > __dict__ of types returns a read-only proxy object, so the most evil
> > rich compare you can imagine still can't touch it.)
>
>Interesting. But I'm going to have to say it probably wouldn't work as
>well, since C code can and does alter tp_dict directly. Those places in
>the core would have to be altered to invalidate the cache.

Eh? Where is the type dictionary altered outside of setattr and class creation?

> There's also
>the issue of extensions, which so far have been able to alter any
>tp_dict without problems.

Do you have any actual examples? Believe me, I'm the last person to suggest removing useful hack, er, hooks. :) But I don't think that type __dict__ munging is actually common at all.

>It'd also be really annoying for a class to
>have to notify all of its subclasses when one of its attributes changed.

It's not all subclasses - only those subclasses that don't shadow the attribute. Also, it's not necessarily the case that notification would be O(subclasses) - it could be done via a version counter, as in your approach. Admittedly, that would require an extra bit of indirection, since you'd need to keep (and check) counters for each descriptor.
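The "only subclasses that don't shadow the attribute" observation can be sketched with `type.__subclasses__()`. The helper name `affected_subclasses` is hypothetical, and the sketch assumes single inheritance: with multiple inheritance a class could be reached twice, and whether a shadowing class's descendants are affected would depend on their full MRO.

```python
def affected_subclasses(cls, name):
    """Subclasses whose lookup of `name` would still resolve to `cls`.

    A subclass that defines `name` itself shadows the attribute, so
    neither it nor (under single inheritance) its descendants need to
    be told about a change on `cls`.
    """
    affected = []
    for sub in cls.__subclasses__():
        if name in vars(sub):
            continue                      # shadowed: stop descending here
        affected.append(sub)
        affected.extend(affected_subclasses(sub, name))
    return affected


class A:
    x = 1


class B(A):
    pass


class C(A):
    x = 2        # shadows A.x


class D(B):
    pass


class E(C):
    pass


to_notify = affected_subclasses(A, "x")
```

Here a change to `A.x` matters to `B` and `D` but not to `C` or `E`. The walk itself is still proportional to the number of subclasses visited, which is why a version counter is suggested as a way to avoid doing it eagerly.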
Re: [Python-Dev] Non-string keys in namespace dicts
Phillip J. Eby wrote:
> At 10:17 PM 12/3/2007 -0700, Neil Toronto wrote:
>> Interesting. But I'm going to have to say it probably wouldn't work as
>> well, since C code can and does alter tp_dict directly. Those places in
>> the core would have to be altered to invalidate the cache.
>
> Eh? Where is the type dictionary altered outside of setattr and class
> creation?

You're right - my initial grep turned up stuff that looked like tp_dict monkeying out of context. The ctypes module does it a lot, but only in its various *_new functions.

>> It'd also be really annoying for a class to
>> have to notify all of its subclasses when one of its attributes changed.
>
> It's not all subclasses - only those subclasses that don't shadow the
> attribute. Also, it's not necessarily the case that notification would
> be O(subclasses) - it could be done via a version counter, as in your
> approach. Admittedly, that would require an extra bit of indirection,
> since you'd need to keep (and check) counters for each descriptor.

And the extra overhead comes back to bite us again, and probably in a critical path. (I'm sure you've been bitten in a critical path before.) That's been the issue with all of these caching schemes so far - Python is just too durned dynamic to guarantee them anything they can exploit for efficiency, so they end up slowing down common operations. (Not that I'd change a bit of Python, mind you.)

For example, almost everything I've tried slows down attribute lookups on built-in types. Adding one 64-bit version counter check and a branch on failure incurs a 3-5% penalty. That's not the end of the world, but it makes pybench take about 0.65% longer.

I finally overcame that by making a custom dictionary type to use as the cache. I haven't yet tested something my caching lookups are slower at - they're all faster so far for builtins and Python objects with any size MRO - but I haven't tested exhaustively and I haven't done failing hasattr-style lookups. Turns out that not finding an attribute all the way up the MRO (which can lead to a persistent cache miss if done with the same name) is rather frequent in Python and is expected to be fast.

I can cache missing attributes as easily as present attributes, but they could pile up if someone decides to hasattr an object with a zillion different names. I have a cunning plan, though, which is probably best explained using a patch.

At any rate, I'm warming to this setattr idea, and I'll likely try that next whether my current approach works out or not.

Neil
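The pile-up concern with cached misses can be illustrated with a size-bounded negative cache. To be clear, this is not Neil's "cunning plan", which the message leaves unexplained; it is just one generic way to keep such a cache from growing without bound. `BoundedMissCache`, `cached_hasattr` and `Probe` are hypothetical names, and the sketch ignores `__getattr__` hooks and instance dictionaries.

```python
from collections import OrderedDict


class BoundedMissCache:
    """Remembers recent failed (class, name) lookups, oldest evicted first."""

    def __init__(self, max_size=128):
        self.max_size = max_size
        self._missing = OrderedDict()

    def note_missing(self, cls, name):
        key = (cls, name)
        self._missing[key] = True
        self._missing.move_to_end(key)
        while len(self._missing) > self.max_size:
            self._missing.popitem(last=False)    # drop the oldest entry

    def known_missing(self, cls, name):
        return (cls, name) in self._missing

    def invalidate(self):
        # Would be called whenever any class in the hierarchy changes.
        self._missing.clear()

    def __len__(self):
        return len(self._missing)


def cached_hasattr(cache, cls, name):
    """Class-level hasattr that consults the miss cache first."""
    if cache.known_missing(cls, name):
        return False
    if any(name in vars(klass) for klass in cls.__mro__):
        return True
    cache.note_missing(cls, name)
    return False


class Probe:
    present = 1


miss_cache = BoundedMissCache(max_size=3)
results = [cached_hasattr(miss_cache, Probe, f"name_{i}") for i in range(1000)]
found_present = cached_hasattr(miss_cache, Probe, "present")
```

Repeated failing lookups of the same name become a single dictionary probe, while a caller probing a thousand distinct names leaves only `max_size` entries behind.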
