Carl Meyer <c...@oddbird.net> added the comment:

Hi Dennis, thanks for the questions!

> A curiosity: have you considered watching dict keys rather than whole dicts?

There's a bit of discussion of this above. A core requirement is to avoid any 
memory overhead and minimize CPU overhead on unwatched dicts. Additional memory 
overhead seems like a nonstarter, given the sheer number of dict objects that 
can exist in a large Python system. The CPU overhead for unwatched dicts in the 
current PR consists of a single added `testb` and `jne` (for checking if the 
dict is watched), in the write path only; I think that's effectively the 
minimum possible.
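
To make the shape of that check concrete, here is a rough Python model of it. This is only a sketch: the real check is a flag test in CPython's C dict implementation, and `WatchableDict`, `watch`, and the callback signature are hypothetical names invented for illustration. The point is that only writes consult the flag, and reads are left untouched.

```python
# Illustrative model only. WatchableDict and its attributes are hypothetical
# names; the actual PR does this with a flag test inside CPython's C code.

class WatchableDict(dict):
    """A dict that notifies callbacks on writes, but only when watched."""

    def __init__(self, *args, **kwargs):
        # Set the bookkeeping attributes first so they always exist.
        self._watched = False
        self._callbacks = []
        super().__init__(*args, **kwargs)

    def watch(self, callback):
        """Register a callback and mark this dict as watched."""
        self._callbacks.append(callback)
        self._watched = True

    def __setitem__(self, key, value):
        # The only cost for an unwatched dict is this one flag test.
        if self._watched:
            for callback in self._callbacks:
                callback("set", self, key, value)
        super().__setitem__(key, value)

    def __delitem__(self, key):
        if self._watched:
            for callback in self._callbacks:
                callback("del", self, key, None)
        super().__delitem__(key)

    # __getitem__ is deliberately not overridden: reads pay nothing.
```

In this model an unwatched dict skips the callback loop entirely, which mirrors the "one test and one branch, write path only" property described above.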

It's not clear to me how to implement per-key watching under this constraint. 
One option Brandt mentioned above is to steal the low bit of a `PyObject` 
pointer; in theory we could do this on `me_key` to implement per-key watching 
with no memory overhead. But then we are adding bit-masking overhead on every 
dict read and write. I think we really want the implementation here to be 
zero-overhead in the dict read path.
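
As a rough illustration of where the masking cost comes from, here is a small Python model of low-bit pointer tagging. Pointers are modelled as plain integers, and `WATCHED_BIT`, `tag_watched`, `is_watched`, and `untag` are hypothetical names; real code would do this in C on the `me_key` field.

```python
# Illustrative model only: "pointers" are plain integers here, and all names
# are hypothetical. Object pointers are aligned, so their low bit is always
# zero and can be borrowed as a per-key "watched" flag.

WATCHED_BIT = 0x1


def tag_watched(ptr):
    """Return the pointer with the watched flag set in its low bit."""
    assert ptr & WATCHED_BIT == 0, "expected an aligned pointer"
    return ptr | WATCHED_BIT


def is_watched(tagged_ptr):
    """Check the flag; this is the extra test a write would perform."""
    return bool(tagged_ptr & WATCHED_BIT)


def untag(tagged_ptr):
    """Recover the real pointer by masking off the flag bit.

    Every key comparison on a read or a write would need this mask first,
    because a stored key pointer may or may not carry the flag.
    """
    return tagged_ptr & ~WATCHED_BIT
```

The flag costs no memory, but `untag` has to run before any stored key pointer is used, including on lookups of dicts that nobody is watching.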

Open to suggestions if I've missed a good option here!

> That way, changing global values would not have to de-optimize, only adding 
> new global keys would.

> Indexing into dict values array wouldn't be as efficient as embedding direct 
> jump targets in JIT-generated machine code, but as long as we're not doing 
> that, maybe watching the keys is a happy medium?

But we are doing that, in the Cinder JIT. Dict watching here is intentionally 
exposed for use by extensions, hopefully including, in future, the Cinder JIT 
as an installable extension. We burn exact pointer values for module globals 
into generated JIT code and deopt if they change. (We are close to landing a 
change to code-patch instead of deopting.) This is quite a bit more efficient 
in the hot path than having to go through a layer of indirection.
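
A loose Python analogy for the burn-in-and-deopt pattern, not Cinder's actual code: `GlobalLoadSite` and `on_global_changed` are hypothetical names, and the watcher callback is invoked by hand here rather than by the dict.

```python
# Illustrative analogy only. A real JIT embeds the pointer in machine code;
# here a closure stands in for the specialized code, and rebinding
# `self.load` stands in for deoptimizing.

class GlobalLoadSite:
    """One load of a module global, specialized on its current value."""

    def __init__(self, globals_dict, name):
        self._globals = globals_dict
        self._name = name
        value = globals_dict[name]
        # Fast path: the value is baked in, with no dict lookup at all.
        self.load = lambda: value

    def on_global_changed(self, key):
        """Hypothetical watcher callback, fired when the dict is modified."""
        if key == self._name:
            # Deopt: fall back to the generic lookup through the dict.
            self.load = lambda: self._globals[self._name]
```

The fast path stays valid only because every write to the watched dict is reported, which is what lets it skip the indirection safely.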

I don't want to assume too much about how dict watching will be used in future, 
or go for an implementation that limits its future usefulness. The current PR 
is quite flexible and can be used to implement a variety of caching strategies. 
The main downside of dict-level watching is that a lot of notifications will be 
fired if code does a lot of globals-rebinding in modules where globals are 
watched, but this doesn't appear to be a problem in practice, either in our 
workloads or in pyperformance. If this were ever observed to be a problem, a 
workable strategy would likely be to notice at runtime that globals are being 
re-bound frequently in a particular module and just stop watching that 
module's globals.
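
One way that fallback could look, as a minimal sketch: `GlobalsWatchPolicy`, `on_notification`, and the threshold are all assumptions made up for illustration, and a raw count is used as a simple stand-in for "frequently".

```python
# Illustrative sketch only: the class, method, and threshold are hypothetical.
# A real implementation might measure a rate rather than a raw count.

class GlobalsWatchPolicy:
    """Decide when a module's globals are rebound too often to watch."""

    def __init__(self, max_notifications=100):
        self.max_notifications = max_notifications
        self.counts = {}        # module name -> notifications seen so far
        self.unwatched = set()  # modules we have given up on

    def on_notification(self, module_name):
        """Record one notification; return True to keep watching."""
        if module_name in self.unwatched:
            return False
        count = self.counts.get(module_name, 0) + 1
        self.counts[module_name] = count
        if count > self.max_notifications:
            # Too much churn: stop watching this module's globals.
            self.unwatched.add(module_name)
            return False
        return True
```

A caller would unwatch the module's globals dict the first time this returns False and use unspecialized lookups for that module from then on.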

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue46896>
_______________________________________