Carl Meyer <c...@oddbird.net> added the comment:

Thanks for outlining the use cases. They make sense.

The current PR provides a flexible, generic API that fully supports all three of 
those use cases (use cases 2 and 3 are strict subsets of use case 1). Since the 
callback is called before the dict is modified, all the necessary information 
is available to the callback to decide whether the event is interesting to it 
or not.
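
To make that concrete, here is a rough sketch of such a callback; the signature 
and names here are illustrative rather than taken verbatim from the PR. Because 
the callback runs before the store happens, it can still see the old state of 
the dict and decide for itself whether the change matters:

    /* Illustrative sketch only: the exact callback signature in the PR may
     * differ.  The key point is that the callback fires before the dict is
     * modified, so the old value is still visible. */
    static int
    example_watch_callback(int event, PyObject *dict, PyObject *key,
                           PyObject *new_value)
    {
        if (key == NULL) {
            return 0;  /* whole-dict event (e.g. clear); not relevant here */
        }
        /* Borrowed reference; the dict has not been updated yet. */
        PyObject *old_value = PyDict_GetItemWithError(dict, key);
        if (old_value == NULL && PyErr_Occurred()) {
            return -1;
        }
        if (old_value == new_value) {
            return 0;  /* nothing actually changed; not interesting */
        }
        /* ... invalidate caches, bump counters, etc. ... */
        return 0;
    }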

The question is how much of the bookkeeping to classify events as "interesting" 
or "uninteresting" should be embedded in the core dispatch vs being handled by 
the callback.

One reason to prefer keeping this logic in the callback is that with 
potentially multiple chained callbacks in play, the filtering logic must always 
exist in the callback regardless. For example, if callback A wants to watch only 
keys-version changes to dict X, but callback B wants to watch all changes to 
it, events will fire for all changes, and callback A must still disregard 
"uninteresting" events that it receives (just as it may receive events for 
dicts it never asked to watch at all). So providing API for different "levels" 
of watching means that the "is this event interesting to me" predicate must 
effectively be duplicated both in the callback and in the watch level chosen.
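
As a sketch of what that callback-side filtering looks like (the signature and 
the module-level `watched_dict` are assumptions for illustration; only 
PyDict_EVENT_MODIFIED is taken from the PR), callback A simply returns early 
for anything it does not care about:

    /* Hypothetical "callback A": only interested in changes that add a new
     * key to one specific dict. */
    static PyObject *watched_dict;  /* the "dict X" this callback cares about */

    static int
    callback_a(int event, PyObject *dict, PyObject *key, PyObject *new_value)
    {
        if (dict != watched_dict) {
            return 0;  /* fired for a dict some other callback is watching */
        }
        if (event == PyDict_EVENT_MODIFIED && key != NULL) {
            int existing = PyDict_Contains(dict, key);
            if (existing < 0) {
                return -1;
            }
            if (existing) {
                return 0;  /* rebinding an existing key: keys are unchanged */
            }
        }
        /* a new key (or other structural change) is about to happen: react */
        return 0;
    }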

The proposed rationale for this complexity and duplication is the idea that 
filtering out uninteresting events at dispatch will provide better performance. 
But this is hypothetical: it assumes the existence of perf-bottleneck code 
paths that repeatedly rebind globals. The only benchmark workload I know of with 
this characteristic is pystone, which is not even part of the pyperformance 
suite, I think precisely because it is not representative of real-world code 
patterns. And even assuming that we do need to optimize for 
such code, it's also not obvious that it will be noticeably cheaper in practice 
to filter on the dispatch side.

It may be more useful to focus on API. If we get the API right, internal 
implementation details can always be adjusted in the future if a different 
implementation can be shown to be noticeably faster for relevant use cases. And 
if we get the existing API right, we can always add new API if we have to. I don't 
think anything about the proposed simple API precludes adding 
`PyDict_WatchKeys` as an additional feature, if it turns out to be necessary.

One modification to the simple proposed API that should improve the performance 
(and ease of implementation) of use case #2 would be to split the current 
`PyDict_EVENT_MODIFIED` into two separate event types: `PyDict_EVENT_MODIFIED` 
and `PyDict_EVENT_NEW_KEY`. Then the callback-side event filtering for use case 
#2 would just be `event == PyDict_EVENT_NEW_KEY` instead of requiring a lookup 
into the dict to see whether the key was previously set or not.
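
As a rough sketch (again, the signature is illustrative, and 
PyDict_EVENT_NEW_KEY is the event name proposed above, not an existing 
constant), the use case #2 filter then collapses to a single comparison, with 
no probe of the dict's current contents:

    /* Sketch of a use case #2 callback under the proposed event split. */
    static int
    new_key_callback(int event, PyObject *dict, PyObject *key,
                     PyObject *new_value)
    {
        if (event != PyDict_EVENT_NEW_KEY) {
            return 0;  /* rebinding, deletion, etc.: not interesting */
        }
        /* `key` is about to be added to `dict` for the first time */
        return 0;
    }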

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue46896>
_______________________________________