[issue41220] add optional make_key argument to lru_cache

Raymond Hettinger Fri, 17 Jul 2020 19:11:59 -0700

Raymond Hettinger <raymond.hettin...@gmail.com> added the comment:

I've left this open for a week to collect comments.  I concur with Felipe that 
this should be the responsibility of the caller.  And I concur with Jim that 
the use cases are likely too rare to warrant shoehorning in the extra 
functionality.


For my part, I'm concerned that this would be bug bait.  The result of the 
function would need to be uniquely associated with a particular output, 
independent of the contents of the mutable arguments.  I don't think it is good 
design for function calls to be disregarding the part of the input that was 
originally used to compute the output — in a way that notion conflict with the 
core idea of the of the lru_cache giving the same outputs for the same inputs.

Likewise, debugging and exception handing would be complicated by having two 
underlying user functions.  Exceptions could arise from three sources: the 
make_key() function, the cached function, and the lru cache internals.

Further, if the make_key() function turns out not to be unique, the ensuring 
bugs would be hard to find.  In the OP's later example, if we get two distinct 
inputs which happen to have the same timestamp, the error would pass without 
warning and would be hard to reproduce.  Similarly, the make_key() function has 
to assume that the mutable portion of the data hasn't mutated, but a key 
feature of mutable data is that it can mutate.  That too would result in a hard 
to find bug.

Also, while the OP has a specific use case in mind, we can't know in advance 
how others will use make_key().  It could be misused to convert mutable data to 
immutable data, possibly resulting in a cached function that is slower than the 
original (caching mean() for example).  Or the make_key() function could had 
side-effects.  The OP's original "my_list[0]" example showed yet another way 
that bugs could arise. It is problematic that that particular bug might not 
have been caught by a test suite that didn't know to call the function twice 
with a common first argument but with differing subsequent arguments.

Another possible bug trap is that it would make it easier to accidentally cache 
mutable function results without getting a warning.  For example:
      @lru_cache(128, make_key = lambda target, size, uniq_id, data: uniq_id)
      def search(target, size, uniq_id, data):
           i = data.index(target)
           return data[i : i+size]    # <== bug because this is mutable if data 
is mutable
       
I think the suggestion was a neat idea (and thought provoking), but I think we 
should decline because 1) the use case is uncommon 2) because it makes the API 
harder to understand 3) because it is bug bait and 4) because the 
responsibility should probably be with the caller.

Thank you for the suggestion.  It was inspired.  Don't be discouraged — many of 
my proposals don't get accepted as well.  Please continue to submit suggestions.



P.S. make_key() shouldn't be called a key-function so that we don't make that 
term ambiguous.  Elsewhere key-functions are all about comparison logic rather 
than hashing.  They can be created by functools.cmp_to_key(), they tend to be 
used only once per data element, and the same function tends to be inoperable 
with all the tools that accept key-functions.  So the term make_key() was 
likely the better terminology.  Perhaps get_unique_id() would have been even 
less ambiguous.

----------
resolution:  -> rejected
stage: patch review -> resolved
status: open -> closed

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41220>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue41220] add optional make_key argument to lru_cache

Reply via email to