Raymond Hettinger <raymond.hettin...@gmail.com> added the comment:

I've put more thought into the proposal and am going to recommend against it.

At its heart, this a CPython optimization to take advantage of list() being 
slower than a handful of islice() calls.  It also gains a speed benefit by 
dropping the antibias logic because random() is faster than _randbelow().  IMO, 
this doesn't warrant an API extension.  I'm not looking forward to years of 
teaching people that there are two separate APIs for sampling without 
replacement and that the second one is almost never what they should use.

A few years ago, GvR rejected adding a pre-sizing argument to dicts even though 
there were some cases where it gave improved performance.  His rationale was 
that it was challenging for a user to know when they were better off and when 
they weren't.  It added a new complication that easily led to suboptimal 
choices. IMO, this new API puts the users in a similar situation.  There are a 
number of cases where a person is worse off, sometimes much worse off.

This new code runs O(n) instead of O(k).  It eats more entropy. It loses the 
the antibias protections.  The API makes it less explicit that the entire input 
iterable is consumed.  It can only be beneficial is the input is not a 
sequence.  When k gets bigger, the repeated calls to islice() become more 
expensive than a single call to list.   And given the math library functions 
involved, I not even sure that this code can guarantee it gives the same 
results across platforms.

Even if the user makes a correct initial decision about which API to use, the 
decision can become invalidated when the population sizes or sample sizes 
change over time.

Lastly, giving users choices between two substantially similar tools typically 
makes them worse off.  It creates a new burden to learn, remember, and 
distinguish the two.  It's really nice that we currently have just one sample() 
and that it behaves well across a broad range of cases — you generally get a 
good result without having to think about it.  Presumably, that was the wisdom 
behind having one-way-to-do-it.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41311>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to