[issue41311] Add a function to get a random sample from an iterable (reservoir sampling)

Raymond Hettinger Fri, 17 Jul 2020 13:28:59 -0700


Raymond Hettinger <raymond.hettin...@gmail.com> added the comment:


Other implementations aren't directly comparable, but I thought I would check 
to see what others were doing:

* Scikit-learn uses reservoir sampling, but only when k / n > 0.99.  It also
requires a follow-on step to shuffle the selections.

* NumPy does not use reservoir sampling.

* Julia's randsubseq() does not use reservoir sampling.  The docs guarantee 
that, "Complexity is linear in p*length(A), so this function is efficient even 
if p is small and A is large."
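
For reference, here is a minimal sketch of the technique being compared, 
assuming the classic "Algorithm R" form of reservoir sampling.  The name 
reservoir_sample() and its rng parameter are hypothetical; this is not the 
function proposed in the issue, nor the code used by any of the libraries 
above.  It includes the follow-on shuffle mentioned for scikit-learn, since 
without it the position of an item in the result depends on its position in 
the input:

```python
import random
from itertools import islice


def reservoir_sample(iterable, k, rng=random):
    """Return k items chosen uniformly from an iterable of unknown length.

    Illustrative sketch only (Algorithm R); the name and signature are
    assumptions, not an existing API.
    """
    it = iter(iterable)
    # Fill the reservoir with the first k items.
    reservoir = list(islice(it, k))
    if len(reservoir) < k:
        raise ValueError("sample larger than population")
    # Item number i (0-based) replaces a random slot with probability k / (i + 1).
    for i, item in enumerate(it, start=k):
        j = rng.randrange(i + 1)
        if j < k:
            reservoir[j] = item
    # Follow-on shuffle so the output order is independent of the input order.
    rng.shuffle(reservoir)
    return reservoir
```

A call such as reservoir_sample(iter(range(1000)), 5) makes one pass over 
the input and keeps only k items in memory, but it draws one random number 
per input item beyond the first k, which is the cost the comparisons above 
are concerned with.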

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41311>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
