Paul Czyzewski <tallp...@gmail.com> added the comment:

btw, this was my suggestion. Steven opened the issue on my behalf (I'm new).

1) Documentation change.
 I would suggest that, to this paragraph:
"If neither weights nor cum_weights are specified, selections are made with 
equal probability. If a weights sequence is supplied, it must be the same 
length as the population sequence. It is a TypeError to specify both weights 
and cum_weights."

The following sentence be added:
"A cum_weights sequence, if supplied, must be in strictly-ascending order, else 
incorrect results will be (silently) returned."

[BTW, in the current documentation, when I read this sentence:
:For example, the relative weights [10, 5, 30, 5] are equivalent to the 
cumulative weights [10, 15, 45, 50]," it wasn't clear to me that this was the 
format of the cum_weights *argument*.  I thought that this conversion happened 
internally.  So, I'd prefer that something more explicit be stated (especially 
the part about silently giving bad results).]

2) I'm giving up on suggesting a code change.  However, I'll just 
respond that
  a) I believe that the big win of the cum_weights option is for people who 
already have the sequence in that form, rather than that they will save the 
O(n) cost of having the list built.  
  b) If I have big list, but also an other-than-tiny k value (eg, k=100), then 
the time (with the change) would be 400 time the O(log n) plus one times O(n), 
so this may or may not be significant.
  c) I agree that, if someone did, eg, 400 *separate* calls, each with k=1, the 
cost would be higher.  This seems unlikely to me but...

thanks
  Paul Czyzewski

----------
nosy: +PaulSFO

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33494>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to