Paul Czyzewski <tallp...@gmail.com> added the comment: btw, this was my suggestion. Steven opened the issue on my behalf (I'm new).
1) Documentation change. I would suggest that, to this paragraph: "If neither weights nor cum_weights are specified, selections are made with equal probability. If a weights sequence is supplied, it must be the same length as the population sequence. It is a TypeError to specify both weights and cum_weights." The following sentence be added: "A cum_weights sequence, if supplied, must be in strictly-ascending order, else incorrect results will be (silently) returned." [BTW, in the current documentation, when I read this sentence: :For example, the relative weights [10, 5, 30, 5] are equivalent to the cumulative weights [10, 15, 45, 50]," it wasn't clear to me that this was the format of the cum_weights *argument*. I thought that this conversion happened internally. So, I'd prefer that something more explicit be stated (especially the part about silently giving bad results).] 2) I'm giving up on suggesting a code change. However, I'll just respond that a) I believe that the big win of the cum_weights option is for people who already have the sequence in that form, rather than that they will save the O(n) cost of having the list built. b) If I have big list, but also an other-than-tiny k value (eg, k=100), then the time (with the change) would be 400 time the O(log n) plus one times O(n), so this may or may not be significant. c) I agree that, if someone did, eg, 400 *separate* calls, each with k=1, the cost would be higher. This seems unlikely to me but... thanks Paul Czyzewski ---------- nosy: +PaulSFO _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue33494> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com