I'm not coming up with the right keywords to find what I'm hunting. I'd like to randomly sample a modestly compact list with weighted distributions, so I might have

  data = (
    ("apple", 20),
    ("orange", 50),
    ("grape", 30),
    )

and I'd like to random.sample() it as if it were a 100-element list. However, ideally, this could be done in O(size-of-data) storage rather than requiring the build-out of the entire set just for sampling purposes, as the actual data can get a bit large.

For this small toy data-set, I can use

  sample_me = sum(([s]*n for s,n in data), [])
  random.sample(sample_me, k)

but for large counts, the list returned from sum() grinds my system because I start swapping.

What am I missing? (links to relevant keywords/searches/algorithms welcome in lieu of actually answering in-line)

Thanks,

-tkc
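
One possible way to get the behaviour asked for above, sketched under assumptions rather than offered as the definitive answer: sample k distinct positions from the virtual expanded list, then map each position back to its item through running totals of the counts. The helper name weighted_sample and its rng parameter are hypothetical, introduced only for illustration. The sketch assumes Python 3, where range() is lazy, and non-negative integer counts. Storage is O(len(data) + k). (Python 3.9 and later also accept a counts= argument to random.sample() for this purpose.)

```python
import bisect
import itertools
import random


def weighted_sample(data, k, rng=random):
    """Sample k items without replacement, as if each (item, count) pair
    had been expanded into `count` copies of `item`.

    Hypothetical helper for illustration; assumes non-negative int counts.
    """
    items = [item for item, _ in data]
    # Running totals of the counts: for the toy data, [20, 70, 100].
    cumulative = list(itertools.accumulate(count for _, count in data))
    total = cumulative[-1] if cumulative else 0
    # range() is lazy, so picking k distinct positions from it never
    # materialises the full expanded list.
    picks = rng.sample(range(total), k)
    # Position i belongs to the first item whose running total exceeds i.
    return [items[bisect.bisect_right(cumulative, i)] for i in picks]


data = (
    ("apple", 20),
    ("orange", 50),
    ("grape", 30),
)
print(weighted_sample(data, 5))
```

Because the positions are drawn without replacement, asking for all 100 slots returns exactly 20 apples, 50 oranges and 30 grapes in random order, matching what random.sample() would do on the built-out list.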