Michael Spencer wrote:
>>> def resample2(data):
...     bag = {}
...     random.shuffle(data)
...     return [[(item, label)
...                 for item, label in group
...                 if bag.setdefault(label,[]).append(item)
...                 or len(bag[label]) < 3]
...             for group in data if not
... which failed to calculate the minimum count of labels. Try this instead (while I was at it, I removed the insane LC):
>>> import random
>>> def resample3(data):
...     bag = {}
...     sample = []
...     # Count how often each label occurs; the rarest label sets the quota.
...     labels = [label for group in data for item, label in group]
...     min_count = min(labels.count(label) for label in set(labels))
...     random.shuffle(data)
...     for subgroup in data:
...         random.shuffle(subgroup)
...         subgroupsample = []
...         for item, label in subgroup:
...             bag.setdefault(label,[]).append(item)
...             # Keep the item only while its label is within the quota.
...             if len(bag[label]) <= min_count:
...                 subgroupsample.append((item,label))
...         sample.append(subgroupsample)
...     return sample
...
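
For trying the same idea outside the interactive prompt, here is a minimal self-contained sketch. It assumes data is a list of groups, each a list of (item, label) pairs, as in the code above. The name resample_balanced, the rng parameter and the use of collections.Counter are hypothetical choices made for illustration and are not part of the code above. Unlike resample3, this version shuffles copies, so the caller's lists are left untouched.

```python
import random
from collections import Counter


def resample_balanced(data, rng=None):
    """Keep at most min_count items per label, where min_count is the
    number of occurrences of the rarest label (hypothetical helper)."""
    if rng is None:
        rng = random.Random()

    # Count every label in one pass instead of calling list.count per label.
    counts = Counter(label for group in data for _, label in group)
    min_count = min(counts.values()) if counts else 0

    # Shuffle copies so the caller's data is not modified.
    groups = [list(group) for group in data]
    rng.shuffle(groups)

    kept = Counter()  # number of items kept so far for each label
    sample = []
    for group in groups:
        rng.shuffle(group)
        subsample = []
        for item, label in group:
            if kept[label] < min_count:
                kept[label] += 1
                subsample.append((item, label))
        sample.append(subsample)
    return sample


# Toy data: label "x" occurs 3 times, "y" twice, "z" once.
data = [
    [("a1", "x"), ("a2", "x"), ("a3", "y")],
    [("b1", "x"), ("b2", "y"), ("b3", "z")],
]
balanced = resample_balanced(data, rng=random.Random(0))
label_counts = Counter(label for group in balanced for _, label in group)
```

With the toy data, the rarest label "z" occurs once, so every label ends up with exactly one item in the result, whichever way the shuffle falls.
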
Cheers
Michael
-- http://mail.python.org/mailman/listinfo/python-list
