Hello,
I want to build a function which returns the values that appear two or more times in a list:
This is very similar to removing duplicate items from a list, which was the subject of a long recent thread full of suggested approaches.
Here's one way to do what you want:
>>> l = [1, 7, 3, 4, 3, 2, 1]
>>> seen = set()
>>> set(x for x in l if x in seen or seen.add(x))
set([1, 3])
>>>
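This is a 'generator expression' passed as the argument to the set constructor. It relies on the fact that seen.add returns None, which is always false: the condition 'x in seen or seen.add(x)' is true only when x has already been seen, and otherwise it records x as a side effect.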
This is equivalent to:
>>> def _generate_duplicates(iterable):
...     seen = set()
...     for x in iterable:
...         if x in seen: # it's a duplicate
...             yield x
...         else:
...             seen.add(x)
...
>>> generator = _generate_duplicates(l)
>>> generator
<generator object at 0x16C114B8>
>>> set(generator)
set([1, 3])

>>> # In case you want to preserve the order and number of the duplicates, you
>>> # would use a list
>>> generator = _generate_duplicates(l)
>>> list(generator)
[3, 1]
>>>
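If you want each duplicated value reported only once, but still in the order in which the repeats turn up, something along these lines should work. It is only a sketch: the name duplicates_in_order is made up for illustration, and it assumes the items are hashable.

```python
def duplicates_in_order(iterable):
    """Return each value that occurs two or more times, once, in order of first repeat.

    Hypothetical helper for illustration; assumes hashable items.
    """
    seen = set()      # values met at least once
    reported = set()  # values already placed in the result
    result = []
    for x in iterable:
        if x in seen:
            if x not in reported:
                reported.add(x)
                result.append(x)
        else:
            seen.add(x)
    return result


l = [1, 7, 3, 4, 3, 2, 1, 3]
# 3 is repeated before 1 is; the third 3 is not reported again
dups = duplicates_in_order(l)
```

The second set is what stops a value that occurs three or more times from showing up twice in the result.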
So, I decided to write a little example which doesn't work:
#l = [1, 7, 3, 4, 3, 2, 1]
#i = iter(l)
#for x in i:
#    j = iter(i)
#    for y in j:
#        if x == y:
#            print x
I thought that the instruction 'j = iter(i)' creates a new iterator 'j' based on 'i' (some kind of clone). I wrote this little test, which shows that 'j = iter(i)' is the same as 'j = i' (that makes me sad):
I don't think your algorithm would work even if iter(iterator) did return a copy or a separate iterator. If, however, you do have an algorithm that needs that capability, you can use itertools.tee.
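To illustrate both points, here is a small sketch. It shows that calling iter() on an iterator hands back the very same object, and that itertools.tee gives you iterators that advance independently. The variable names are just for illustration.

```python
import itertools

l = [1, 7, 3, 4, 3, 2, 1]

i = iter(l)
j = iter(i)
same_object = j is i      # iter() on an iterator returns that iterator itself

next(i)                   # advancing i ...
shared_next = next(j)     # ... also advances j, so this is the second item

# tee() splits one iterator into independent ones.
a, b = itertools.tee(iter(l))
first_three = [next(a) for _ in range(3)]   # consume only from a
b_start = next(b)                           # b still starts at the first item
```

Bear in mind that tee has to buffer whatever one iterator has consumed and the other has not, so when the source is a plain list it is usually simpler to call iter(l) a second time.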
Cheers
Michael
-- http://mail.python.org/mailman/listinfo/python-list