On Tue, 2006-04-18 at 10:31 -0500, Tim Chase wrote:
> Is there an obvious/pythonic way to remove duplicates from a
> list (resulting order doesn't matter, or can be sorted
> postfacto)?  My first-pass hack was something of the form
>
>   >>> myList = [3,1,4,1,5,9,2,6,5,3,5]
>   >>> uniq = dict([(k, None) for k in myList]).keys()
>
> or alternatively
>
>   >>> uniq = list(set(myList))
>
> However, it seems like there's a fair bit of overhead
> here...creating a dictionary just to extract its keys, or
> creating a set, just to convert it back to a list.  It feels
> like there's something obvious I'm missing here, but I can't
> put my finger on it.
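Here are the three candidates written out as plain code first (a
quick sketch; the timeit runs below exercise these same statements):

    myList = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]

    # 1. Build a dict keyed on the elements, then take its keys.
    uniq = dict((k, None) for k in myList).keys()

    # 2. Round-trip through a set.
    uniq = list(set(myList))

    # 3. Naive baseline: scan the output list for every element.
    uniq = []
    for i in myList:
        if i not in uniq:
            uniq.append(i)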
Your list with 11 elements (7 unique):

$ python2.4 -mtimeit -s 'x = [3,1,4,1,5,9,2,6,5,3,5]' 'y = dict((k,None) for k in x).keys()'
100000 loops, best of 3: 8.01 usec per loop

$ python2.4 -mtimeit -s 'x = [3,1,4,1,5,9,2,6,5,3,5]' $'y = []\nfor i in x:\n    if i not in y:\n        y.append(i)'
100000 loops, best of 3: 5.43 usec per loop

$ python2.4 -mtimeit -s 'x = [3,1,4,1,5,9,2,6,5,3,5]' 'y = list(set(x))'
100000 loops, best of 3: 3.57 usec per loop

A list with 100 000 elements (1 000 unique):

$ python2.4 -mtimeit -s 'x = range(1000) * 100' $'y = []\nfor i in x:\n    if i not in y:\n        y.append(i)'
10 loops, best of 3: 2.12 sec per loop

$ python2.4 -mtimeit -s 'x = range(1000) * 100' 'y = dict((k,None) for k in x).keys()'
10 loops, best of 3: 32.2 msec per loop

$ python2.4 -mtimeit -s 'x = range(1000) * 100' 'y = list(set(x))'
100 loops, best of 3: 6.09 msec per loop

"list(set(x))" is the clear winner: its cost grows roughly linearly
with the size of the input, while the naive append loop is quadratic
(note the jump from 5.43 usec to 2.12 sec above).  *However*, can't
you always use "set" or "frozenset" instead of converting back and
forth?

HTH,

--
Felipe.
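P.S.: A minimal sketch (the names here are mine, just for
illustration) of skipping the list conversion entirely and working
with the set itself:

    myList = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
    uniq = set(myList)

    # Membership tests and iteration work on the set directly:
    print 5 in uniq              # -> True
    for item in sorted(uniq):    # sort on demand if you need order
        print item

    # frozenset is immutable and hashable, so it can even serve
    # as a dictionary key:
    cache = {frozenset(myList): 'already processed'}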