Diez B. Roggisch wrote: >>Both of these techniques are O(n^2). You can reduce it to O(n log n) >>by using sets: >> >>>>>set2 = set(list2) >>>>>[x for x in list1 if x not in set2] >> >>Checking to see if an item is in a set is much more efficient than a >>list. > > Is the set-lookup reliably O(log n)? I was under the impression that it is > hash-based, and this should be O(1) usually, but couldbve O(n) worst-case > (hash the same for _all_ entries).
That's largely a theoretical concern. Google for something like '''dict worst-case performance "tim peters"''' to learn more. (The third article there (no doubt obsolete in some ways, given that it was in 2000) says that Python "keeps at least 1/3 of the internal hash table entries unused, making collisions very rarely a problem... It's possible to contrive keys that will cause collisions systematically ... but unlikely to happen by accident in 2.0") -Peter -- http://mail.python.org/mailman/listinfo/python-list