On Mon, 12 Nov 2007 16:21:36 -0300, Scott SA <[EMAIL PROTECTED]> wrote:

> I decided to test the speeds of the four methods:

(but one should always check for correctness before checking speed)

> def dict_example(urls):
>     d = {}
>     for url in urls:
>         if url in d:
>             d[url] = 1

The above does nothing more than iterate over all urls: d starts out
empty, so the test "url in d" is never true and nothing is ever stored.

> For the simple quest of "Find the unique URLs in a list", the 'dict'
> example is by far the quickest.

After correction, there is no significant difference between
dict_example and set_example (sometimes one is slightly slower than the
other, and sometimes it's the other way around). But the simpler
s = set(urls) suggested by Terry Reedy clearly wins (at least in my
tests).

> Buuut, if you need the indexes and possibly a 'hit count', then the
> enumerated example seems about the best (of the possibilities tested)

In this case there is another alternative using defaultdict:

    from collections import defaultdict

    def defaultdict_example(urls):
        d = defaultdict(list)
        for idx, url in enumerate(urls):
            d[url].append(idx)

which is faster than setdefault_enum (again, in my tests).

Another point worth mentioning: I did shuffle(dummy_urls) before
testing. Having all of them ordered may introduce some bias.

-- 
Gabriel Genellina
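
For anyone who wants to repeat the comparison, here is a minimal sketch of the variants discussed above, checked for correctness before any timing is done. Only the bodies of dict_example (in its broken form) and defaultdict_example appear in this thread, so the corrected dict_example, the loop-based set_example, and setdefault_enum below are my assumptions about what those functions look like, and the contents of dummy_urls are made up for illustration. The functions return their result here so they can be compared; the timings will depend on your machine and Python version.

```python
from collections import defaultdict
from random import shuffle
from timeit import timeit


def dict_example(urls):
    # Assumed correction: store a url the first time it is seen.
    d = {}
    for url in urls:
        if url not in d:
            d[url] = 1
    return d


def set_example(urls):
    # Assumed shape of the loop-based set variant.
    s = set()
    for url in urls:
        s.add(url)
    return s


def setdefault_enum(urls):
    # Assumed shape: map each url to the list of indexes where it occurs.
    d = {}
    for idx, url in enumerate(urls):
        d.setdefault(url, []).append(idx)
    return d


def defaultdict_example(urls):
    # Same mapping as setdefault_enum, built with defaultdict(list).
    d = defaultdict(list)
    for idx, url in enumerate(urls):
        d[url].append(idx)
    return d


# Hypothetical test data: 10000 entries, 1000 distinct urls.
dummy_urls = ["http://example.com/%d" % (i % 1000) for i in range(10000)]
shuffle(dummy_urls)  # avoid any bias from ordered input

# Correctness first, speed second.
unique = set(dummy_urls)
assert set(dict_example(dummy_urls)) == unique
assert set_example(dummy_urls) == unique
assert setdefault_enum(dummy_urls) == defaultdict_example(dummy_urls)

if __name__ == "__main__":
    for func in (dict_example, set_example, set,
                 setdefault_enum, defaultdict_example):
        elapsed = timeit(lambda: func(dummy_urls), number=20)
        print("%-20s %.4f s" % (func.__name__, elapsed))
```

With the index lists in hand, the 'hit count' for a url is just len(d[url]), so no separate counter is needed.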