Skip Montanaro <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...
>     >> > Is there a reason to use sets here? I think lists will do as well.
>     >>
>     >> Sets are implemented using dictionaries, so the "if w in KEYWORDS"
>     >> part would be O(1) instead of O(n) as with lists...
>     >>
>     >> (I.e. searching a list is a brute-force operation, whereas
>     >> sets are not.)
>
>     Jp> And yet... using sets here is slower in every possible case:
>     ...
>     Jp> This is a pretty clear example of premature optimization.
>
> I think the set concept is correct.  The keywords of interest are best
> thought of as an unordered collection.  Lists imply some ordering (or at
> least that potential).  Premature optimization would have been realizing
> that scanning a short list of strings was faster than testing for set
> membership and choosing to use lists instead of sets.
>
> Skip

Jp scores extra points for pre-maturity by not trying out version 2.4,
by not reading the bit about sets now being built-in, based on dicts,
dicts being one of the timbot's optimise-the-snot-out-of targets ...

herewith some results from a box with a 1.4GHz Athlon chip running
Windows 2000:

C:\junk>\python24\python \python24\lib\timeit.py -s "from sets import Set; x = Set(['and', 'or', 'not'])" "None in x"
1000000 loops, best of 3: 1.81 usec per loop

C:\junk>\python24\python \python24\lib\timeit.py -s "from sets import Set; x = Set(['and', 'or', 'not'])" "None in x"
1000000 loops, best of 3: 1.77 usec per loop

C:\junk>\python24\python \python24\lib\timeit.py -s "x = set(['and', 'or', 'not'])" "None in x"
1000000 loops, best of 3: 0.29 usec per loop

C:\junk>\python24\python \python24\lib\timeit.py -s "x = set(['and', 'or', 'not'])" "None in x"
1000000 loops, best of 3: 0.289 usec per loop

C:\junk>\python24\python \python24\lib\timeit.py -s "x = ['and', 'or', 'not']" "None in x"
1000000 loops, best of 3: 0.804 usec per loop

C:\junk>\python24\python \python24\lib\timeit.py -s "x = ['and', 'or', 'not']" "None in x"
1000000 loops, best of 3: 0.81 usec per loop

C:\junk>\python24\python \python24\lib\timeit.py -s "from sets import Set; x = Set(['and', 'or', 'not'])" "'and' in x"
1000000 loops, best of 3: 1.69 usec per loop

C:\junk>\python24\python \python24\lib\timeit.py -s "x = set(['and', 'or', 'not'])" "'and' in x"
1000000 loops, best of 3: 0.243 usec per loop

C:\junk>\python24\python \python24\lib\timeit.py -s "x = set(['and', 'or', 'not'])" "'and' in x"
1000000 loops, best of 3: 0.245 usec per loop

C:\junk>\python24\python \python24\lib\timeit.py -s "x = ['and', 'or', 'not']" "'and' in x"
1000000 loops, best of 3: 0.22 usec per loop

C:\junk>\python24\python \python24\lib\timeit.py -s "x = ['and', 'or', 'not']" "'and' in x"
1000000 loops, best of 3: 0.22 usec per loop

C:\junk>\python24\python \python24\lib\timeit.py -s "x = set(['and', 'or', 'not'])" "'not' in x"
1000000 loops, best of 3: 0.257 usec per loop

C:\junk>\python24\python \python24\lib\timeit.py -s "x = ['and', 'or', 'not']" "'not' in x"
1000000 loops, best of 3: 0.34 usec per loop

tee hee ...
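
For anyone who wants to repeat the comparison without typing each command by hand, here is a rough sketch that drives the same timeit measurements from a script. It covers only the built-in set and the plain list, since the sets.Set class was removed in Python 3. The helper names time_membership and run_benchmarks are made up for this post, and the numbers you get will depend on your machine and Python version, so don't expect them to match the figures above.

```python
import timeit

# The three keywords used in the timings above.
KEYWORDS = ['and', 'or', 'not']


def time_membership(setup, probe, number=100000, repeat=3):
    """Best per-loop time, in microseconds, for the statement '<probe> in x'.

    Hypothetical helper: mirrors what timeit.py reports as "best of 3".
    """
    timer = timeit.Timer(stmt="%s in x" % probe, setup=setup)
    best_total = min(timer.repeat(repeat=repeat, number=number))
    return best_total / number * 1e6


def run_benchmarks(number=100000):
    """Time a miss (None), a first-element hit and a last-element hit."""
    setups = {
        "set": "x = set(%r)" % (KEYWORDS,),
        "list": "x = %r" % (KEYWORDS,),
    }
    probes = ["None", "'and'", "'not'"]
    results = {}
    for name, setup in setups.items():
        for probe in probes:
            results[(name, probe)] = time_membership(setup, probe, number)
    return results


if __name__ == "__main__":
    for (name, probe), usec in sorted(run_benchmarks().items()):
        print("%-4s %-6s in x: %.3f usec per loop" % (name, probe, usec))
```

The three probes are chosen to match the cases timed above: None is never found, so the list has to be scanned to the end; 'and' is the first list element, the list's best case; 'not' is the last element.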