Le mardi 22 août 2006 23:15, Fredrik Lundh a écrit : > Maric Michaud wrote: > > The problem here, is that the strings in the set are compared by value, > > which is not optimal, and I guess python compare them by adress ("s*n is > > s*n" has the same complexity than "s*n == s*n" in CPython, right ?). > > wrong. >
Ah ! wrong, thanks, but not in our case : In [80]: for i in (10**e for e in range(5)) : ....: res1 = timeit.Timer('s == t', 's, t = "e"*%i, "e"*%i'%(i, i)).timeit() ....: res2 = timeit.Timer('s is t', 's, t = "e"*%i, "e"*%i'%(i, i)).timeit() ....: print res1/res2 ....: ....: 1.10532866525 1.27507328965 1.90244004672 8.33974283485 89.5215441627 Indeed, it's wrong for two instances of str, but not if we compare an instance to itself : In [79]: for i in (10**e for e in range(9)) : ....: r1=timeit.Timer('s is p', "s, p = ('s'*%s,)*2" % i).timeit() ....: r2=timeit.Timer('s == p', "s, p = ('s'*%s,)*2" % i).timeit() ....: print r1/r2 ....: ....: 0.854690643008 0.872682262181 0.851785060822 0.871193603744 0.890304121256 0.86925960859 0.846364097331 0.91614070798 0.825424114324 So my lastest c++ algorithm is the good one still, as the python code is like : In [29]: timeit.Timer('set(t)', 't=("e"*10,)*10').timeit() Out[29]: 1.883868932723999 In [30]: timeit.Timer('set(t)', 't=("e"*100,)*10').timeit() Out[30]: 1.8789899349212646 and not : In [27]: timeit.Timer('set(t)', 't=["e"*10 for i in range(10)]').timeit() Out[27]: 2.6000990867614746 In [28]: timeit.Timer('set(t)', 't=["e"*100 for i in range(10)]').timeit() Out[28]: 4.1173238754272461 > > timeit -s"s='x'; n=1000" "s*n is n*s" > > 1000000 loops, best of 3: 1.9 usec per loop > > > timeit -s"s='x'; n=1000" "s*n == n*s" > > 100000 loops, best of 3: 4.5 usec per loop > > </F> -- _____________ Maric Michaud _____________ Aristote - www.aristote.info 3 place des tapis 69004 Lyon Tel: +33 426 880 097 -- http://mail.python.org/mailman/listinfo/python-list