Ah, OK, a couple of things. First, if you're running timing experiments, you probably want each timed run to last on the order of seconds for the numbers to be at all reliable. It seems to take up to 10 seconds of executing a bit of code before HotSpot is done optimizing it (depending on the complexity of what you're running -- here, probably not so much), so you usually want to execute at least one "throw-away" run first, and then do a final set of timings. Even then, if your final times aren't at least a second or so, you'll run afoul of intermittent GC overhead (sometimes it will happen during your run, sometimes not). At least, that's how I understand things.
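In case it helps, the kind of harness I have in mind looks roughly like this (rough-bench is just a made-up name and the iteration count is arbitrary; the idea is one throw-away pass, then a timed pass long enough to take a second or more):

(defn rough-bench [f]
  (f)          ; throw-away pass so HotSpot gets a chance to compile/optimize
  (time (f)))  ; then time the "real" pass

;; bump the dotimes count until the timed pass takes at least a second or so
(rough-bench
  (fn [] (dotimes [_ 1000000] (some #(= 10 %) (range 100)))))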
Second, building the set in your case will be faster since there are only 10 unique elements. AFAIK this means only 10 new "Set" object creations, compared to 100 if the elements were all unique (a quick check of this is at the bottom of this message).

Cheers, Jason

On Feb 26, 2009, at 12:15 PM, Mark Volkmann wrote:

>
> On Thu, Feb 26, 2009 at 12:02 PM, Jason Wolfe <jawo...@berkeley.edu> wrote:
>>
>> Hi Mark,
>>
>> The results will depend on the objects you are comparing. If you need
>> to search through the list multiple times, converting to a set once is
>> almost certainly going to be faster. But, if you're just doing it
>> once, iterating will usually be much faster:
>
> That makes sense, but doesn't match the results I see. See my
> additional detail below.
>
>> user> (time (dotimes [_ 100000] (contains? (set (range 100)) 10)))
>> "Elapsed time: 7708.336 msecs"
>>
>> user> (time (dotimes [_ 100000] (some #(= 10 %) (range 100))))
>> "Elapsed time: 291.474 msecs"
>>
>> user> (time (let [s (set (range 100))] (dotimes [_ 100000] (contains? s 10))))
>> "Elapsed time: 27.978 msecs"
>>
>> In a simple test I get similar results for strings too... what
>> conditions were you testing under?
>
> Here's the actual code I ran to test.
>
> (def stooges (list
>   "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"
>   "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"
>   "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"
>   "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"
>   "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"
>   "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"
>   "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"
>   "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"
>   "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"
>   "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"
>   "Moe" "Larry" "Curly"))
>
> (time (dotimes [_ 1000] (some #(= % "Curly") stooges)))
>
> (time (dotimes [_ 1000] (contains? (set stooges) "Curly")))
>
> (let [s (set stooges)]
>   (time (dotimes [_ 1000] (contains? s "Curly"))))
>
> I'm seeing a wide variation in results. Here are results from
> consecutive runs.
>
> "Elapsed time: 40.515 msecs"
> "Elapsed time: 121.825 msecs"
> "Elapsed time: 1.163 msecs"
>
> "Elapsed time: 69.026 msecs"
> "Elapsed time: 60.438 msecs"
> "Elapsed time: 1.732 msecs"
>
> So sometimes converting to a set for a single lookup (the second case)
> is faster, but not usually.
>
>> On Feb 26, 7:53 am, Mark Volkmann <r.mark.volkm...@gmail.com> wrote:
>>> I thought for sure it would be faster to use "some" to determine
>>> whether an item was in a list rather than convert the list to a set
>>> and then use "contains?", but it turns out I was wrong. The latter
>>> approach takes about 1/3 to 1/2 the time! This is the case even when
>>> the list contains 100 items. Of course I realize that if this is a
>>> commonly needed operation for a particular collection, it's better to
>>> use a set instead of a list in the first place.
>>>
>>> (def stooges (list "Moe" "Larry" "Curly"))
>>> (time (some #(= % "Curly") stooges))
>>> (time (contains? (set stooges) "Curly"))
>
> --
> R. Mark Volkmann
> Object Computing, Inc.
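P.S. Here's the quick check I mentioned about duplicates. This is just a sketch from my REPL; the identical? result relies on how Clojure's persistent sets currently handle conj of an element that's already present, and your timings will of course differ:

;; 100 strings with only 10 distinct values, vs. 100 distinct strings
(def dups    (vec (mapcat (fn [_] (map str (range 10))) (range 10))))
(def uniques (vec (map str (range 100))))

;; conj-ing a value that's already in a set hands back the same set,
;; so building a set from dups does far fewer "real" insertions:
(let [s (set dups)]
  (identical? s (conj s "5")))   ; => true on my setup

(time (dotimes [_ 100000] (set dups)))      ; only 10 real insertions per build
(time (dotimes [_ 100000] (set uniques)))   ; 100 insertions per build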