> Lucene started out at an avg 3ms but subsequent runs took it down 
> dramatically due to OS file caching. The all-in-memory hashset implementation 
> clearly did not demonstrate the same speed ups between runs.

I'm not saying the benchmark was wrong, but this is surprising.
The default HashSet implementation is a hash table whose buckets
are chained as linked lists, so it made me wonder how the data was
distributed across those buckets. Even with OS file caching helping
Lucene, the in-memory data structure shouldn't fall short, at least
intuitively.
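
To make the distribution question concrete, here is a rough sketch of
how one could count how many keys land in each bucket. It assumes the
hash spreading used by java.util.HashMap in Java 8 and later
(h ^ (h >>> 16), masked by a power-of-two table size). The class and
method names (BucketDistribution, bucketCounts, longestChain) and the
sample keys are made up for illustration; this is not the code from
the benchmark.

```java
import java.util.Arrays;
import java.util.List;

public class BucketDistribution {
    // Assumption: mirrors the hash spreading of java.util.HashMap in Java 8+.
    public static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Counts how many keys fall into each bucket of a table with
    // tableSize slots. tableSize must be a power of two.
    public static int[] bucketCounts(List<String> keys, int tableSize) {
        if (tableSize <= 0 || (tableSize & (tableSize - 1)) != 0) {
            throw new IllegalArgumentException("tableSize must be a power of two");
        }
        int[] counts = new int[tableSize];
        for (String key : keys) {
            counts[spread(key.hashCode()) & (tableSize - 1)]++;
        }
        return counts;
    }

    // Length of the longest chain, i.e. the fullest bucket.
    public static int longestChain(int[] counts) {
        return Arrays.stream(counts).max().orElse(0);
    }

    public static void main(String[] args) {
        // Hypothetical sample keys; real data would be the benchmark's terms.
        List<String> keys = Arrays.asList("Lucene", "HashSet", "Wikipedia", "benchmark");
        int[] counts = bucketCounts(keys, 16);
        System.out.println("longest chain: " + longestChain(counts));
    }
}
```

If the longest chain came out much larger than keys / tableSize on
the real data, that would point to a skewed distribution as one
possible explanation for the slow lookups.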

> I can make the code available but the data wouldn't be possible.
> The English Wikipedia page titles are probably an equivalent size and shape 
> so I could try and package something up around that as a benchmarking tool 
> for others to play with.

If you can find a spare cycle, that'd be great, thanks!

Dawid
