On 6/9/06, Peter Keegan <[EMAIL PROTECTED]> wrote:
However, my throughput testing shows that the Solr method is at least 50% faster than mine. I'm seeing a big win with the use of the HashDocSet for lower hit counts. On my 64-bit platform, a MAX_SIZE value of 10K-20K seems to provide optimal performance.
Interesting... how many documents are in your collection? It would prob be nice to make the HashDocSet cutt-off dynamic rather than fixed. Are you using Solr, or just some of it's code?
I'm looking forward to trying this with OpenBitSet.
I checked in the OpenBitSet changes today. I imagine this will lower the optimal max HashDocSet size for performance a little. You might not see much performance improvement if most of the intersections involved a HashDocSet... the OpenBitSet improvements only kick in with bitset<->bitset intersection counts. -Yonik http://incubator.apache.org/solr Solr, the open-source Lucene search server --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]