IndexSearcher and multi-threaded performance

Dmitri Bichko Tue, 11 Nov 2008 14:00:20 -0800

Hi,

I'm pretty new to Lucene, so please bear with me if this has been
covered before.


The wiki suggests sharing a single IndexSearcher between threads for
best performance
(http://wiki.apache.org/lucene-java/ImproveSearchingSpeed).  I ran the
same set of queries in five configurations: multiple threads sharing
one searcher, a separate searcher for each thread, each of those two
again with an in-memory RAMDirectory index, and (just for fun)
multiple JVMs running concurrently.  The results are the time in
milliseconds to complete the whole job:

threads  multi-jvm  shared  per-thread  ram-shared  ram-thread
      1      72997   70883       72573       60308       60012
      2      33147   48762       35973       25498       25734
      4      16229   46828       21267       13127       27164
      6      13088   47240       14028        9858       29917
      8       9775   47020       10983        8948       10440
     10       8721   50132       11334        9587       11355
     12       7290   49002       11798        9832
     16       9365   47099       12338       11296

The shared searcher indeed behaves better with a RAM-based index, but
what's going on with the disk-based one?  It basically stops scaling
beyond two threads, staying at roughly 47-50 seconds from 2 to 16
threads.  Am I just doing something completely wrong here?
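
To put a number on "not scaling", the timings can be turned into
speedups relative to the single-thread run.  The following is only a
small sketch of that arithmetic; the class and method names are made
up for illustration, and the values are copied from the "shared" and
"per-thread" columns of the table above (1 to 8 threads).

```java
// Sketch: speedup relative to the single-thread run.
// SpeedupTable and speedup() are illustrative names, not part of Lucene.
final class SpeedupTable {

    // Speedup = single-thread time / n-thread time (both in milliseconds).
    static double speedup(long singleThreadMs, long nThreadMs) {
        return (double) singleThreadMs / nThreadMs;
    }

    public static void main(String[] args) {
        // Timings copied from the disk-based columns of the table.
        int[] threads = {1, 2, 4, 6, 8};
        long[] shared = {70883, 48762, 46828, 47240, 47020};
        long[] perThread = {72573, 35973, 21267, 14028, 10983};

        for (int i = 0; i < threads.length; i++) {
            System.out.printf("%2d threads: shared %.2fx, per-thread %.2fx%n",
                    threads[i],
                    speedup(shared[0], shared[i]),
                    speedup(perThread[0], perThread[i]));
        }
    }
}
```

At 8 threads this works out to about 1.5x for the shared disk-based
searcher versus about 6.6x with a searcher per thread.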

The test consists of about 1,500 Boolean OR queries with 1-10
PhraseQueries each, with 1-20 Terms per PhraseQuery.  I'm using a
HitCollector to count the hits, so I'm not retrieving any results.
The index is about 5GB and 20 million documents.
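
For reference, the shape of the test is roughly as sketched below.
This is not the actual test code: HitCounter is a hypothetical
stand-in for a searcher plus a counting HitCollector, and the dummy
implementation in main only counts characters so that the sketch runs
without Lucene or an index.  Whether the supplier hands back the same
instance or a new one per worker is what distinguishes the "shared"
and "per-thread" columns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

final class SearchBenchmark {

    // Hypothetical stand-in for "run the query and count hits".
    // With Lucene this would wrap a search call that feeds a HitCollector.
    interface HitCounter {
        long count(String query);
    }

    // Runs every query exactly once, spread over nThreads workers, and
    // returns the total hit count. searcherForThread is called once per
    // worker: return the same instance for the shared case, or a fresh
    // instance for the per-thread case.
    static long run(List<String> queries, int nThreads,
                    Supplier<HitCounter> searcherForThread) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        AtomicInteger next = new AtomicInteger();   // index of next query to take
        AtomicLong totalHits = new AtomicLong();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int t = 0; t < nThreads; t++) {
                futures.add(pool.submit(() -> {
                    HitCounter searcher = searcherForThread.get();
                    int i;
                    while ((i = next.getAndIncrement()) < queries.size()) {
                        totalHits.addAndGet(searcher.count(queries.get(i)));
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get();                            // wait, and surface worker failures
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        return totalHits.get();
    }

    public static void main(String[] args) {
        List<String> queries = new ArrayList<>();
        for (int i = 0; i < 1500; i++) {
            queries.add("query " + i);              // placeholder queries
        }
        HitCounter shared = q -> q.length();        // dummy "searcher"
        for (int n : new int[] {1, 2, 4, 8}) {
            long t0 = System.nanoTime();
            run(queries, n, () -> shared);                  // one instance for all threads
            long sharedMs = (System.nanoTime() - t0) / 1_000_000;
            t0 = System.nanoTime();
            run(queries, n, () -> q -> q.length());         // new instance per thread
            long perThreadMs = (System.nanoTime() - t0) / 1_000_000;
            System.out.printf("%d threads: shared %d ms, per-thread %d ms%n",
                    n, sharedMs, perThreadMs);
        }
    }
}
```

The timings printed by the dummy version mean nothing in themselves;
the point is only the structure of the comparison.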

This is running on an 8 x quad-core Opteron machine with plenty of RAM to spare.

Any idea why I would see this behaviour?

Thanks,
Dmitri
