Re: 2.3.2 Indexing Performance

Michael McCandless Fri, 08 Aug 2008 16:36:58 -0700


Thanks for the data point!

This is expected -- a lot of work went into increasing IndexWriter's throughput in 2.3.

Actually, I'd expect even more speedup, if indeed Lucene is the bottleneck in your app. You could test how much time just creating/parsing & tokenizing the docs (from whatever is holding them) takes, to see. Also you might eke more performance out by following the suggestions here:


    http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
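As a rough illustration of the measurement suggested above, here is a minimal sketch that times only the parse/tokenize step, with no Lucene calls at all. The Parser interface and the whitespace split are hypothetical stand-ins for whatever your app really does with each MARC/XML record; comparing this number against the full job time would hint at how much of the run is Lucene's.

```java
import java.util.Arrays;
import java.util.List;

public class ParseTiming {
    /** Hypothetical stand-in for the application's own record parser/tokenizer. */
    interface Parser {
        List<String> parse(String raw);
    }

    /** Runs only the parse step over all records and returns elapsed nanoseconds. */
    static long timeParseOnly(List<String> records, Parser parser) {
        long tokens = 0;
        long start = System.nanoTime();
        for (String raw : records) {
            // Use the result so the parsing work cannot be skipped.
            tokens += parser.parse(raw).size();
        }
        long elapsed = System.nanoTime() - start;
        System.out.println("parsed " + records.size() + " records, " + tokens + " tokens");
        return elapsed;
    }

    public static void main(String[] args) {
        // Toy records; a real run would read the actual source documents.
        List<String> records = Arrays.asList("title one", "another title here");
        long nanos = timeParseOnly(records, raw -> Arrays.asList(raw.split("\\s+")));
        System.out.println("parse-only time: " + (nanos / 1_000_000.0) + " ms");
    }
}
```

If the parse-only time is a large share of the total, tuning IndexWriter will not help much and the parser is the place to look.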

Since you've got 4 CPUs and lots of RAM you should definitely use multiple indexing threads with a large RAM buffer.
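To make the threading suggestion concrete, here is a minimal sketch of several worker threads feeding one shared writer. It deliberately does not depend on Lucene: DocSink is a hypothetical interface standing in for a single shared IndexWriter (whose addDocument can be called from multiple threads), and the thread count and sample docs are assumptions. In a real job the writer would be opened once, given a larger buffer via IndexWriter.setRAMBufferSizeMB, and closed after all threads finish.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelIndexer {
    /** Hypothetical stand-in for one shared writer, e.g. a wrapper around IndexWriter.addDocument. */
    interface DocSink {
        void add(String doc) throws Exception;
    }

    /** Feeds every doc to the shared sink from a fixed pool; returns how many were added. */
    static int indexAll(List<String> docs, DocSink sink, int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger failures = new AtomicInteger();
        for (String doc : docs) {
            pool.execute(() -> {
                try {
                    sink.add(doc); // all threads share the same sink
                } catch (Exception e) {
                    failures.incrementAndGet();
                }
            });
        }
        pool.shutdown(); // accept no new tasks, let queued ones finish
        if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
            throw new IllegalStateException("indexing did not finish in time");
        }
        return docs.size() - failures.get();
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger added = new AtomicInteger();
        // Four threads to match a quad-processor box; the sink here only counts.
        int ok = indexAll(Arrays.asList("rec1", "rec2", "rec3"), doc -> added.incrementAndGet(), 4);
        System.out.println(ok + " docs handed to the sink");
    }
}
```

For millions of records you would not queue one task per record up front like this; streaming records through a bounded queue keeps memory in check.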


Mike

Gary Moore wrote:

Parsing and indexing 4.5 million MARC/XML bibliographic records was requiring ~14 hrs. using 2.2. The same job using 2.3 takes ~5 hrs. on the same platform -- a quad processor Sun V440 w/8GB memory. I'm using the PerFieldAnalyzerWrapper (StandardAnalyzer and SnowballAnalyzer).
I'm impressed!  Is this typical?

Gary Moore
[EMAIL PROTECTED]

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


