We recently upgraded from Lucene 2.4.0 to Lucene 3.0.2. Our load testing revealed a serious performance drop when traversing the list of terms, and their associated documents, for a given indexed field. Our code looks something like this:

    for (Term term : terms) {
        TermDocs termDocs = indexReader.termDocs(term);
        while (termDocs.next()) { // much slower here
            int doc = termDocs.doc();
            ...do something with each doc...
        }
    }

The slowness is all in the first call to TermDocs.next() for each term. Comparing 2.4.0 and 3.0.2, we found new synchronization on the SegmentTermDocs constructor and on SegmentReader.getTermsReader(). The first call to next() hits this synchronization, causing a 4x slowdown on an 8-CPU machine.

My first question: should we be using a different, more efficient approach to process each term's doc list? The synchronization appears to guard aspects of these classes that the next() operation is not concerned with.

My other question: are there any planned performance enhancements to address this regression?

Thanks,
John