That looks good, but it runs an inner lookup (loading the stored fields) inside the main search loop, which is the hit collector. For a few results this is OK, but if you are collecting thousands of hits from a very large index that does not fit into memory, collecting gets slow because of all the disk seeking (even when you filter out some fields with a FieldSelector, the blocks still have to be read from disk).

To optimize, do not keep the file name as a stored field; index it as a non-tokenized term instead. You can then use

    arr = FieldCache.DEFAULT.getStrings(searcher.getIndexReader(), "FILE");

The returned array contains one entry per document ID. Inside the search loop, just use arr[docID] to get the file name. Please note that on large indexes the initial field cache loading can take some time.

In Lucene 2.9 this gets better with the new Collectors, which work directly on segments. If you want to use 2.9, just ask how the same can be achieved there. The new collector can be optimized there to get the FieldCache for each segment inside Collector.setNextReader().

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Paul J. Lucas [mailto:p...@lucasmail.org]
> Sent: Wednesday, June 10, 2009 5:26 PM
> To: java-user@lucene.apache.org
> Subject: Re: Migrating from Hit/Hits to TopDocs/TopDocCollector
>
> On Jun 10, 2009, at 3:17 AM, Uwe Schindler wrote:
>
> > A HitCollector is the correct way to do this (especially because the
> > order of hits is mostly not interesting when retrieving all hits).
>
> OK, here's what I came up with:
>
>     Term t = /* ... */
>     // final, so the anonymous classes below can reference them
>     final Collection<File> files = new LinkedList<File>();
>     final FieldSelector fieldSelector = new FieldSelector() {
>         public FieldSelectorResult accept( String fieldName ) {
>             if ( fieldName.equals( "FILE" ) )
>                 return FieldSelectorResult.LOAD_AND_BREAK;
>             return FieldSelectorResult.NO_LOAD;
>         }
>     };
>     HitCollector hitCollector = new HitCollector() {
>         public void collect( int docID, float score ) {
>             try {
>                 Document doc = searcher.doc( docID, fieldSelector );
>                 files.add( new File( doc.get( "FILE" ) ) );
>             }
>             catch ( Exception e ) {
>                 // ignore
>             }
>         }
>     };
>     searcher.search( new TermQuery( t ), hitCollector );
>
> How's that?
>
> - Paul
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
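
A rough sketch of the per-segment lookup pattern mentioned in the reply, in case it helps. It deliberately does not use Lucene: `Segment`, `FileNameCollector` and `collectAll` are made-up stand-ins, and only the method names `setNextReader` and `collect` are borrowed from the 2.9 Collector (whose real `setNextReader` takes an IndexReader and a docBase). It only illustrates the idea of swapping in one cached array per segment and then reading the file name by segment-relative document ID.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java illustration only; none of these types are Lucene classes. */
final class PerSegmentLookupSketch {

    /** Hypothetical stand-in for one index segment and its cached "FILE" values. */
    static final class Segment {
        final String[] fileNames; // one entry per segment-relative doc ID

        Segment(String... fileNames) {
            this.fileNames = fileNames;
        }
    }

    /** Hypothetical collector: switches arrays per segment, then reads by doc ID. */
    static final class FileNameCollector {
        final List<String> files = new ArrayList<String>();
        private String[] current = new String[0];

        // Called once whenever the search moves on to the next segment.
        // (Assumption: in real code the array would come from the field cache.)
        void setNextReader(Segment segment) {
            current = segment.fileNames;
        }

        // Called once per hit with a segment-relative doc ID:
        // a plain array read, no stored-field loading.
        void collect(int doc) {
            files.add(current[doc]);
        }
    }

    /** Drives the collector segment by segment, the way a search loop would. */
    static List<String> collectAll(Segment[] segments, int[][] hitsPerSegment) {
        FileNameCollector collector = new FileNameCollector();
        for (int i = 0; i < segments.length; i++) {
            collector.setNextReader(segments[i]);
            for (int doc : hitsPerSegment[i]) {
                collector.collect(doc);
            }
        }
        return collector.files;
    }

    public static void main(String[] args) {
        Segment[] segments = {
            new Segment("a.txt", "b.txt", "c.txt"),
            new Segment("d.txt", "e.txt")
        };
        // Hits are given per segment, as segment-relative doc IDs.
        int[][] hits = { { 0, 2 }, { 1 } };
        System.out.println(collectAll(segments, hits));
    }
}
```

With a single array for the whole index (the 2.4 approach described above), the same lookup collapses to arr[docID] and no per-segment switch is needed.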