On Jul 27, 2010, at 8:50 AM, Philippe wrote:

> Hi,
> 
> what would be the fastest way to get all terms for all documents matching a 
> specific query?
> 
> So far I:
> 
> 1.) Query the index
> 2.) Retrieve all scoreDocs
> 3.) Iterate the scoreDocs and retrieve all terms using the getValues method 
> and a customised "FieldSelector"
> 
> However, retrieving and iterating the scoreDocs is quite costly.  So is there 
> a better/faster way to do this?


If you can afford to store TermVectors (disk is cheap, right?), they will 
give you back the terms post-analysis, so you won't have to tokenize the text 
again, which you would have to do with the getValues() approach.  You might 
also hook into the Collector (HitCollector) and build the term set as you go, 
assuming you don't need the scoreDocs structure.
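
A rough sketch of the "build it as you go" idea, in plain Java so it runs 
without Lucene on the classpath.  TermVectorSource and TermCollector are 
hypothetical stand-ins, not Lucene types.  In Lucene itself the lookup would 
be something like IndexReader.getTermFreqVector(docId, field) called from 
your Collector's collect(int), but check that against the version you run.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class TermCollectorSketch {

    /** Hypothetical stand-in for a stored term vector lookup (not a Lucene type). */
    interface TermVectorSource {
        /** Distinct analyzed terms stored for the doc, or null if none were stored. */
        String[] termsFor(int docId);
    }

    /** Hypothetical stand-in for a collector: collect() is called once per hit. */
    static final class TermCollector {
        private final TermVectorSource source;
        // term -> number of matching documents that contain it
        private final Map<String, Integer> matchingDocCount = new TreeMap<>();

        TermCollector(TermVectorSource source) {
            this.source = source;
        }

        /** Accumulates terms for one matching doc; keeps no scores and no hit list. */
        void collect(int docId) {
            String[] terms = source.termsFor(docId);
            if (terms == null) {
                return; // no term vector stored for this document
            }
            for (String term : terms) {
                matchingDocCount.merge(term, 1, Integer::sum);
            }
        }

        Map<String, Integer> terms() {
            return matchingDocCount;
        }
    }

    public static void main(String[] args) {
        // Made-up data: doc id -> analyzed terms, standing in for stored term vectors.
        Map<Integer, String[]> vectors = new HashMap<>();
        vectors.put(0, new String[] {"fast", "lucene"});
        vectors.put(1, new String[] {"lucene", "terms"});
        vectors.put(2, new String[] {"unrelated"});

        TermCollector collector = new TermCollector(id -> vectors.get(id));
        // Pretend the query matched docs 0 and 1; the searcher would drive these calls.
        collector.collect(0);
        collector.collect(1);

        System.out.println(collector.terms());
    }
}
```

The point is that nothing is sorted or scored and no hit array is built; each 
matching doc is touched once and only its terms are kept.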

-Grant


