Hi,

in Lucene 4.0 I would recommend using TermsFilter (from the queries module)
instead of FieldCacheTermsFilter, because the term dictionary is much faster
and in this case it is better to use the posting lists than to scan all
documents (which FieldCacheTermsFilter does). How many filter terms do you
have? Is the filter selective? To improve further, wrap it in
CachingWrapperFilter, too (this caches the filter results, which is useful if
you have a set of filters/terms that are used quite often).
The problem with FieldCacheTermsFilter is that it scans all documents from
beginning to end and looks each one up in the terms cache. In Lucene 4.0 the
structure of the FieldCache changed to be more memory efficient (which does not
hurt the primary use case of sorting), but scanning all documents and resolving
all terms is not always the best option. This also depends heavily on your
index structure; FieldCacheTermsFilter may still be faster under some
circumstances.
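
To make the difference concrete, here is a minimal sketch in plain Java. It is
not Lucene code: the class FilterSketch and its methods are hypothetical names,
an array stands in for the FieldCache and a map stands in for the term
dictionary plus posting lists. It assumes every document has exactly one
non-null term in the filter field.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy model only: plain collections stand in for Lucene's index structures. */
final class FilterSketch {
    // Hypothetical forward index: docId -> term of the filter field
    // (roughly the role the FieldCache plays).
    private final String[] termByDoc;
    // Hypothetical inverted index: term -> ascending doc ids (a posting list).
    private final Map<String, int[]> postings = new TreeMap<>();
    // Hypothetical result cache keyed by the set of filter terms.
    private final Map<Set<String>, BitSet> cache = new HashMap<>();

    FilterSketch(String[] termByDoc) {
        this.termByDoc = termByDoc;
        Map<String, List<Integer>> tmp = new TreeMap<>();
        for (int doc = 0; doc < termByDoc.length; doc++) {
            tmp.computeIfAbsent(termByDoc[doc], k -> new ArrayList<>()).add(doc);
        }
        for (Map.Entry<String, List<Integer>> e : tmp.entrySet()) {
            int[] docs = new int[e.getValue().size()];
            for (int i = 0; i < docs.length; i++) {
                docs[i] = e.getValue().get(i);
            }
            postings.put(e.getKey(), docs);
        }
    }

    /** Scan approach: visits every document, so the cost grows with index size. */
    BitSet scanAllDocs(Set<String> wanted) {
        BitSet hits = new BitSet(termByDoc.length);
        for (int doc = 0; doc < termByDoc.length; doc++) {
            if (wanted.contains(termByDoc[doc])) {
                hits.set(doc);
            }
        }
        return hits;
    }

    /** Postings approach: visits only documents that contain a wanted term. */
    BitSet usePostings(Set<String> wanted) {
        BitSet hits = new BitSet(termByDoc.length);
        for (String term : wanted) {
            int[] docs = postings.get(term);
            if (docs == null) {
                continue; // term does not occur in the index
            }
            for (int doc : docs) {
                hits.set(doc);
            }
        }
        return hits;
    }

    /** Caching approach: computes the bit set once per distinct set of terms. */
    BitSet cached(Set<String> wanted) {
        BitSet hits = cache.computeIfAbsent(new HashSet<>(wanted), this::usePostings);
        return (BitSet) hits.clone(); // callers get a copy, the cache stays intact
    }
}
```

With a selective filter (few terms, short posting lists) the postings approach
touches far fewer entries than the scan; with many terms that match most of the
index the two get closer, which is why the scan may still win in some setups.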

Uwe
-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -----Original Message-----
> From: emmanuel Gosse [mailto:emmanuel.go...@gmail.com]
> Sent: Saturday, January 19, 2013 10:58 PM
> To: java-user@lucene.apache.org
> Subject: FieldCacheTermsFilter performance
> 
> Hi,
> 
> I would like to share a performance problem about FieldCacheTermsFilter
> between 3.0.3 and 4.0.0 Lucene versions.
> 
> I ran tests with the same application on 3.0.3 (my production
> version) and 4.0.0,
> and I found a "big" difference in response time.
> 
> I run a "real life" injection of 400 000 queries and measure the average
> response time.
> I usually run this type of test to validate that we have no performance
> regression.
> 
> So I ran further tests to find out where this difference comes from:
> deactivating faceting, changing the Directory used, and so on.
> 
> And for one test, I deactivated the filters (I use only
> FieldCacheTermsFilter) and I obtained the same average response time.
> 
> To give some data:
> 20 million documents
> 3 indexes under a MultiReader
> no indexing, only searching (indexing is not implemented in this app)
> 400 000 queries with JMeter
> 
> Test :
> 
> 3.0.3 or 4.0.0
> Queries without filters : 60ms (average response time)
> 
> Queries with filters:
> 3.0.3 : 150ms
> 4.0.0 : 400ms
> 
> The code difference of my application is only the required one to plug with
> each Lucene version.
> 
> The fields used for filtering are not stored and, in the 4.0.0 version, are
> StringField.
> I checked that the FieldCache caches do not change during the test.
> 
> I have run out of ideas. Maybe I have not understood which type of field
> I should use.
> 
> Emmanuel
> 
> -----------
> Emmanuel Gosse
> Fnac.Com <http://www.fnac.com>


