Re: search performance

Jamie Fri, 20 Jun 2014 01:08:15 -0700

Greetings Lucene Users

As a follow-up to my earlier mail:

We are also using Lucene segment warmers. As recommended, segments per tier is now set to five, and buffer memory is set to (Runtime.getRuntime().totalMemory()*.08)/1024/1024.
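
The buffer-size expression above works out as in this minimal sketch. The class and helper names (BufferSizing, bufferMemoryMB) are hypothetical and exist only for illustration; Runtime.totalMemory() and Runtime.maxMemory() are the only real JDK calls used. One thing worth checking: totalMemory() reports the heap the JVM currently has committed, which can be smaller than the -Xmx ceiling reported by maxMemory(), so the two may give different buffer sizes.

```java
public class BufferSizing {
    // Hypothetical helper: 8% of the given byte count, expressed in megabytes.
    // Mirrors the expression (totalMemory() * .08) / 1024 / 1024 from the mail.
    static double bufferMemoryMB(long heapBytes) {
        return (heapBytes * 0.08) / 1024 / 1024;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Heap currently committed by the JVM (may grow later).
        System.out.printf("from totalMemory(): %.1f MB%n", bufferMemoryMB(rt.totalMemory()));
        // Upper bound the JVM will attempt to use.
        System.out.printf("from maxMemory():   %.1f MB%n", bufferMemoryMB(rt.maxMemory()));
    }
}
```

For example, a 1 GB heap gives a buffer of about 81.9 MB, and an 8 GB heap gives about 655 MB.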


See below for code used to instantiate writer:

    LimitTokenCountAnalyzer limitAnalyzer = new LimitTokenCountAnalyzer(
            application.getAnalyzerFactory().getAnalyzer(language, AnalyzerFactory.Operation.INDEX),
            maxPerFieldTokens);
    IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_46, limitAnalyzer);
    TieredMergePolicy logMergePolicy = new TieredMergePolicy();
    logMergePolicy.setSegmentsPerTier(5);
    conf.setMergePolicy(logMergePolicy);
    conf.setRAMBufferSizeMB(bufferMemoryMB);
    writer = new IndexWriter(fsDirectory, conf);
    writer.getConfig().setMergedSegmentWarmer(readerWarmer);

This particular monster 24-core machine has 110 GB of RAM. I suppose one possibility is to load the indexes that aren't being changed into RAM on startup. However, the indexes already reside on fast SSD drives.


We're using the following JRE parameters:

-XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:SurvivorRatio=3 -XX:+AggressiveOpts

Let me know if there is anything else we can try to obtain performance gains.


Much appreciated

Jamie

On 2014/06/20, 9:51 AM, Jamie wrote:

Hi All

Thank you for all your suggestions. Some of the recommendations hadn't yet been implemented, as our code base was using older versions of Lucene with reduced capabilities. Thus far, all the recommendations for fast search have been implemented (e.g. using pagination with searchAfter, DirectoryReader.openIfChanged, avoiding wrapping Lucene ScoreDoc results, option to disable sorting, etc.).

While search performance has improved significantly in some environments, in other, larger ones we are unfortunately still seeing search times of 1 to 5 minutes. For instance, at one site, the total index size is 500GB with 190 million documents indexed. They are running a machine with 24 cores and 4 SSD drives to house the indexes. New emails are being added to the indexes at a rate of 10 messages/sec.

One possible area for improvement: searching is being conducted across several indexes. To accomplish this, on each search, a MultiReader is constructed that consists of several subreaders created by the DirectoryReader.openIfChanged method. Only one of the indexes is updated frequently; the others are never updated. For each search, a new IndexSearcher is created, with the MultiReader passed to its constructor. From what I've read, MultiReader and IndexSearcher are relatively lightweight and should not impact search performance. Is this correct? Is there a faster way to handle searching across multiple indexes? What is the performance impact of searching across multiple indexes?

Am I correct that SearcherManager can't be used with a MultiReader and NRT? I would appreciate all suggestions on how to optimize our search performance further. Search time has become a usability issue.

Much appreciated

Jamie


