Re: lucene (search) performance tuning

Simon Willnauer Sat, 26 May 2012 04:38:46 -0700

On Sat, May 26, 2012 at 2:59 AM, Yang <teddyyyy...@gmail.com> wrote:
> I tested with more threads/processes. Indeed, this is completely
> CPU-bound, since running 1 thread gives the same latency as running 4
> threads (my box has 4 cores).
>
>
> Given this, is there any way to simplify the scoring computation (I'm only
> using Lucene as a first-level "rough" search, so search quality is not
> a huge issue here), so that, for example, fewer fields are evaluated or a
> simpler scoring function is used?


Are you using disjunction or conjunction queries? Can you make some
parts of the query mandatory?
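
To illustrate why the distinction matters, here is a minimal sketch in
plain Java. It is not Lucene code: the class name, method names, and
postings data are all made up for illustration, and each term's postings
are modeled as a sorted array of doc ids. A conjunction keeps only the docs
present in every list, so typically far fewer candidates need scoring; a
disjunction keeps every doc present in any list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not Lucene's implementation or API.
class PostingsSketch {

    // Conjunction (AND): walk both sorted doc-id lists in step and keep
    // only the ids that appear in both. Only these would need scoring.
    static List<Integer> intersect(int[] a, int[] b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                out.add(a[i]);
                i++;
                j++;
            } else if (a[i] < b[j]) {
                i++;
            } else {
                j++;
            }
        }
        return out;
    }

    // Disjunction (OR): merge both sorted lists; every id that appears in
    // either list is a candidate and would need scoring.
    static List<Integer> union(int[] a, int[] b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length || j < b.length) {
            if (j >= b.length || (i < a.length && a[i] < b[j])) {
                out.add(a[i++]);
            } else if (i >= a.length || b[j] < a[i]) {
                out.add(b[j++]);
            } else { // same id in both lists: emit it once
                out.add(a[i]);
                i++;
                j++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] pizza = {1, 4, 7, 9};      // made-up postings for one term
        int[] francisco = {2, 4, 9, 12}; // made-up postings for another term
        System.out.println("AND candidates: " + intersect(pizza, francisco));
        System.out.println("OR candidates:  " + union(pizza, francisco));
    }
}
```

With real postings lists of very different lengths, the gap between the two
candidate sets is usually much larger than in this toy data, which is why
making some clauses mandatory can reduce the time spent in scoring.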

simon
>
> thanks
> Yang
>
> On Fri, May 25, 2012 at 5:47 PM, Yang <teddyyyy...@gmail.com> wrote:
>
>> thanks a lot guys
>>
>>
>> On Tue, May 22, 2012 at 1:34 AM, Ian Lea <ian....@gmail.com> wrote:
>>
>>> Lots of good tips in
>>> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, linked from
>>> the FAQ.
>>>
>>>
>>> --
>>> Ian.
>>>
>>>
>>> On Tue, May 22, 2012 at 2:08 AM, Li Li <fancye...@gmail.com> wrote:
>>> > Something went wrong when I wrote my previous reply in my Android client.
>>> > If RAMDirectory does not help, I think the bottleneck is the CPU. You may
>>> > try to tune the JVM, but I do not expect much improvement.
>>> > The best option is to split your index into 2 or more smaller ones.
>>> > You can then use Solr's distributed searching.
>>> > If the CPU is not fully used, you can do this on one physical machine.
>>> >
>>> > On 2012-5-22 at 8:50 AM, "Li Li" <fancye...@gmail.com> wrote:
>>> >>
>>> >>
>>> >> On 2012-5-22 at 4:59 AM, "Yang" <teddyyyy...@gmail.com> wrote:
>>> >>
>>> >> >
>>> >> > I'm trying to make my search faster. right now a query like
>>> >> >
>>> >> > name:Joe Moe Pizza   address:77 main street  city:San Francisco
>>> >> >Is this a conjunction query or a disjunction query?
>>> >>
>>> >> > in an index with 20 million such short business descriptions (total
>>> >> > size about 3 GB) takes about 100-200 ms.
>>> >> >20 million is not a small size. How many results does a query return on average?
>>> >>
>>> >> > I profiled the query; most of the time is spent in TermScorer.score(),
>>> >> > as shown by the attached YourKit screenshot.
>>> >> >That's true. For a query, matching and scoring are very time-consuming
>>> >> >and CPU-intensive. Another cost is IO for reading postings.
>>> >>
>>> >> >
>>> >> >
>>> >> >
>>> >> > I tried loading the index onto tmpfs (an in-memory filesystem) and
>>> >> > also tried RAMDirectory; neither helps much.
>>> >> >If that is true, it seems that IO is not the
>>> >> > I am reading
>>> >> > http://www.cnlp.org/presentations/slides/AdvancedLuceneEU.pdf
>>> >> > it mentions
>>> >> > Size
>>> >> > – Stopword removal
>>> >> > – Stemming
>>> >> > • Lucene has a number of stemmers available
>>> >> > • Light versus Aggressive
>>> >> > • May prevent fine-grained matches in some cases
>>> >> > – Not a linear factor (usually) due to index compression
>>> >> >
>>> >> > So for "stopword removal", I'm already using the standard analyzer,
>>> >> > so stop word removal is already included, right?
>>> >> >
>>> >> > Also, are there any other general tricks to try for reducing search
>>> >> > latency?
>>> >> >
>>> >> > Thanks!
>>> >> > Yang
>>> >> >
>>> >> >
>>> >> > ---------------------------------------------------------------------
>>> >> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>> >> > For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>
>>

