If you skip scoring and sort by id instead, it may be a little bit faster, but in 3.x you can hardly speed things up just by using a simpler scoring function. In your situation the bottleneck is CPU, so the way to speed up is parallelism: split the index into several smaller ones and search them concurrently, so that all the CPUs are fully used for a single query. You can do this kind of parallel search in Lucene itself, but I recommend Solr because it scales out to many nodes without much pain.
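To make the "split the index and search concurrently" idea concrete, here is a rough sketch of the scatter-gather step in plain Java. It deliberately does not use any Lucene or Solr classes: `Shard`, `Hit` and `ShardedSearch` are hypothetical names made up for illustration, where a `Shard` stands for whatever wraps one of the smaller indexes. It also assumes doc ids are unique across shards and that scores from different shards are comparable, which is only roughly true in practice.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

final class ShardedSearch {

    /** A scored hit; docId is assumed to be unique across all shards. */
    public static final class Hit {
        public final int docId;
        public final float score;

        public Hit(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Hypothetical stand-in for one sub-index, e.g. a wrapper around one searcher. */
    public interface Shard {
        List<Hit> search(String query, int topN) throws Exception;
    }

    /** Runs the query on every shard concurrently, then merges the per-shard top-N lists. */
    public static List<Hit> search(List<Shard> shards, String query, int topN,
                                   ExecutorService pool) throws Exception {
        // Scatter: one task per shard, so each shard can run on its own core.
        List<Future<List<Hit>>> futures = new ArrayList<>();
        for (Shard shard : shards) {
            Callable<List<Hit>> task = () -> shard.search(query, topN);
            futures.add(pool.submit(task));
        }

        // Gather: wait for every shard and collect its hits.
        List<Hit> merged = new ArrayList<>();
        for (Future<List<Hit>> future : futures) {
            merged.addAll(future.get());
        }

        // Merge: highest score first, keep only the global top N.
        merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(merged.subList(0, Math.min(topN, merged.size())));
    }
}
```

With 4 cores you would probably try 4 shards and a fixed pool of 4 threads, then measure. Each shard only has to score about a quarter of the postings, which is where the latency win for a single query would come from.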

On Sat, May 26, 2012 at 8:59 AM, Yang <teddyyyy...@gmail.com> wrote:
> I tested with more threads / processes. indeed this is completely
> cpu-bound, since running 1 thread gives the same latency as 4 threads (my
> box has 4 cores)
>
> given this, is there any way to simplify the scoring computation (i'm only
> using lucene as a first level "rough" search, so the search quality is not
> a huge issue here), so that, for example, fewer fields are evaluated or a
> simpler scoring function is used?
>
> thanks
> Yang
>
> On Fri, May 25, 2012 at 5:47 PM, Yang <teddyyyy...@gmail.com> wrote:
>
>> thanks a lot guys
>>
>> On Tue, May 22, 2012 at 1:34 AM, Ian Lea <ian....@gmail.com> wrote:
>>
>>> Lots of good tips in
>>> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, linked from
>>> the FAQ.
>>>
>>> --
>>> Ian.
>>>
>>> On Tue, May 22, 2012 at 2:08 AM, Li Li <fancye...@gmail.com> wrote:
>>> > something wrong when writing in my android client.
>>> > if RAMDirectory does not help, i think the bottleneck is cpu. you may try to
>>> > tune jvm but i do not expect much improvement.
>>> > the best one is splitting your index into 2 or more smaller ones.
>>> > you can then use Solr's distributed searching.
>>> > if the cpu is not fully used, you can do this in one physical machine
>>> >
>>> > On 2012-5-22 at 8:50 AM, "Li Li" <fancye...@gmail.com> wrote:
>>> >>
>>> >> On 2012-5-22 at 4:59 AM, "Yang" <teddyyyy...@gmail.com> wrote:
>>> >>
>>> >> > I'm trying to make my search faster. right now a query like
>>> >> >
>>> >> > name:Joe Moe Pizza address:77 main street city:San Francisco
>>> >> >is this a conjunction query or a disjunction query?
>>> >>
>>> >> > in an index with 20mil such short business descriptions (total size about 3GB) takes about 100--200ms.
>>> >> >20m is not a small size, how many results for a query on average?
>>> >>
>>> >> > I profiled the query, most time is spent in TermScorer.score(), as is shown by the attached yourkit screenshot.
>>> >> >that's true, for a query, matching and scoring is very time consuming and cpu intensive. another one is io for reading postings.
>>> >>
>>> >> > I tried loading the index onto tmpfs (in-memory block device), and also tried RAMDirectory, neither helps much.
>>> >> >if that is true. it seems that io is not the
>>> >> > I am reading http://www.cnlp.org/presentations/slides/AdvancedLuceneEU.pdf
>>> >> > it mentions
>>> >> > Size
>>> >> > – Stopword removal
>>> >> > – Stemming
>>> >> > • Lucene has a number of stemmers available
>>> >> > • Light versus Aggressive
>>> >> > • May prevent fine-grained matches in some cases
>>> >> > – Not a linear factor (usually) due to index compression
>>> >> >
>>> >> > so for "stopword removal", I'm already using the standard analyzer, so stop word removal is already included, right?
>>> >> >
>>> >> > also generally any other tricks to try for reducing the search latency?
>>> >> >
>>> >> > Thanks!
>>> >> > Yang
>>> >> >
>>> >> > ---------------------------------------------------------------------
>>> >> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>> >> > For additional commands, e-mail: java-user-h...@lucene.apache.org