Re: Performance measurements

Sriram Sankar Wed, 24 Jul 2013 12:59:26 -0700

On Wed, Jul 24, 2013 at 10:24 AM, Jack Krupansky <j...@basetechnology.com> wrote:


> Unicorn sounds like it was optimized for graph search. Specialized search
> engines can in fact beat out generalized search engines for specific use
> cases.
>

Yes and no (I worked on it).  Yes, there are many aspects of Unicorn that
have been optimized for graph search.  But the tests I am running have very
little to do with those optimizations.  I am still learning about Lucene
and have suspected that the scoring framework (that has to be very general)
may be contributing to the performance issues.  With Unicorn, we made a
decision to do all scoring after retrieval and not during retrieval.


>
> Scoring has been a major focus of Lucene. Non-scored filters are also
> available, but the query parsers are focused (exclusively) on scored-search.
>

When you say "filter" do you mean a step performed after retrieval?  Or is
it yet another retrieval operation?


>
> As Adrien indicates, try using raw Lucene filters and you should get much
> better results. Whether even that will compete with a use-case-specific
> (graph) search engine remains to be seen.


Thanks (I will study this more).

Sriram.
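
The query shape discussed in this thread, termA AND (termB1 OR ... OR
termBn) with no scoring, can be pictured with a small sketch. This is not
Lucene's or Unicorn's implementation; it is a toy in-memory index written only
to illustrate the idea of treating the disjunction as an unscored set of
allowed documents and intersecting it with the postings of termA. The class
name, method names, and sample postings are all hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy index, for illustration only (not a Lucene API).
public class UnscoredRetrievalSketch {
    // term -> doc IDs containing that term (a postings list)
    private final Map<String, int[]> postings = new HashMap<>();
    private final int maxDoc;

    public UnscoredRetrievalSketch(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    public void addPostings(String term, int... docIds) {
        postings.put(term, docIds);
    }

    // Union of the postings of all given terms as a bit set.
    // No per-document score is computed; a doc is either in or out.
    public BitSet unionFilter(List<String> terms) {
        BitSet bits = new BitSet(maxDoc);
        for (String term : terms) {
            for (int doc : postings.getOrDefault(term, new int[0])) {
                bits.set(doc);
            }
        }
        return bits;
    }

    // termA AND (termB1 OR ... OR termBn), pure retrieval.
    public BitSet andOfOr(String termA, List<String> termBs) {
        BitSet allowed = unionFilter(termBs);
        BitSet result = new BitSet(maxDoc);
        for (int doc : postings.getOrDefault(termA, new int[0])) {
            if (allowed.get(doc)) {
                result.set(doc);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        UnscoredRetrievalSketch index = new UnscoredRetrievalSketch(10);
        index.addPostings("name:sriram", 1, 4, 7);
        index.addPostings("friend:1", 0, 4);
        index.addPostings("friend:2", 7, 9);
        // Docs that contain name:sriram and at least one friend term.
        System.out.println(
            index.andOfOr("name:sriram", Arrays.asList("friend:1", "friend:2")));
        // prints {4, 7}
    }
}
```

In this sketch the cost of the disjunction is one pass over each friend
postings list plus one pass over the name postings list, and nothing is
ranked. Whether a real engine evaluates the filter before, during, or after
the main retrieval is a separate design choice that the sketch does not
settle.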



>
>
> -- Jack Krupansky
>
> -----Original Message----- From: Sriram Sankar
> Sent: Wednesday, July 24, 2013 1:03 PM
> To: java-user@lucene.apache.org
> Subject: Re: Performance measurements
>
>
> No I do not need scoring.  This is a pure retrieval query - which matches
> what we used to do with Unicorn at Facebook - something like:
>
> (name:sriram AND (friend:1 OR friend:2 ...))
>
> This automatically gives us second degree.
>
> With Unicorn, we would always get sub-millisecond performance even for
> n>500.
>
> Should I assume that Lucene is that much worse - or is it that this use
> case has not been optimized?
>
> Sriram.
>
>
>
> On Wed, Jul 24, 2013 at 9:59 AM, Adrien Grand <jpou...@gmail.com> wrote:
>
>> Hi,
>>
>> On Wed, Jul 24, 2013 at 6:11 PM, Sriram Sankar <san...@gmail.com> wrote:
>> > termA AND (termB1 OR termB2 OR ... OR termBn)
>>
>> Maybe this comment is not appropriate for your use-case, but if you
>> don't actually need scoring from the disjunction on the right of the
>> query, a TermsFilter will be faster when n gets large.
>>
>> --
>> Adrien
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>
>
