Hi Daniel,

I saw the JIRA issue, thanks! Unfortunately Mike's benchmarks won't help
here: they track the performance of indexing/searching a single index, while
the code you modified merges TopDocs from several indices. When I said micro
benchmarks, I don't think we need anything fancy like JMH; I would be fine
with some Java code that creates random TopDocs instances and measures, with
System.nanoTime(), how long TopDocs.merge takes on them. If we want to avoid
the pitfalls of benchmarks, something else that would be interesting to know
is how many times the PriorityQueue.lessThan method is called on trunk vs.
your patch; I suspect it will be called fewer times with your patch.
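To make the idea concrete, here is roughly the kind of throwaway harness I mean. It deliberately does not use the real Lucene classes: MiniQueue and HeapBench are made-up stand-ins, just a minimal binary min-heap with a counter on lessThan, so we can compare the pop-then-add pattern against updateTop and time both with System.nanoTime(). A real benchmark would of course build random TopDocs and call TopDocs.merge instead.

```java
import java.util.Random;

// Minimal stand-in for Lucene's PriorityQueue (NOT the real class):
// a binary min-heap that counts lessThan calls.
class MiniQueue {
    final int[] heap;
    int size;
    long lessThanCalls;

    MiniQueue(int maxSize) { heap = new int[maxSize + 1]; }

    boolean lessThan(int a, int b) { lessThanCalls++; return a < b; }

    void add(int v) { heap[++size] = v; upHeap(); }

    int top() { return heap[1]; }

    int pop() {
        int result = heap[1];
        heap[1] = heap[size--];
        downHeap();
        return result;
    }

    // Replaces the top element and restores heap order with a single
    // downHeap, instead of the downHeap + upHeap that pop() + add() cost.
    int updateTop(int v) {
        heap[1] = v;
        downHeap();
        return heap[1];
    }

    private void upHeap() {
        int i = size, v = heap[i];
        while (i > 1 && lessThan(v, heap[i / 2])) { heap[i] = heap[i / 2]; i /= 2; }
        heap[i] = v;
    }

    private void downHeap() {
        int i = 1, v = heap[i];
        while (true) {
            int child = 2 * i;
            if (child > size) break;
            if (child < size && lessThan(heap[child + 1], heap[child])) child++;
            if (!lessThan(heap[child], v)) break;
            heap[i] = heap[child];
            i = child;
        }
        heap[i] = v;
    }
}

public class HeapBench {
    // Simulates the merge loop: repeatedly take the smallest entry and
    // reinsert its successor (a stand-in for ref.hitIndex++).
    // Returns the number of lessThan calls.
    static long run(boolean useUpdateTop, int queueSize, int iters, long seed) {
        Random r = new Random(seed);
        MiniQueue q = new MiniQueue(queueSize);
        for (int i = 0; i < queueSize; i++) q.add(r.nextInt(1000));
        for (int i = 0; i < iters; i++) {
            int next = q.top() + r.nextInt(10);
            if (useUpdateTop) {
                q.updateTop(next);
            } else {
                q.pop();
                q.add(next);
            }
        }
        return q.lessThanCalls;
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        long popAdd = run(false, 64, 100_000, 42L);
        long t1 = System.nanoTime();
        long update = run(true, 64, 100_000, 42L);
        long t2 = System.nanoTime();
        System.out.printf("pop+add:   %d lessThan calls, %d us%n", popAdd, (t1 - t0) / 1000);
        System.out.printf("updateTop: %d lessThan calls, %d us%n", update, (t2 - t1) / 1000);
    }
}
```

A fixed seed keeps both runs on identical data, so the lessThan counts are directly comparable even if the timings are noisy.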

On Tue, Nov 3, 2015 at 00:25, Daniel Jeliński <[email protected]>
wrote:

> Hi Adrien,
> LUCENE-6878 created.
> This method is called by some of IndexSearcher's search overrides. I'm
> going to try out Mike's benchmark
> <https://github.com/mikemccand/luceneutil> first, and learn to write
> micro benchmarks in the meantime.
> Regards,
> Daniel
>
> 2015-11-02 0:08 GMT+01:00 Adrien Grand <[email protected]>:
>
>> Hi Daniel,
>>
>> Your patch could indeed make things more efficient when merging top hits
>> from many shards, and the code is still easy to read, so +1 to creating a
>> JIRA issue. I'm not surprised that ant test did not get faster, as we rarely
>> call this method when running tests; maybe you can try to write a simple
>> micro benchmark using randomly generated TopDocs instances?
>>
>>
>> On Sun, Nov 1, 2015 at 23:39, Daniel Jeliński <[email protected]>
>> wrote:
>>
>>> Hello all,
>>> The function TopDocs.merge uses PriorityQueue in a pattern: pop, update
>>> value (ref.hitIndex++), add. JavaDocs for PriorityQueue.updateTop
>>> <http://grepcode.com/file/repo1.maven.org/maven2/org.apache.lucene/lucene-core/5.2.0/org/apache/lucene/util/PriorityQueue.java#204>
>>> say that using this function instead should be at least twice as fast.
>>> Would a patch like the one attached be acceptable? Should I create a JIRA
>>> issue for it?
>>> I tried comparing the time taken to run ant test before and after
>>> applying the patch, but it was affected more by random factors than by
>>> the patch, so I don't have any performance numbers showing whether or
>>> how much it changed. Is there any standard way of benchmarking?
>>> Regards,
>>> Daniel
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
>>
>>
>
