[
https://issues.apache.org/jira/browse/LUCENE-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669082#comment-13669082
]
Rob Audenaerde edited comment on LUCENE-5015 at 5/29/13 8:12 AM:
-----------------------------------------------------------------
Time in ms.
{noformat}
          MADQ                            75% hits
          Complements     DISABLE Com.    Complements     DISABLE complements
#facets   Takmi   Noop    Takmi   Noop    Takmi   Noop    Takmi   Noop
1           999    433     1024    393     1239    541      969    432
2          2292    388     2877    512     2379    609     2489    457
3          2501    219     3228    413     2477    569     2590    434
4          3589    224     5052    392     3372    562     4093    503
5          4764    247     6863    493     4356    577     5103    533
{noformat}
{noformat}
SamplingParams sampleParams = new SamplingParams();
sampleParams.setMaxSampleSize( 5000 );
sampleParams.setMinSampleSize( 5000 );
sampleParams.setSamplingThreshold( 75000 );
sampleParams.setOversampleFactor( 1.0d );
{noformat}
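To make the effect of these settings concrete, here is a minimal standalone sketch. It does not call Lucene. It assumes the sampler only samples result sets larger than the threshold and clamps the sample size between the minimum and maximum; the class name, method names, and the 0.01 sample ratio are hypothetical, chosen only for illustration. Under those assumptions, with min and max both set to 5000, every sampled query counts facets on exactly 5000 documents, whatever the number of hits.

```java
// Standalone sketch; does not use Lucene classes. All names are hypothetical.
final class SampleSizeSketch {

    /** Assumed semantics: sample only when the result set exceeds the threshold. */
    static boolean shouldSample(int numHits, int samplingThreshold) {
        return numHits > samplingThreshold;
    }

    /** Assumed semantics: numHits * sampleRatio, clamped to [minSampleSize, maxSampleSize]. */
    static int sampleSize(int numHits, double sampleRatio, int minSampleSize, int maxSampleSize) {
        int size = (int) (numHits * sampleRatio);
        size = Math.max(size, minSampleSize);
        return Math.min(size, maxSampleSize);
    }

    public static void main(String[] args) {
        // Values taken from the SamplingParams settings above.
        int min = 5000, max = 5000, threshold = 75000;
        double ratio = 0.01; // assumption: the settings above do not specify a ratio

        for (int hits : new int[] {50_000, 100_000, 1_000_000, 3_750_000}) {
            if (shouldSample(hits, threshold)) {
                System.out.println(hits + " hits -> sample of "
                        + sampleSize(hits, ratio, min, max) + " docs");
            } else {
                System.out.println(hits + " hits -> no sampling");
            }
        }
    }
}
```

If the real sampler behaves like this, the sample size cannot explain why the time grows with the number of facet requests, since it stays fixed at 5000.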
> Unexpected performance difference between SamplingAccumulator and
> StandardFacetAccumulator
> ------------------------------------------------------------------------------------------
>
> Key: LUCENE-5015
> URL: https://issues.apache.org/jira/browse/LUCENE-5015
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/facet
> Affects Versions: 4.3
> Reporter: Rob Audenaerde
> Assignee: Shai Erera
> Priority: Minor
> Attachments: LUCENE-5015.patch, LUCENE-5015.patch, LUCENE-5015.patch,
> LUCENE-5015.patch, LUCENE-5015.patch, LUCENE-5015.patch
>
>
> I have an unexpected performance difference between the SamplingAccumulator
> and the StandardFacetAccumulator.
> The case is an index with about 5M documents and each document containing
> about 10 fields. I created a facet on each of those fields. When searching to
> retrieve facet-counts (using 1 CountFacetRequest), the SamplingAccumulator is
> about twice as fast as the StandardFacetAccumulator. This is expected and a
> nice speed-up.
> However, when I use more CountFacetRequests to retrieve facet-counts for more
> than one field, the speed of the SamplingAccumulator decreases, to the point
> where the StandardFacetAccumulator is faster.
> {noformat}
> FacetRequests Sampling Standard
> 1 391 ms 1100 ms
> 2 531 ms 1095 ms
> 3 948 ms 1108 ms
> 4 1400 ms 1110 ms
> 5 1901 ms 1102 ms
> {noformat}
> Is this behaviour normal? I did not expect it, since the SamplingAccumulator
> should need to do less work.
> Some code to show what I do:
> {code}
> searcher.search( facetsQuery, facetsCollector );
> final List<FacetResult> collectedFacets =
> facetsCollector.getFacetResults();
> {code}
> {code}
> final FacetSearchParams facetSearchParams = new FacetSearchParams( facetRequests );
> FacetsCollector facetsCollector;
> if ( isSampled )
> {
>     // Sampled path: count facets on a random sample of the matching documents.
>     facetsCollector = FacetsCollector.create(
>         new SamplingAccumulator( new RandomSampler(), facetSearchParams, searcher.getIndexReader(), taxo ) );
> }
> else
> {
>     // Standard path: count facets over all matching documents.
>     facetsCollector = FacetsCollector.create(
>         FacetsAccumulator.create( facetSearchParams, searcher.getIndexReader(), taxo ) );
> }
> {code}
>
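The trend in the timings quoted in the description can be checked with a short standalone sketch. It only does arithmetic on the reported numbers; the class and method names are hypothetical and nothing here calls Lucene. It reports the sampling/standard ratio per row, the first request count at which sampling is slower, and the average extra time per additional request.

```java
// Standalone sketch over the timings quoted in the issue description.
// Class and method names are hypothetical; nothing here calls Lucene.
final class CrossoverSketch {
    static final int[] REQUESTS    = {1, 2, 3, 4, 5};
    static final int[] SAMPLING_MS = {391, 531, 948, 1400, 1901};
    static final int[] STANDARD_MS = {1100, 1095, 1108, 1110, 1102};

    /** First request count at which sampling is slower than standard, or -1 if never. */
    static int firstSlowerRequestCount() {
        for (int i = 0; i < REQUESTS.length; i++) {
            if (SAMPLING_MS[i] > STANDARD_MS[i]) {
                return REQUESTS[i];
            }
        }
        return -1;
    }

    /** Average extra time in ms per additional request, from the first row to the last. */
    static double avgExtraMsPerRequest(int[] timesMs) {
        int last = timesMs.length - 1;
        return (timesMs[last] - timesMs[0]) / (double) (REQUESTS[last] - REQUESTS[0]);
    }

    public static void main(String[] args) {
        for (int i = 0; i < REQUESTS.length; i++) {
            double ratio = SAMPLING_MS[i] / (double) STANDARD_MS[i];
            System.out.printf("%d request(s): sampling/standard = %.2f%n", REQUESTS[i], ratio);
        }
        System.out.println("Sampling first slower at " + firstSlowerRequestCount() + " requests");
        System.out.println("Sampling: +" + avgExtraMsPerRequest(SAMPLING_MS) + " ms per extra request");
        System.out.println("Standard: +" + avgExtraMsPerRequest(STANDARD_MS) + " ms per extra request");
    }
}
```

On the reported numbers, the sampling time grows by roughly 380 ms per extra request while the standard time stays essentially flat, so the cost looks tied to the number of requests rather than to the number of documents counted.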