[ https://issues.apache.org/jira/browse/SOLR-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16022210#comment-16022210 ]
ASF subversion and git services commented on SOLR-10634:
--------------------------------------------------------
Commit 4f86bf6df8670c3f6d9ceb458be9f14df28b8aeb in lucene-solr's branch
refs/heads/branch_6x from [[email protected]]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4f86bf6 ]
SOLR-10634: calc metrics in first phase if limit=-1 and no subfacets
> Move calculation of some aggregations to collection phase
> ---------------------------------------------------------
>
> Key: SOLR-10634
> URL: https://issues.apache.org/jira/browse/SOLR-10634
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Facet Module
> Reporter: Yonik Seeley
> Attachments: SOLR-10634.patch, SOLR-10634.patch
>
>
> From http://markmail.org/message/pwgnt7iqxkzcnckh
> {quote}
> The current code is more optimized for finding the top K buckets from
> a total of N.
> When one asks to return the top 10 buckets when there are potentially
> millions of buckets, it makes sense to defer calculating other metrics
> for those buckets until we know which ones they are. After we
> identify the top 10 buckets, we calculate the domain for that bucket
> and use that to calculate the remaining metrics.
> The current method is obviously much slower when one is requesting
> *all* buckets. We might as well just calculate all metrics in the
> first pass rather than trying to defer them.
> {quote}
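
The trade-off described in the quote can be illustrated with a toy sketch. This is only a model of the idea over in-memory dictionaries, not Solr's facet code: the names `DOCS`, `two_phase`, and `single_pass` and the field names are assumptions made up for illustration.

```python
from collections import Counter, defaultdict
import heapq

# Toy documents; the field names are hypothetical.
DOCS = [
    {"cat": "a", "price": 10.0},
    {"cat": "a", "price": 30.0},
    {"cat": "b", "price": 5.0},
    {"cat": "c", "price": 7.0},
    {"cat": "a", "price": 20.0},
    {"cat": "b", "price": 15.0},
]


def two_phase(docs, field, metric_field, k):
    """Phase 1 counts every bucket; phase 2 computes the metric for the top k only."""
    counts = Counter(d[field] for d in docs)
    top = heapq.nlargest(k, counts.items(), key=lambda kv: kv[1])
    result = {}
    for bucket, count in top:
        # One domain (the set of matching documents) is rebuilt per surviving bucket.
        domain = [d for d in docs if d[field] == bucket]
        result[bucket] = {"count": count, "sum": sum(d[metric_field] for d in domain)}
    return result


def single_pass(docs, field, metric_field):
    """Counts and the metric are accumulated for all buckets in one scan."""
    counts = Counter()
    sums = defaultdict(float)
    for d in docs:
        counts[d[field]] += 1
        sums[d[field]] += d[metric_field]
    return {b: {"count": counts[b], "sum": sums[b]} for b in counts}
```

When `k` is small relative to the number of buckets, `two_phase` rebuilds only a few domains. When every bucket is requested, it rebuilds one domain per bucket, while `single_pass` produces the same result in a single scan.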
> So we should move aggregations from the second pass to the first pass under
> the following conditions:
> - no limit (or a high limit compared to the number of potential buckets?)
> - no sub-facets (or anything else) that will need the domain calculated anyway
> - the aggregation is not memory-intensive per slot (moving a calculation from
> per-bucket in the second phase to all-buckets-in-parallel in the first phase
> could be very bad for peak memory usage)
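
The three conditions listed above could be expressed as a simple predicate. The following is a minimal sketch under stated assumptions: `FacetRequest`, `MEMORY_HEAVY`, and `first_phase_aggs` are hypothetical names rather than Solr classes, and treating `unique`, `hll`, and `percentile` as the memory-heavy aggregations is an assumption chosen only to make the example concrete.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FacetRequest:
    """Hypothetical stand-in for a terms-facet request."""
    limit: int = 10                                            # -1 means "return all buckets"
    sub_facets: List[str] = field(default_factory=list)        # nested facets that need a bucket domain
    aggregations: Dict[str, str] = field(default_factory=dict) # name -> function, e.g. "avg(price)"


# Assumption: aggregations whose per-slot state is large enough to hurt peak memory.
MEMORY_HEAVY = ("unique", "hll", "percentile")


def first_phase_aggs(req: FacetRequest) -> List[str]:
    """Return the names of aggregations that could move to the first (collection) phase."""
    if req.limit != -1:
        return []   # a top-K request: defer metrics until the top buckets are known
    if req.sub_facets:
        return []   # the bucket domain is needed anyway
    return [name for name, fn in req.aggregations.items()
            if not fn.startswith(MEMORY_HEAVY)]
```

For example, a request with `limit=-1`, no sub-facets, and aggregations `avg(price)` and `unique(user)` would, under these assumptions, move only the `avg(price)` aggregation to the first phase.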