[ https://issues.apache.org/jira/browse/SOLR-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249046#comment-15249046 ]
Hoss Man commented on SOLR-8988:
--------------------------------
You've convinced me that I don't understand the point behind the existing
{{TODO: we could change this to 1...}} comment, but I still want to review the
code more thoroughly before I'm confident enough to concede that your approach
is better in all cases.
That said: if you updated your patch to make the change optional based on a
param, with some tests that randomly toggle the value (TestCloudPivotFacet and
DistributedFacetPivotLongTailTest would be good ones), then I'd probably be game
to commit even without being confident it's better in all cases, and we could
worry about changing the default later.
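To make the "optional based on a param" idea concrete, here is a minimal standalone sketch. It is not the code from the attached patch: the parameter name {{facet.example.sendMincount}} and the helper {{initialMincount}} are hypothetical, and the logic only mirrors what the issue description says (shards are currently asked with {{facet.mincount=0}}; the patch would let 1 be sent instead).

```java
import java.util.Random;

final class ShardMincountSketch {
    // Hypothetical parameter name, used only for this illustration.
    static final String SEND_MINCOUNT_PARAM = "facet.example.sendMincount";

    /**
     * Picks the facet.mincount to send in the initial per-shard request.
     * Assumption: with the option off, shards are asked with mincount=0,
     * which is the existing behavior described in the issue.
     */
    static int initialMincount(int requestedMincount, boolean sendMincount) {
        if (!sendMincount) {
            return 0;
        }
        // Cap at 1: a term below the requested mincount on a single shard
        // can still reach it once the per-shard counts are merged.
        return Math.min(requestedMincount, 1);
    }

    public static void main(String[] args) {
        // Randomly toggle the option, the way a randomized test might.
        boolean sendMincount = new Random().nextBoolean();
        System.out.println(SEND_MINCOUNT_PARAM + "=" + sendMincount
                + " -> initial mincount " + initialMincount(1, sendMincount));
    }
}
```

A randomized test would run the same request with the option on and off and assert that the merged facet counts are identical in both cases.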
bq. However I think this line block should also be changed.
Hmm, yeah ... that does smell like it could be optimized.
(FWIW: we have a TrackingShardHandlerFactory that can be used in tests to make
assertions about which per-shard requests Solr triggers. That can be used along
with some carefully crafted shards/docs/requests to verify that no unnecessary
refinement is done in cases where you don't expect it, like with this
{{initialMincount}} vs {{initialMincount-1}} situation.)
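The shape of such an assertion can be sketched without the Solr test framework. The class below is a hypothetical stand-in, not the TrackingShardHandlerFactory API: it records which shard received which kind of request, so a test can assert that no refinement requests were sent.

```java
import java.util.ArrayList;
import java.util.List;

final class ShardRequestLogSketch {
    // Hypothetical request kinds for this illustration only.
    enum Purpose { INITIAL_FACETS, REFINE_FACETS }

    // One recorded per-shard request.
    static final class Recorded {
        final String shard;
        final Purpose purpose;

        Recorded(String shard, Purpose purpose) {
            this.shard = shard;
            this.purpose = purpose;
        }
    }

    private final List<Recorded> requests = new ArrayList<>();

    // Called by the (hypothetical) test harness for every request sent to a shard.
    void record(String shard, Purpose purpose) {
        requests.add(new Recorded(shard, purpose));
    }

    // Number of recorded requests with the given purpose.
    long count(Purpose purpose) {
        return requests.stream().filter(r -> r.purpose == purpose).count();
    }

    public static void main(String[] args) {
        ShardRequestLogSketch log = new ShardRequestLogSketch();
        log.record("shard1", Purpose.INITIAL_FACETS);
        log.record("shard2", Purpose.INITIAL_FACETS);
        if (log.count(Purpose.REFINE_FACETS) != 0) {
            throw new AssertionError("unexpected refinement request");
        }
        System.out.println("no refinement requests recorded");
    }
}
```

In a real test the recording would come from the tracking shard handler rather than from manual {{record}} calls; only the final "zero refinement requests" assertion carries over.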
> Improve facet.method=fcs performance in SolrCloud
> -------------------------------------------------
>
> Key: SOLR-8988
> URL: https://issues.apache.org/jira/browse/SOLR-8988
> Project: Solr
> Issue Type: Improvement
> Reporter: Keith Laban
> Attachments: SOLR-8988.patch
>
>
> This relates to SOLR-8559, which improves the algorithm used by fcs
> faceting when {{facet.mincount=1}}.
> This patch allows {{facet.mincount}} to be sent as 1 for distributed queries.
> As far as I can tell there is no reason to set {{facet.mincount=0}} for
> refinement purposes. After trying to make sense of all the refinement logic,
> I can't see how the difference between _no value_ and _value=0_ would have a
> negative effect.
> *Test perf:*
> - ~15 million unique terms
> - query matches ~3 million documents
> *Params:*
> {code}
> facet.mincount=1
> facet.limit=500
> facet.method=fcs
> facet.sort=count
> {code}
> *Average Time Per Request:*
> - Before patch: ~20 seconds
> - After patch: <1 second
> *Note*: all tests pass, and in my test the output was identical before and
> after the patch.