[ 
https://issues.apache.org/jira/browse/SOLR-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249046#comment-15249046
 ] 

Hoss Man commented on SOLR-8988:
--------------------------------

You've convinced me that I don't understand the point behind that existing 
{{TODO: we could change this to 1...}} comment, but I still want to review the 
code more thoroughly before I'm confident enough to concede your approach is 
better in all cases.

That said: if you updated your patch to make it optional based on a param, 
w/some tests that randomly toggled the value (TestCloudPivotFacet and 
DistributedFacetPivotLongTailTest would be good ones), then I'd probably be game 
to commit even w/o being confident it's better in all cases, and we could worry 
about changing the default later.

bq. However I think this line block should also be changed.

Hmm, yeah ... that does smell like it could be optimized.

(FWIW: we have a TrackingShardHandlerFactory that can be used in tests to make 
assertions about what per-shard requests Solr triggers. That can be used along 
with some carefully crafted shards/docs/requests to verify that no unnecessary 
refinement is done in cases where you don't expect it -- like with this 
{{initialMincount}} vs {{initialMincount-1}} situation.)

> Improve facet.method=fcs performance in SolrCloud
> -------------------------------------------------
>
>                 Key: SOLR-8988
>                 URL: https://issues.apache.org/jira/browse/SOLR-8988
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Keith Laban
>         Attachments: SOLR-8988.patch
>
>
> This relates to SOLR-8559 -- which improves the algorithm used by fcs 
> faceting when {{facet.mincount=1}}.
> This patch allows {{facet.mincount}} to be sent as 1 for distributed queries. 
> As far as I can tell there is no reason to set {{facet.mincount=0}} for 
> refinement purposes. After trying to make sense of all the refinement logic, 
> I can't see how the difference between _no value_ and _value=0_ would have a 
> negative effect.
> *Test perf:*
> - ~15 million unique terms
> - query matches ~3 million documents
> *Params:*
> {code}
> facet.mincount=1
> facet.limit=500
> facet.method=fcs
> facet.sort=count
> {code}
> *Average Time Per Request:*
> - Before patch: ~20 seconds
> - After patch: <1 second
> *Note*: all tests pass and in my test, the output was identical before and 
> after patch.
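
The intuition in the issue description, that a shard gains nothing by reporting zero-count terms when the overall request uses {{facet.mincount=1}}, can be illustrated with a toy model. This is only a sketch and not Solr's code: the class name {{MincountMergeSketch}} and its {{merge}} method are hypothetical, and the model deliberately ignores {{facet.limit}} and refinement, which is exactly where the real risk discussed in the comment above would lie.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: hypothetical names, not Solr code.
// It ignores facet.limit and the refinement phase entirely.
class MincountMergeSketch {

    // Sum per-shard term counts, dropping any per-shard entry below
    // shardMincount before merging, then drop merged totals below
    // globalMincount.
    static Map<String, Integer> merge(List<Map<String, Integer>> shards,
                                      int shardMincount, int globalMincount) {
        Map<String, Integer> merged = new TreeMap<>();
        for (Map<String, Integer> shard : shards) {
            for (Map.Entry<String, Integer> e : shard.entrySet()) {
                if (e.getValue() >= shardMincount) {
                    merged.merge(e.getKey(), e.getValue(), Integer::sum);
                }
            }
        }
        merged.values().removeIf(total -> total < globalMincount);
        return merged;
    }

    public static void main(String[] args) {
        // Two made-up shards; "b" has no matching documents anywhere.
        List<Map<String, Integer>> shards = List.of(
            Map.of("a", 3, "b", 0, "c", 1),
            Map.of("a", 0, "b", 0, "c", 2, "d", 5));

        // Shards report zero counts (per-shard mincount=0).
        System.out.println(merge(shards, 0, 1)); // prints {a=3, c=3, d=5}
        // Shards omit zero counts (per-shard mincount=1).
        System.out.println(merge(shards, 1, 1)); // prints {a=3, c=3, d=5}
    }
}
```

In this simplified setting the merged result is the same either way, because a zero count contributes nothing to a sum. Whether the same holds once {{facet.limit}} and refinement requests are involved is what the tests suggested above would need to establish.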


