Michael Gibney created SOLR-16144:
-------------------------------------

Summary: Don't internally round [foreground|background]_popularity values in RelatednessAgg
Key: SOLR-16144
URL: https://issues.apache.org/jira/browse/SOLR-16144
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Components: Facet Module
Affects Versions: main (10.0)
Reporter: Michael Gibney

The "relatedness" facet function supports the concept of {{foreground_popularity}} and {{background_popularity}} -- i.e., the cardinality of the intersection of the bucket domain with the foreground and background sets (respectively), each normalized with respect to background set cardinality. The intent appears to be twofold:

# To provide clients with context for computed relatedness values
# To preemptively (optionally) screen out "noise" from low-frequency terms via the {{min_popularity}} function parameter.

For both purposes, popularity values are currently rounded to 5 digits.

This issue proposes that although rounding to 5 digits makes sense for the _first_ case (providing context to clients), this arbitrary truncation does not make sense, as currently implemented, for internally evaluating threshold popularity values for bucket inclusion.

Consider the case of a high-cardinality field with a relatively large background set and a selective foreground set. For {{|background_set| = 2,000,000}} and a foreground set of cardinality 9, the largest possible foreground popularity is 9 / 2,000,000 = 0.0000045, which rounds to 0.00000. So even a bucket with a domain that exactly matches the foreground set would be screened out, for _any_ explicit nonzero setting of {{min_popularity}}. This behavior is due to where the rounding takes place (internally, upon initial {{computeDerivedValues()}}).

It is further problematic that {{RelatednessAgg}} will currently accept {{min_popularity < 0.00001}}, which would be guaranteed to exclude _all_ buckets.
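
The arithmetic in the high-cardinality example can be checked with a short sketch. This is an illustration in Python, not Solr's Java code: the function names ({{round_to_5_digits}}, {{popularity}}, {{passes_min_popularity}}) are hypothetical, the rounding mode is assumed to be ordinary rounding to 5 decimal places, and the foreground set is assumed to be a subset of the background set, so a bucket matching the foreground exactly has both counts equal to 9.

```python
def round_to_5_digits(value: float) -> float:
    # Assumption: plain rounding to 5 decimal places; the real helper's
    # rounding mode may differ, but not for the value used below.
    return round(value, 5)


def popularity(count: int, background_size: int) -> float:
    # Both popularities are normalized by the BACKGROUND set cardinality.
    return count / background_size


def passes_min_popularity(fg_count: int, bg_count: int, background_size: int,
                          min_popularity: float, round_internally: bool) -> bool:
    """Return True if the bucket survives the min_popularity screen."""
    fg_pop = popularity(fg_count, background_size)
    bg_pop = popularity(bg_count, background_size)
    if round_internally:
        # Behavior described in this issue: round before comparing.
        fg_pop = round_to_5_digits(fg_pop)
        bg_pop = round_to_5_digits(bg_pop)
    return fg_pop >= min_popularity and bg_pop >= min_popularity


BACKGROUND_SIZE = 2_000_000
FOREGROUND_SIZE = 9

raw_pop = popularity(FOREGROUND_SIZE, BACKGROUND_SIZE)
rounded_pop = round_to_5_digits(raw_pop)

# A bucket whose domain exactly matches the foreground set.
kept_when_rounded = passes_min_popularity(
    FOREGROUND_SIZE, FOREGROUND_SIZE, BACKGROUND_SIZE,
    min_popularity=0.000001, round_internally=True)
kept_when_unrounded = passes_min_popularity(
    FOREGROUND_SIZE, FOREGROUND_SIZE, BACKGROUND_SIZE,
    min_popularity=0.000001, round_internally=False)

print(raw_pop, rounded_pop, kept_when_rounded, kept_when_unrounded)
```

Under these assumptions the rounded popularity collapses to zero, so the bucket fails the threshold when the comparison uses rounded values and passes when it uses unrounded ones. Rounding only the values reported to clients would keep the first purpose intact without affecting the second.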