[ 
https://issues.apache.org/jira/browse/SOLR-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15170752#comment-15170752
 ] 

Yonik Seeley commented on SOLR-8741:
------------------------------------

IIRC, numBuckets uses the same estimation algorithm as "unique" (described 
here: http://yonik.com/solr-count-distinct/), which predates the addition of 
hyperloglog.

We should probably add some way to use hll for numBuckets as well, but for now 
you may be able to work around it by using hll directly yourself.

Example:
{code}
json.facet={
  numCat:"hll(cat)",
  categories: {
    type : terms,
    field : cat
  }
}
{code}

That should work for the common case, but not for cases such as mincount=N 
(where N>1), or for domain-switching techniques like block join.
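
For anyone who wants to try the workaround from a script, here is a rough sketch of building that request in Python. The host, port, and collection name ("mycollection") are made up for illustration, as are the helper name and the q/rows values; only the json.facet body mirrors the example above.

```python
import json
from urllib.parse import urlencode


def build_facet_params(field, hll_key="numCat", terms_key="categories"):
    """Build query params asking for an hll() estimate next to a terms facet.

    build_facet_params is a hypothetical helper, not part of any Solr client.
    """
    facet = {
        # hll(<field>) estimates the number of distinct values of the field
        hll_key: f"hll({field})",
        # the regular terms facet whose bucket count we want to cross-check
        terms_key: {"type": "terms", "field": field},
    }
    # q and rows are assumptions: match everything, return no documents
    return {"q": "*:*", "rows": 0, "json.facet": json.dumps(facet)}


params = build_facet_params("cat")

# Hypothetical host and collection; adjust to your own setup.
url = "http://localhost:8983/solr/mycollection/select?" + urlencode(params)
print(url)
```

The estimate should then come back under the same key ("numCat") in the facets section of the response, next to the "categories" buckets.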

> Json Facet API, numBuckets not returning real number of buckets.
> ----------------------------------------------------------------
>
>                 Key: SOLR-8741
>                 URL: https://issues.apache.org/jira/browse/SOLR-8741
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Pablo Anzorena
>
> Hi, using the json facet api I realized that the numBuckets is wrong. It is 
> not returning the right number of buckets. I have a dimension which 
> numBuckets says it has 1340, but when retrieving all the results it brings 
> 988. 
> FYI the field is of type string.
> Thanks.


