[
https://issues.apache.org/jira/browse/SOLR-8768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15180181#comment-15180181
]
Pablo Anzorena commented on SOLR-8768:
--------------------------------------
[~mariusneo] You are right: with the sample I posted, the error can't be
reproduced.
Now with real data (the cardinality of seller_name is around 2000) this is the
response if I ask for the top 3:
{code}
{
  "responseHeader": {
    "status": 0,
    "QTime": 1992,
    "params": {
      "q": "*:*",
      "shards": "localhost:8983/solr/sellers_2005,localhost:8983/solr/sellers_2006,localhost:8983/solr/sellers_2007",
      "json.facet": "{\n top_sellers: {\n type: terms,\n field: seller_name,\n limit: 3,\n offset: 0,\n sort: \"seller_measure desc\",\n facet: {\n seller_measure: \"sum(seller_measure)\"\n }\n }\n}",
      "rows": "0",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 94641193,
    "start": 0,
    "maxScore": 1.0,
    "docs": []
  },
  "facets": {
    "count": 94641193,
    "top_sellers": {
      "buckets": [
        {
          "val": "Tyrion",
          "count": 22067,
          "seller_measure": 6.381640740799999E8
        },
        {
          "val": "Jon",
          "count": 9323,
          "seller_measure": 4.376016594200097E8
        },
        {
          "val": "PoorNed",
          "count": 3714,
          "seller_measure": 2.1381292140000007E8
        }
      ]
    }
  }
}
{code}
Now look at what happens when I change the query to filter specifically on
those three seller_names:
{code}
{
  "responseHeader": {
    "status": 0,
    "QTime": 26,
    "params": {
      "q": "seller_name:(Tyrion Jon PoorNed)",
      "shards": "localhost:8983/solr/sellers_2005,localhost:8983/solr/sellers_2006,localhost:8983/solr/sellers_2007",
      "json.facet": "{\n top_sellers: {\n type: terms,\n field: seller_name,\n limit: 3,\n offset: 0,\n sort: \"seller_measure desc\",\n facet: {\n seller_measure: \"sum(seller_measure)\"\n }\n }\n}",
      "rows": "0",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 37552,
    "start": 0,
    "maxScore": 2.4321828,
    "docs": []
  },
  "facets": {
    "count": 37552,
    "top_sellers": {
      "buckets": [
        {
          "val": "Tyrion",
          "count": 24515,
          "seller_measure": 6.436709089399998E8
        },
        {
          "val": "Jon",
          "count": 9323,
          "seller_measure": 4.376016594200096E8
        },
        {
          "val": "PoorNed",
          "count": 3714,
          "seller_measure": 2.1381292140000007E8
        }
      ]
    }
  }
}
{code}
See the difference in Tyrion's count (22067 vs. 24515) and seller_measure? I
think this happens because, when ranking by seller_measure desc, Tyrion sits
around position 1000 in the shards sellers_2005 and sellers_2006, so those
shards do not return his bucket and their contribution is missing from the
total.
If I make the same request with limit 2000, Tyrion appears in the top 3 with
the correct measure, that is, the sum across the three shards.
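To illustrate what I suspect is going on, here is a minimal Python sketch of a
simplified merge model. It is not Solr's actual implementation: the functions
shard_top and merged_top and the per_shard_limit parameter are hypothetical
names, and the model assumes each shard returns only its own top buckets with
no refinement step afterwards. The numbers are the toy values from the issue
description.

```python
from collections import defaultdict

# Per-shard sums of seller_measure, taken from the toy data in the issue description.
SHARDS = {
    "sellers_2014": {"Tyrion": 1.0, "Jon": 50.0, "PoorNed": 4.0},
    "sellers_2015": {"Tyrion": 100.0, "Jon": 50.0, "PoorNed": 4.0},
    "sellers_2016": {"Tyrion": 1.0, "Jon": 50.0, "PoorNed": 4.0},
}


def shard_top(buckets, per_shard_limit):
    """Return one shard's buckets sorted by measure (descending), truncated."""
    ranked = sorted(buckets.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:per_shard_limit]


def merged_top(shards, limit, per_shard_limit):
    """Sum the truncated per-shard lists (assumption: no refinement step)."""
    totals = defaultdict(float)
    for buckets in shards.values():
        for name, measure in shard_top(buckets, per_shard_limit):
            totals[name] += measure
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:limit]


# Each shard returns only its top 2: Tyrion is dropped by two of the shards.
truncated = merged_top(SHARDS, limit=2, per_shard_limit=2)
# Each shard returns all 3 sellers: every contribution is counted.
complete = merged_top(SHARDS, limit=2, per_shard_limit=3)

print(truncated)
print(complete)
```

In this model Tyrion still makes the merged top 2, but only with the measure
from the one shard where he ranks high. Raising the per-shard limit restores
the full sum, which matches what I see when I request limit 2000.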
> Wrong behaviour in json facets
> ------------------------------
>
> Key: SOLR-8768
> URL: https://issues.apache.org/jira/browse/SOLR-8768
> Project: Solr
> Issue Type: Bug
> Components: Facet Module
> Reporter: Pablo Anzorena
>
> This bug is quite difficult to explain, so I will first show it with an
> example and then explain it.
> I have a core split into three shards, let's call them 'sellers_2014',
> 'sellers_2015', 'sellers_2016'.
> The schema has the following fields:
> seller_name, string
> seller_measure, double
> seller_date, date
> With the following data.
> 'sellers_2014'
> Tyrion, 1, 2014-01-01T00:00:00Z
> Jon, 50, 2014-01-01T00:00:00Z
> PoorNed, 4, 2014-01-01T00:00:00Z
> 'sellers_2015'
> Tyrion, 100, 2015-01-01T00:00:00Z
> Jon, 50, 2015-01-01T00:00:00Z
> PoorNed, 4, 2015-01-01T00:00:00Z
> 'sellers_2016'
> Tyrion, 1, 2016-01-01T00:00:00Z
> Jon, 50, 2016-01-01T00:00:00Z
> PoorNed, 4, 2016-01-01T00:00:00Z
> Request:
> http://localhost:8983/solr/sellers_2016/select?q=*:*&shards=localhost:8983/solr/sellers_2014,localhost:8983/solr/sellers_2015,localhost:8983/solr/sellers_2016&json.facet=
> {code}
> {
>   top_sellers: {
>     type: terms,
>     field: seller_name,
>     limit: 2,
>     offset: 0,
>     sort: "seller_measure desc",
>     facet: {
>       seller_measure: "sum(seller_measure)",
>       seller_dates: {
>         type: range,
>         field: seller_date,
>         start: "2014-01-01T00:00:00Z",
>         end: "2016-12-31T00:00:00Z",
>         gap: "+1YEARS",
>         facet: {
>           seller_measure: "sum(seller_measure)"
>         }
>       }
>     }
>   }
> }
> {code}
> With this request I want to know the top 2 sellers across the three
> shards and, for each seller, their seller_measure for each year.
> The response I'm getting is:
> {code}
> "val": "Jon",
> "count": 3,
> "seller_measure": 150,
> "seller_dates": {
>   "buckets": [
>     {
>       "val": "2014-01-01T00:00:00Z",
>       "count": 1,
>       "seller_measure": 50
>     },
>     {
>       "val": "2015-01-01T00:00:00Z",
>       "count": 1,
>       "seller_measure": 50
>     },
>     {
>       "val": "2016-01-01T00:00:00Z",
>       "count": 1,
>       "seller_measure": 50
>     }
>   ]
> },
> "val": "Tyrion",
> "count": 3,
> "seller_measure": 102,
> "seller_dates": {
>   "buckets": [
>     {
>       "val": "2015-01-01T00:00:00Z",
>       "count": 1,
>       "seller_measure": 100
>     }
>   ]
> }
> {code}
> which is incorrect, because the 2014 and 2016 buckets for Tyrion are
> missing.