[
https://issues.apache.org/jira/browse/SOLR-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated SOLR-1782:
---------------------------
Summary: stats.facet assumes FieldCache.StringIndex - fails horribly on
multivalued fields (was: unexpected statscomponent values)
Description: the StatsComponent assumes any field specified in the
stats.facet param can be faceted using FieldCache.DEFAULT.getStringIndex. This
can cause problems with a variety of field types, but in the case of
multivalued fields it can either cause erroneous stats when the number of
distinct values is small, or it can cause ArrayIndexOutOfBoundsException when
the number of distinct values is greater than the number of documents. (was: I
wanted to understand the statscomponent better, so I set up a simple test index
with a few thousand docs. In my schema I have:
- an indexed multivalued sint field (StatsFacetField) that can contain values 0
through 5 and that I want to use as my stats.facet field.
- an indexed single-valued sint field (ValueOfOneField) that will always contain
the value 1 and that I want stats on for this test
When I execute the following query:
http://localhost:8080/solr/select?q=*:*&stats=true&stats.field=ValueOfOneField&stats.facet=StatsFacetField&rows=0&facet=on&facet.limit=10&facet.field=StatsFacetField
For this situation (*:*) I was expecting the statscomponent Count/Sum
values for each possible value in StatsFacetField to match the facet values for
StatsFacetField. They don't. Some are close (e.g. 204 vs 214) while others are
way off (e.g. 230 vs 8000))
Updating issue summary and description based on the root cause
> stats.facet assumes FieldCache.StringIndex - fails horribly on multivalued
> fields
> ---------------------------------------------------------------------------------
>
> Key: SOLR-1782
> URL: https://issues.apache.org/jira/browse/SOLR-1782
> Project: Solr
> Issue Type: Bug
> Components: search
> Affects Versions: 1.4
> Environment: reproduced on Win2k3 using 1.5.0-dev solr ($Id:
> CHANGES.txt 906924 2010-02-05 12:43:11Z noble $)
> Reporter: Gerald DeConto
> Attachments: index.rar
>
>
> the StatsComponent assumes any field specified in the stats.facet param can
> be faceted using FieldCache.DEFAULT.getStringIndex. This can cause problems
> with a variety of field types, but in the case of multivalued fields it can
> either cause erroneous stats when the number of distinct values is
> small, or it can cause ArrayIndexOutOfBoundsException when the number of
> distinct values is greater than the number of documents.
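
The miscounting described above can be illustrated with a small standalone
sketch. This is not Lucene or Solr code. It assumes a StringIndex-style
structure that stores exactly one term ordinal per document, so that a later
term overwrites an earlier one for the same document. The class and method
names (StringIndexSketch, singleValuedOrds, and so on) are hypothetical and
exist only for this illustration.

```java
import java.util.Arrays;
import java.util.List;

public class StringIndexSketch {

    // Simplified model of a one-ordinal-per-document index (an assumption,
    // not Lucene's implementation): terms are visited in sorted order and
    // each document keeps only the last term that matched it.
    static int[] singleValuedOrds(List<List<Integer>> docs, int numTerms) {
        int[] ord = new int[docs.size()];
        Arrays.fill(ord, -1); // -1 means "no term seen for this document"
        for (int term = 0; term < numTerms; term++) {
            for (int doc = 0; doc < docs.size(); doc++) {
                if (docs.get(doc).contains(term)) {
                    ord[doc] = term; // overwrites any earlier term for this doc
                }
            }
        }
        return ord;
    }

    // Per-term document counts as seen through the one-ordinal-per-doc array.
    static int[] countsFromOrds(int[] ord, int numTerms) {
        int[] counts = new int[numTerms];
        for (int o : ord) {
            if (o >= 0) {
                counts[o]++;
            }
        }
        return counts;
    }

    // Per-term document counts when every value of every document is counted,
    // which is what a facet count on a multivalued field reports.
    static int[] trueFacetCounts(List<List<Integer>> docs, int numTerms) {
        int[] counts = new int[numTerms];
        for (List<Integer> values : docs) {
            for (int term = 0; term < numTerms; term++) {
                if (values.contains(term)) {
                    counts[term]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Four documents with a multivalued field holding values 0..2.
        List<List<Integer>> docs = Arrays.asList(
                Arrays.asList(0, 1),
                Arrays.asList(1, 2),
                Arrays.asList(0, 2),
                Arrays.asList(2));
        int numTerms = 3;

        int[] ord = singleValuedOrds(docs, numTerms);
        System.out.println(Arrays.toString(countsFromOrds(ord, numTerms)));   // [0, 1, 3]
        System.out.println(Arrays.toString(trueFacetCounts(docs, numTerms))); // [2, 2, 3]
    }
}
```

In this model only the highest value of each document survives, so the
per-value counts agree with the facet counts for some values and diverge for
others, which is consistent with the mismatch reported in the original
description.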
--
This message is automatically generated by JIRA.