[ 
https://issues.apache.org/jira/browse/SOLR-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877865#action_12877865
 ] 

Wojtek Piaseczny commented on SOLR-1782:
----------------------------------------

I'd like to contribute to solving this issue, but I'm not sure if I'm going 
down the right path. Here are the possible solutions I see:

1. Use UninvertedField for multi-valued facets in the StatsComponent. This 
would require a new method in UninvertedField, something like getValues(int 
docID). The problem is the big terms collection in UninvertedField: getting 
all values for a single document via big terms is expensive, because the 
entire collection has to be iterated. 
2. Get the facet values for the result set in the StatsComponent, then for 
each value get a new document set, and iterate through each document in that 
set to calculate stats. This also sounds expensive.
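
To make option 2 concrete, here is a minimal sketch over plain in-memory 
collections. It is not Solr code: the class FacetStatsSketch, the method 
statsByFacetValue and the three map/set parameters are hypothetical stand-ins 
for the result DocSet, the per-term document sets and the stats field values. 
It only shows the shape of the nested iteration that makes this approach look 
expensive.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch of option 2; none of these names are Solr APIs.
final class FacetStatsSketch {

    /** Running statistics for the documents under one facet value. */
    static final class Stats {
        long count;
        double sum;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;

        void add(double v) {
            count++;
            sum += v;
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
    }

    /**
     * For every facet value, walk its document set, keep only documents that
     * are in the result set, and accumulate the stats field value of each.
     * A document with several facet values is counted once under each value.
     */
    static Map<String, Stats> statsByFacetValue(
            Set<Integer> resultDocs,
            Map<String, Set<Integer>> docsByFacetValue,
            Map<Integer, Double> statValueByDoc) {
        Map<String, Stats> out = new TreeMap<>();
        for (Map.Entry<String, Set<Integer>> e : docsByFacetValue.entrySet()) {
            Stats stats = new Stats();
            for (Integer doc : e.getValue()) {
                if (!resultDocs.contains(doc)) {
                    continue; // document did not match the query
                }
                Double v = statValueByDoc.get(doc);
                if (v != null) {
                    stats.add(v);
                }
            }
            if (stats.count > 0) {
                out.put(e.getKey(), stats);
            }
        }
        return out;
    }
}
```

The cost grows with the number of distinct facet values times the size of 
each value's document set, which is the expense mentioned in option 2.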

Are there better options? 

> stats.facet assumes FieldCache.StringIndex - fails horribly on multivalued 
> fields
> ---------------------------------------------------------------------------------
>
>                 Key: SOLR-1782
>                 URL: https://issues.apache.org/jira/browse/SOLR-1782
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 1.4
>         Environment: reproduced on Win2k3 using 1.5.0-dev solr ($Id: 
> CHANGES.txt 906924 2010-02-05 12:43:11Z noble $)
>            Reporter: Gerald DeConto
>         Attachments: index.rar, SOLR-1782.test.patch
>
>
> The StatsComponent assumes any field specified in the stats.facet param can 
> be faceted using FieldCache.DEFAULT.getStringIndex.  This can cause problems 
> with a variety of field types, but in the case of multivalued fields it can 
> either produce erroneous stats when the number of distinct values is 
> small, or throw an ArrayIndexOutOfBoundsException when the number of 
> distinct values is greater than the number of documents.
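
The erroneous-stats case can be illustrated with a toy model of a 
one-slot-per-document index. This is an assumption-laden sketch, not the 
FieldCache implementation: SingleSlotIndexSketch and buildSingleSlot are 
hypothetical names, and the postings map stands in for a term-to-documents 
enumeration. It only shows that when each document has a single slot, a later 
term overwrites an earlier one for a multivalued document.

```java
import java.util.Map;
import java.util.SortedMap;

// Toy model only; not Lucene or Solr code.
final class SingleSlotIndexSketch {

    /**
     * Visits terms in sorted order and records, for each document, the term
     * seen for it. With one slot per document, only the last term survives.
     */
    static String[] buildSingleSlot(int maxDoc, SortedMap<String, int[]> postings) {
        String[] termByDoc = new String[maxDoc];
        for (Map.Entry<String, int[]> e : postings.entrySet()) {
            for (int doc : e.getValue()) {
                termByDoc[doc] = e.getKey(); // overwrites any earlier term
            }
        }
        return termByDoc;
    }
}
```

In this model a document holding both "blue" and "red" ends up attributed to 
"red" alone, so any per-value stats computed from the slots miss its "blue" 
contribution.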

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

