[
https://issues.apache.org/jira/browse/SOLR-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926290#action_12926290
]
Yonik Seeley commented on SOLR-2068:
------------------------------------
Going back over my old notes on how to efficiently do a string field
per-segment:
Phase 1:
- Basically, hash based on ord (or use a direct index lookup if the # of ords
is small enough). We don't look up the value of the string at this point.
- When the segment changes, we need to convert the ords from the old segment
to the new segment (i.e. look up each group's value in the old segment, and
find the ord of that value in the new segment).
- If the group value is not found in the new segment, then remove it from the
hash. Keep it in the ordered map since it can still be pushed out by other
insertions.
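A rough sketch of the Phase 1 segment-change step, not actual Solr/Lucene code. It assumes each segment's terms are available as a sorted list, so that an ord is just an index into that list. The names find_ord and remap_groups are made up for illustration.

```python
from bisect import bisect_left

def find_ord(sorted_terms, value):
    """Return the ord (index) of value in a segment's sorted terms, or -1."""
    i = bisect_left(sorted_terms, value)
    if i < len(sorted_terms) and sorted_terms[i] == value:
        return i
    return -1

def remap_groups(ord_to_group, old_terms, new_terms):
    """Re-key the ord -> group hash when moving to a new segment.

    Groups whose value is absent from the new segment are left out of the
    returned hash. The caller's ordered map is not touched here, so those
    groups can still be pushed out by later insertions.
    """
    remapped = {}
    for old_ord, group in ord_to_group.items():
        value = old_terms[old_ord]           # only now do we read the string
        new_ord = find_ord(new_terms, value)
        if new_ord >= 0:
            remapped[new_ord] = group
    return remapped

if __name__ == "__main__":
    old_terms = ["apple", "banana", "cherry"]
    new_terms = ["banana", "cherry", "date"]
    print(remap_groups({0: "group-apple", 2: "group-cherry"}, old_terms, new_terms))
```

Within a segment, collection would only touch the ord-keyed hash. The string lookup and the binary search happen once per group per segment change.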
Phase 2:
- At the start of each segment, look up the ords for the group values and hash
each group based on that ord (or leave it out of the hash if its value doesn't
exist in that segment).
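A rough sketch of the Phase 2 idea, under the same assumption that a segment's terms are a sorted list and an ord is an index into it. The names build_ord_table and collect_segment, and the doc_ords list holding one ord per document, are made up for illustration and are not Solr/Lucene APIs.

```python
from bisect import bisect_left

def build_ord_table(group_values, segment_terms):
    """At the start of a segment, map ord -> group value for groups present in it."""
    table = {}
    for value in group_values:
        i = bisect_left(segment_terms, value)
        if i < len(segment_terms) and segment_terms[i] == value:
            table[i] = value
        # Values missing from this segment are simply left out of the table.
    return table

def collect_segment(table, doc_ords, groups):
    """Add each doc to its group using only the per-document ord."""
    for doc, doc_ord in enumerate(doc_ords):
        value = table.get(doc_ord)
        if value is not None:
            groups[value].append(doc)

if __name__ == "__main__":
    groups = {"banana": [], "date": []}
    table = build_ord_table(groups.keys(), ["apple", "banana", "cherry"])
    collect_segment(table, [0, 1, 1, 2], groups)
    print(table, groups)
```

The string comparisons are confined to the per-segment setup. Per-document work is a single lookup by ord.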
Martijn's optimization in SOLR-2205 probably made Phase 1 less important
(except if there are very few unique groups), so perhaps we should start with
Phase 2.
> Search Grouping: collapse by string specialization
> --------------------------------------------------
>
> Key: SOLR-2068
> URL: https://issues.apache.org/jira/browse/SOLR-2068
> Project: Solr
> Issue Type: Sub-task
> Reporter: Yonik Seeley
>
> Create specialized implementations for collapsing by an indexed string field.
--
This message is automatically generated by JIRA.