[
https://issues.apache.org/jira/browse/SOLR-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joel Bernstein updated SOLR-6581:
---------------------------------
Attachment: SOLR-6581.patch
Latest work, including test cases for collapsing on a numeric field. Not all
tests are passing yet.
> Prepare CollapsingQParserPlugin and ExpandComponent for 5.0
> -----------------------------------------------------------
>
> Key: SOLR-6581
> URL: https://issues.apache.org/jira/browse/SOLR-6581
> Project: Solr
> Issue Type: Bug
> Reporter: Joel Bernstein
> Assignee: Joel Bernstein
> Priority: Minor
> Fix For: 5.0
>
> Attachments: SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch,
> SOLR-6581.patch, SOLR-6581.patch, renames.diff
>
>
> *Background*
> The 4x implementations of the CollapsingQParserPlugin and the ExpandComponent
> are optimized to work with a top level FieldCache. Top level FieldCaches have
> a very fast docID to top-level ordinal lookup. Fast access to the top-level
> ordinals allows for very high performance field collapsing on high
> cardinality fields.
> LUCENE-5666 unified the DocValues and FieldCache APIs so that the top level
> FieldCache is no longer in regular use. Instead all top level caches are
> accessed through MultiDocValues.
> There are some major advantages to using MultiDocValues rather than a top
> level FieldCache. But there is one disadvantage: the lookup from docID to
> top-level ordinals is slower using MultiDocValues.
> My testing has shown that *after optimizing* the CollapsingQParserPlugin code
> to use MultiDocValues, the performance drop is around 100%. For some use
> cases this performance drop is a blocker.
> *What About Faceting?*
> String faceting also relies on the top level ordinals. Is faceting
> performance also affected? My testing has shown that the faceting performance
> is affected much less than collapsing.
> One possible reason for this may be that field collapsing is memory bound and
> faceting is not. So the additional memory accesses needed for MultiDocValues
> affect field collapsing much more than faceting.
> *Proposed Solution*
> The proposed solution is to have the default Collapse and Expand algorithm
> use MultiDocValues, but to provide an option to use a top level FieldCache if
> the performance of MultiDocValues is a blocker.
> The proposed mechanism for switching to the FieldCache would be a new "hint"
> parameter. If the hint parameter is set to "FAST_QUERY" then the top-level
> FieldCache would be used for both Collapse and Expand.
> Example syntax:
> {code}
> fq={!collapse field=x hint=FAST_QUERY}
> {code}
>
>
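
To illustrate the extra memory accesses mentioned in the Background section, here is a minimal, self-contained sketch of the two lookup styles. It is a simplified model and not Lucene code: the class name, method names, and array layouts are all hypothetical, and it assumes every segment holds at least one document. The top-level style reads one flat array, while the multi-segment style has to locate the segment, read the segment-local ordinal, and then translate it to a global ordinal.

```java
import java.util.Arrays;

// Simplified model only; these names are hypothetical and not part of Lucene or Solr.
public class OrdinalLookupSketch {

    // Top-level cache model: one flat array maps a global docID straight to a global ordinal.
    public static int topLevelOrd(int[] docToGlobalOrd, int docId) {
        return docToGlobalOrd[docId];
    }

    // Multi-segment model: three dependent lookups instead of one.
    public static int multiSegmentOrd(int[] docBases,
                                      int[][] segmentDocToOrd,
                                      int[][] segmentOrdToGlobal,
                                      int docId) {
        // 1. Find the segment holding docId. docBases is sorted ascending and
        //    docBases[i] is the first global docID of segment i.
        int idx = Arrays.binarySearch(docBases, docId);
        int seg = idx >= 0 ? idx : -idx - 2;
        // 2. Read the segment-local ordinal for the segment-relative docID.
        int localOrd = segmentDocToOrd[seg][docId - docBases[seg]];
        // 3. Translate the segment-local ordinal into a global ordinal.
        return segmentOrdToGlobal[seg][localOrd];
    }

    public static void main(String[] args) {
        // Two segments: docs 0..2 and docs 3..4.
        int[] docBases = {0, 3};
        // Segment 0 holds terms "a","c"; segment 1 holds "b","c".
        // Global term order is a=0, b=1, c=2.
        int[][] segmentDocToOrd = {{0, 1, 0}, {1, 0}};
        int[][] segmentOrdToGlobal = {{0, 2}, {1, 2}};
        // The equivalent flat top-level mapping for the same five documents.
        int[] docToGlobalOrd = {0, 2, 0, 2, 1};

        for (int doc = 0; doc < docToGlobalOrd.length; doc++) {
            int flat = topLevelOrd(docToGlobalOrd, doc);
            int multi = multiSegmentOrd(docBases, segmentDocToOrd, segmentOrdToGlobal, doc);
            System.out.println("doc " + doc + ": topLevel=" + flat + " multiSegment=" + multi);
        }
    }
}
```

Both methods return the same ordinal for every document; the difference is only in how many dependent array reads each lookup needs, which is one plausible reading of why a memory-bound operation such as collapsing would be affected more.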