[
https://issues.apache.org/jira/browse/SOLR-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16673290#comment-16673290
]
Tim Underwood commented on SOLR-12878:
--------------------------------------
Sure. I've updated the pull request with what I'm currently playing with:
[https://github.com/apache/lucene-solr/pull/473]
There are currently 3 commits in there:
1 - The original FacetFieldProcessorByHashDV.java change to avoid calling
getSlowAtomicReader
2 - The change requested by [~dsmiley] to move the caching of FieldInfos from
SolrIndexSearcher to SlowCompositeReaderWrapper
3 - Adding a check in TestUtil.checkReader to verify that
LeafReader.getFieldInfos() returns a cached copy, along with the changes
required to make that pass. Specifically, several places construct an empty
FieldInfos instance, so I created a static FieldInfos.EMPTY instance that
they can reference. MemoryIndexReader also needed to be modified to cache a
copy of its FieldInfos. Its constructor was already looping over the fields,
so I build the FieldInfos there (vs creating it lazily).
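To make the shape of #3 concrete, here is a minimal sketch of the idea. It is not the code in the pull request: SketchFieldInfos, SketchLeafReader and SketchChecks are hypothetical stand-ins, not Lucene classes, and the identity check is only an assumption about what "returns a cached copy" could mean in a test.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for Lucene's FieldInfos; NOT the real class.
final class SketchFieldInfos {
    // One shared immutable empty instance, so callers do not allocate their own.
    static final SketchFieldInfos EMPTY = new SketchFieldInfos(Collections.emptyList());

    private final List<String> names;

    SketchFieldInfos(List<String> names) {
        // Defensive copy so the cached instance cannot change after construction.
        this.names = Collections.unmodifiableList(new ArrayList<>(names));
    }

    boolean hasField(String name) {
        return names.contains(name);
    }

    int size() {
        return names.size();
    }
}

// Hypothetical stand-in for a leaf reader such as MemoryIndexReader.
final class SketchLeafReader {
    private final SketchFieldInfos fieldInfos;

    SketchLeafReader(List<String> fields) {
        // The constructor already has the fields in hand, so the cache is
        // built eagerly here instead of lazily on first access.
        this.fieldInfos = fields.isEmpty() ? SketchFieldInfos.EMPTY : new SketchFieldInfos(fields);
    }

    SketchFieldInfos getFieldInfos() {
        return fieldInfos; // always the same instance
    }
}

final class SketchChecks {
    // Assumed form of the check: two calls must return the same instance.
    static void checkFieldInfosCached(SketchLeafReader reader) {
        if (reader.getFieldInfos() != reader.getFieldInfos()) {
            throw new AssertionError("getFieldInfos() must return a cached instance");
        }
    }
}
```

In this sketch a reader built with no fields hands back the shared EMPTY instance, and any reader passes checkFieldInfosCached because the instance is fixed at construction time.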
What are your thoughts on #3? Is it a good idea to require LeafReader
instances to cache their FieldInfos?
This seems to be a common pattern across the codebase (both Lucene and Solr):
{code:java}
reader.getFieldInfos().fieldInfo(field)
{code}
So it might be desirable to guarantee that FieldInfos isn't recomputed on every call?
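As a rough illustration of why that pattern can be costly on a composite reader, the sketch below counts how often the per-leaf field names get merged with and without a cached result. SketchCompositeReader is a hypothetical stand-in, not SlowCompositeReaderWrapper, and a plain Set of names stands in for FieldInfos.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a composite reader wrapper; NOT a Lucene class.
final class SketchCompositeReader {
    private final List<List<String>> leafFields;
    private final boolean cache;
    private Set<String> merged; // built on first use when caching is enabled
    int mergeCount = 0;         // how many times the merge actually ran

    SketchCompositeReader(List<List<String>> leafFields, boolean cache) {
        this.leafFields = leafFields;
        this.cache = cache;
    }

    // Walks every leaf and unions its field names; this is the expensive part.
    private Set<String> merge() {
        mergeCount++;
        Set<String> all = new LinkedHashSet<>();
        for (List<String> leaf : leafFields) {
            all.addAll(leaf);
        }
        return all;
    }

    // Not thread-safe; real code would need safe publication of the cache.
    Set<String> getFieldInfos() {
        if (!cache) {
            return merge(); // rebuilt on every call
        }
        if (merged == null) {
            merged = merge(); // built once, then reused
        }
        return merged;
    }
}
```

Calling getFieldInfos().contains(field) in a loop runs the merge once per call on the uncached variant and exactly once on the cached one.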
I'm still verifying that all LeafReader.getFieldInfos() implementations
perform the caching and that all tests pass (I'm seeing a few failures, but
they seem unrelated).
> FacetFieldProcessorByHashDV is reconstructing FieldInfos on every
> instantiation
> -------------------------------------------------------------------------------
>
> Key: SOLR-12878
> URL: https://issues.apache.org/jira/browse/SOLR-12878
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: Facet Module
> Affects Versions: 7.5
> Reporter: Tim Underwood
> Priority: Major
> Labels: performance
> Fix For: 7.6, master (8.0)
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The FacetFieldProcessorByHashDV constructor is currently calling:
> {noformat}
> FieldInfo fieldInfo =
> fcontext.searcher.getSlowAtomicReader().getFieldInfos().fieldInfo(sf.getName());
> {noformat}
> Which is reconstructing FieldInfos each time. Simply switching it to:
> {noformat}
> FieldInfo fieldInfo =
> fcontext.searcher.getFieldInfos().fieldInfo(sf.getName());
> {noformat}
>
> causes it to use the cached version of FieldInfos in the SolrIndexSearcher.
> On my index the FacetFieldProcessorByHashDV is 2-3 times slower than the
> legacy facets without this fix.