[jira] [Commented] (SOLR-12878) FacetFieldProcessorByHashDV is reconstructing FieldInfos on every instantiation

Tim Underwood (JIRA) Fri, 02 Nov 2018 08:43:16 -0700


    [ 
https://issues.apache.org/jira/browse/SOLR-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16673290#comment-16673290
 ]


Tim Underwood commented on SOLR-12878:
--------------------------------------

Sure.  I've updated the pull request with what I'm currently playing with:  
[https://github.com/apache/lucene-solr/pull/473]

There are currently 3 commits in there:

1 - The original FacetFieldProcessorByHashDV.java change to avoid calling 
getSlowAtomicReader

2 - The change requested by [~dsmiley] to move the caching of FieldInfos from 
SolrIndexSearcher to SlowCompositeReaderWrapper

3 - Adding a check in TestUtil.checkReader to verify that 
LeafReader.getFieldInfos() returns a cached copy, along with the changes 
required to make that pass.  Specifically, several places construct an empty 
FieldInfos instance, so I created a static FieldInfos.EMPTY instance that 
they can reference.  MemoryIndexReader also needed to be modified to cache a 
copy of its FieldInfos.  Its constructor was already looping over the fields, 
so I build the cached copy there (vs. creating it lazily).
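
As a rough illustration of the idea in #3, here is a minimal sketch of eager caching plus a shared empty instance. ToyFieldInfos, ToyReader and the wrapper class are hypothetical stand-ins invented for this example. They are not the Lucene classes and not the code in the pull request.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT Lucene's classes.
class FieldInfosCachingSketch {

    /** Toy analogue of a per-field metadata collection. */
    static final class ToyFieldInfos {
        /** Shared immutable empty instance, similar in spirit to a static EMPTY constant. */
        static final ToyFieldInfos EMPTY =
                new ToyFieldInfos(Collections.<String, Integer>emptyMap());

        private final Map<String, Integer> numberByName;

        ToyFieldInfos(Map<String, Integer> numberByName) {
            // Defensive copy so the instance stays immutable after construction.
            this.numberByName =
                    Collections.unmodifiableMap(new LinkedHashMap<>(numberByName));
        }

        /** Returns the field number, or null if the field is unknown. */
        Integer fieldNumber(String name) {
            return numberByName.get(name);
        }
    }

    /** Toy reader that builds its field infos once, in the constructor. */
    static final class ToyReader {
        private final ToyFieldInfos fieldInfos;

        ToyReader(String... fieldNames) {
            if (fieldNames.length == 0) {
                // Reuse the shared empty instance instead of allocating a new one.
                this.fieldInfos = ToyFieldInfos.EMPTY;
            } else {
                // The constructor already walks the fields, so build the cache here.
                Map<String, Integer> numbers = new LinkedHashMap<>();
                for (String name : fieldNames) {
                    numbers.putIfAbsent(name, numbers.size());
                }
                this.fieldInfos = new ToyFieldInfos(numbers);
            }
        }

        /** Always returns the same cached instance. */
        ToyFieldInfos getFieldInfos() {
            return fieldInfos;
        }
    }

    public static void main(String[] args) {
        ToyReader reader = new ToyReader("id", "category");
        // The kind of identity check a test could make: same instance on every call.
        System.out.println(reader.getFieldInfos() == reader.getFieldInfos());
        System.out.println(new ToyReader().getFieldInfos() == ToyFieldInfos.EMPTY);
    }
}
```

Both lines print true in this toy version. Building the instance in the constructor avoids any synchronization that a lazy variant would need, at the cost of doing the work even if nobody asks for it.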

 

What are your thoughts on #3?  Is it a good idea to require LeafReader 
instances to cache their FieldInfos?

This seems to be a common pattern across the codebase (both Lucene and Solr):
{code:java}
reader.getFieldInfos().fieldInfo(field)
{code}
So it might be desirable to make sure FieldInfos isn't always being recomputed?
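
To make the cost concrete, the following is a small sketch under stated assumptions: a toy composite reader merges the field names of its leaves, either on every getFieldInfos() call or only once. ToyCompositeReader, sampleLeaves and lookups are hypothetical names made up for this example. It does not measure Lucene or Solr and only counts how often the merge runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy types for illustration; not Lucene or Solr classes.
class RecomputeVsCacheSketch {

    /** Merges the field names of several leaves and counts how often the merge runs. */
    static final class ToyCompositeReader {
        private final List<Set<String>> leafFields;
        private final boolean cache;
        private Set<String> merged; // filled on first use when caching is enabled
        int mergeCount = 0;

        ToyCompositeReader(List<Set<String>> leafFields, boolean cache) {
            this.leafFields = leafFields;
            this.cache = cache;
        }

        /** Not thread-safe; a real implementation would need safe publication. */
        Set<String> getFieldInfos() {
            if (cache && merged != null) {
                return merged;
            }
            Set<String> result = new TreeSet<>();
            for (Set<String> leaf : leafFields) {
                result.addAll(leaf);
            }
            mergeCount++;
            if (cache) {
                merged = result;
            }
            return result;
        }
    }

    /** Two leaves with overlapping field names. */
    static List<Set<String>> sampleLeaves() {
        List<Set<String>> leaves = new ArrayList<>();
        leaves.add(new TreeSet<>(Arrays.asList("id", "category")));
        leaves.add(new TreeSet<>(Arrays.asList("id", "price")));
        return leaves;
    }

    /** Performs n lookups in the style of reader.getFieldInfos().fieldInfo(field). */
    static int lookups(ToyCompositeReader reader, int n) {
        for (int i = 0; i < n; i++) {
            reader.getFieldInfos().contains("category");
        }
        return reader.mergeCount;
    }

    public static void main(String[] args) {
        // Without caching, every lookup triggers a merge.
        System.out.println(lookups(new ToyCompositeReader(sampleLeaves(), false), 1000));
        // With caching, only the first lookup does.
        System.out.println(lookups(new ToyCompositeReader(sampleLeaves(), true), 1000));
    }
}
```

In this toy setup the uncached reader merges 1000 times and the cached one merges once, which is the difference the lookup pattern above is sensitive to.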

 

I'm still verifying that all LeafReader.getFieldInfos() implementations 
perform the caching and that all tests pass (I'm seeing a few failures, but 
they seem unrelated).


> FacetFieldProcessorByHashDV is reconstructing FieldInfos on every 
> instantiation
> -------------------------------------------------------------------------------
>
>                 Key: SOLR-12878
>                 URL: https://issues.apache.org/jira/browse/SOLR-12878
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Facet Module
>    Affects Versions: 7.5
>            Reporter: Tim Underwood
>            Priority: Major
>              Labels: performance
>             Fix For: 7.6, master (8.0)
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The FacetFieldProcessorByHashDV constructor is currently calling:
> {noformat}
> FieldInfo fieldInfo = fcontext.searcher.getSlowAtomicReader().getFieldInfos().fieldInfo(sf.getName());
> {noformat}
> Which is reconstructing FieldInfos each time.  Simply switching it to:
> {noformat}
> FieldInfo fieldInfo = fcontext.searcher.getFieldInfos().fieldInfo(sf.getName());
> {noformat}
>  
> causes it to use the cached version of FieldInfos in the SolrIndexSearcher.
> On my index the FacetFieldProcessorByHashDV is 2-3 times slower than the 
> legacy facets without this fix.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

