[
https://issues.apache.org/jira/browse/LUCENE-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15389646#comment-15389646
]
Martijn van Groningen commented on LUCENE-7391:
-----------------------------------------------
> is it part of the contract that fields() should only return indexed fields
> then?
Yes.
I think David's fix is the easiest here. Computing this count each time fields()
is invoked is less overhead than what happens now when building
{{MemoryFields}}. Since that count is computed each time, I think you shouldn't
worry about caching or cache invalidation.
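To make the trade-off concrete, here is a minimal sketch of the idea, not the
actual patch. {{FieldInfo}} and {{FilteredFieldsView}} are hypothetical
stand-ins (they are not Lucene classes); the view wraps the live map and
recomputes the count of fields with {{numTokens > 0}} on each call instead of
copying the map.

```java
import java.util.Map;
import java.util.TreeMap;

public class FieldsCountSketch {

    // Hypothetical stand-in for the per-field info; only numTokens matters here.
    static final class FieldInfo {
        final int numTokens;
        FieldInfo(int numTokens) { this.numTokens = numTokens; }
    }

    // Hypothetical view over the live map: nothing is copied when it is created.
    static final class FilteredFieldsView {
        private final Map<String, FieldInfo> fields;

        FilteredFieldsView(Map<String, FieldInfo> fields) { this.fields = fields; }

        // Recomputed on every call, so there is no cached value to invalidate.
        int size() {
            int count = 0;
            for (FieldInfo info : fields.values()) {
                if (info.numTokens > 0) {
                    count++;
                }
            }
            return count;
        }

        // A field is only visible if it actually has indexed tokens.
        boolean contains(String field) {
            FieldInfo info = fields.get(field);
            return info != null && info.numTokens > 0;
        }
    }

    public static void main(String[] args) {
        Map<String, FieldInfo> fields = new TreeMap<>();
        fields.put("title", new FieldInfo(3));
        fields.put("empty", new FieldInfo(0));
        fields.put("body", new FieldInfo(5));

        FilteredFieldsView view = new FilteredFieldsView(fields);
        System.out.println(view.size());            // prints 2
        System.out.println(view.contains("empty")); // prints false
    }
}
```

Creating the view is O(1); the O(n) scan only happens when the count is
actually requested, and n is the number of fields, which is typically small.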
The concurrency aspect of the MemoryIndex is in my opinion a bit of a mess. It
allows fields to be added after a reader has been created, except
when the freeze method has been invoked (after which it should be usable from
many threads). I think the MemoryIndex class itself should be kind of a builder
that just returns an IndexReader and shouldn't be usable after an
IndexReader instance has been made.
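A rough sketch of that builder shape, under the assumption that a single
{{build()}} call hands out an immutable result. {{Builder}} and {{Snapshot}}
are hypothetical names used for illustration only; this is not how
MemoryIndex is implemented today.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

public class BuilderSketch {

    // Hypothetical read-only result; final fields make it safe to share across threads.
    static final class Snapshot {
        private final Map<String, Integer> tokenCounts;

        Snapshot(Map<String, Integer> tokenCounts) { this.tokenCounts = tokenCounts; }

        int numTokens(String field) {
            return tokenCounts.getOrDefault(field, 0);
        }
    }

    // Hypothetical single-use builder: mutable until build(), unusable afterwards.
    static final class Builder {
        private final Map<String, Integer> tokenCounts = new TreeMap<>();
        private boolean built;

        Builder addField(String name, int numTokens) {
            ensureNotBuilt();
            tokenCounts.merge(name, numTokens, Integer::sum);
            return this;
        }

        Snapshot build() {
            ensureNotBuilt();
            built = true;
            // No further writes can happen, so a read-only wrapper is enough.
            return new Snapshot(Collections.unmodifiableMap(tokenCounts));
        }

        private void ensureNotBuilt() {
            if (built) {
                throw new IllegalStateException("builder already used");
            }
        }
    }

    public static void main(String[] args) {
        Builder builder = new Builder();
        Snapshot snapshot = builder.addField("title", 2).addField("title", 3).build();
        System.out.println(snapshot.numTokens("title")); // prints 5

        try {
            builder.addField("body", 1);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this shape there is no "add after reader creation" case to reason about,
so the reader side needs no synchronization at all.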
> MemoryIndexReader.fields() performance regression
> -------------------------------------------------
>
> Key: LUCENE-7391
> URL: https://issues.apache.org/jira/browse/LUCENE-7391
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Steve Mason
> Attachments: LUCENE-7391.patch
>
>
> While upgrading our codebase from Lucene 4 to Lucene 6 we found a significant
> performance regression: a 5x slowdown.
> On profiling the code, the method MemoryIndexReader.fields() shows up as one
> of the hottest methods.
> Looking at the method, it just creates a copy of the inner {{fields}} Map
> before passing it to {{MemoryFields}}. It does this so that it can filter out
> fields with {{numTokens <= 0}}.
> The simplest "fix" would be to just remove the copying of the map completely,
> and pass {{fields}} directly to {{MemoryFields}}. It's simple and removes
> any slowdown caused by this method. It does potentially change behaviour,
> but none of the unit tests seem to test that behaviour, so I wonder
> whether it's necessary (I looked at the original ticket, LUCENE-7091, that
> introduced this code, but I can't find much in the way of an explanation).
> I'm going to attach a patch to this effect anyway and we can take things
> from there.