[ 
https://issues.apache.org/jira/browse/LUCENE-7456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15526416#comment-15526416
 ] 

Julien MASSENET commented on LUCENE-7456:
-----------------------------------------

I agree that being this sneaky is not ideal, but the only alternative I see 
would be to change the Postings API.

In this patch, I tried to keep modifications constrained to the 
{{org.apache.lucene.codecs.perfield}} package, but I can take a shot at coming 
up with a cleaner implementation that updates the other APIs if you want.

I've uploaded an updated version of the patch which does not remove the 
sneakiness but makes the {{PerFieldMergeState}} more robust:
* The {{FilterFieldInfos}} class now correctly computes and exposes the 
{{hasXXX}} properties.
* Calls to {{FilterFieldInfos.fieldInfo()}} and 
{{FilterFieldsProducer.terms()}} now throw {{IllegalArgumentException}} if 
called with fields outside of the current merge context.
* I've tweaked the unit tests to work with the latest 6.2.1 since the 
{{Legacy*}} field types are not accessible in this module anymore.
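
As a rough illustration of the first two points (not the code from the patch), the sketch below shows a filtered view that recomputes an aggregate flag from the visible fields only and rejects lookups for fields outside the current merge. {{SketchField}} and {{FilteredFieldsSketch}} are hypothetical stand-ins and do not use the Lucene API; {{hasPayloads}} stands in for any of the {{hasXXX}} properties.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified field description (not Lucene's FieldInfo). */
final class SketchField {
    final String name;
    final boolean hasPayloads;

    SketchField(String name, boolean hasPayloads) {
        this.name = name;
        this.hasPayloads = hasPayloads;
    }
}

/** Hypothetical view restricted to the fields of one per-field merge. */
final class FilteredFieldsSketch {
    private final Map<String, SketchField> visible = new HashMap<>();
    private final boolean hasPayloads;

    FilteredFieldsSketch(Collection<SketchField> allFields, Set<String> mergeFields) {
        boolean payloads = false;
        for (SketchField f : allFields) {
            if (mergeFields.contains(f.name)) {
                visible.put(f.name, f);
                // Aggregate flags are recomputed from the visible fields only.
                payloads |= f.hasPayloads;
            }
        }
        this.hasPayloads = payloads;
    }

    boolean hasPayloads() {
        return hasPayloads;
    }

    /** Returns the field, or throws if it is not part of the current merge. */
    SketchField fieldInfo(String name) {
        SketchField f = visible.get(name);
        if (f == null) {
            throw new IllegalArgumentException(
                "field '" + name + "' is outside the current merge context");
        }
        return f;
    }
}
```

Failing fast on an unexpected field makes a misuse of the filtered view visible immediately instead of silently returning data that belongs to another format.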

As for why we're overriding merge() calls in our codec:
* We're using a customized codec to emulate nested documents.
* It works with the same idea as BlockJoin, but is less generic (it's tailored 
to our use case).
* The main difference is that the maxDoc() for segments remains equal to the 
number of "parent" documents, with only the nested fields having larger posting 
lists.
* For it to work, when merging, we need to correctly rebase the "docIds" for 
the nested documents (using the same idea as the docMap in the general use 
case).

> PerField(DocValues|Postings)Format do not call the per-field merge methods
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-7456
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7456
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/codecs
>    Affects Versions: 6.2.1
>            Reporter: Julien MASSENET
>         Attachments: LUCENE-7456-v2.patch, LUCENE-7456.patch
>
>
> While porting some old codec code from Lucene 4.3.1, I couldn't get the 
> per-field formats to call upon the per-field merge methods; the default merge 
> method was always being called.
> I think this is a side-effect of LUCENE-5894.
> Attached is a patch with a test that reproduces the error and an associated 
> fix that passes the unit tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
