[ 
https://issues.apache.org/jira/browse/LUCENE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5618:
-------------------------------

    Attachment: LUCENE-5618.patch

Patch addresses the following:

* Modifies Lucene45/42DocValuesProducer to assert that all encoded fields exist 
in the FieldInfos.

* Improves the readability of ReadersAndUpdates.writeFieldUpdates by breaking 
the update-writing logic out into separate methods.

* Each DocValues field's updates are written to separate files.

* Adds SegmentCommitInfo.docValuesGen, separate from fieldInfosGen.

* Fixes LUCENE-5636 by tracking per-field updates files, as well as fieldInfos 
files.
** Per-generation update files are kept but deprecated; they are needed for 
back-compat with 4.6-4.8 indexes. These become empty after the segment is merged.

* Improves {{testDeleteUnusedUpdatesFiles}} to cover two fields' updates (this 
exposes the bug in LUCENE-5636).
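
To illustrate the per-field tracking described in the list above, here is a 
minimal, self-contained sketch. It is not the code from the patch: the class 
PerFieldUpdatesSketch, its methods, and the file-naming scheme are all 
hypothetical, and it assumes that a newer generation of a field's updates fully 
supersedes that field's older files.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of per-field update tracking; not Lucene's SegmentCommitInfo. */
final class PerFieldUpdatesSketch {
    // field number -> update files currently needed for that field
    private final Map<Integer, Set<String>> updatesFilesByField = new HashMap<>();

    // Generation counter for DocValues updates, kept separate from any
    // FieldInfos generation. -1 means "no updates written yet".
    private long docValuesGen = -1;

    /** Records a new updates generation for one field and returns its file name. */
    String writeFieldUpdates(String segmentName, int fieldNumber) {
        docValuesGen = (docValuesGen == -1) ? 1 : docValuesGen + 1;
        // Hypothetical naming scheme: <segment>_<field>_<gen>.dvd
        String file = segmentName + "_" + fieldNumber + "_" + docValuesGen + ".dvd";
        // Assumption: the new generation replaces this field's older files only;
        // other fields' files stay referenced.
        Set<String> files = new HashSet<>();
        files.add(file);
        updatesFilesByField.put(fieldNumber, files);
        return file;
    }

    /** All update files still referenced, across all fields, in sorted order. */
    Set<String> referencedFiles() {
        Set<String> all = new TreeSet<>();
        for (Set<String> files : updatesFilesByField.values()) {
            all.addAll(files);
        }
        return all;
    }

    long docValuesGen() {
        return docValuesGen;
    }

    /** Fails if an encoded field number is missing from the known field numbers. */
    static void checkFieldsKnown(Set<Integer> encodedFields, Set<Integer> knownFields) {
        for (int number : encodedFields) {
            if (!knownFields.contains(number)) {
                throw new IllegalStateException("unknown field number: " + number);
            }
        }
    }
}
```

With one entry per field, updating field 1 a second time drops only field 1's 
older file from the referenced set, while field 2's file is untouched. This is 
the two-field situation that the improved test is meant to cover.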

In terms of backwards compatibility, 4.6-4.8 indexes will continue to 
reference unneeded files until the segment is merged. This is impossible to fix 
without breaking back-compat or introducing weird hacks which assume the default 
codec. This is not terrible though, since the number of unneeded-but-referenced 
files is limited by the number of DV fields the app has updated.

I'd appreciate a review on this. Before I commit it though, I want to take care 
of LUCENE-5619, so we're sure the back-compat logic in this patch indeed works.

> DocValues updates send wrong fieldinfos to codec producers
> ----------------------------------------------------------
>
>                 Key: LUCENE-5618
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5618
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>            Priority: Blocker
>             Fix For: 4.9
>
>         Attachments: LUCENE-5618.patch
>
>
> Spinoff from LUCENE-5616.
> See the example there, docvalues readers get a fieldinfos, but it doesn't 
> contain the correct ones, so they have invalid field numbers at read time.
> This should really be fixed. Maybe a simple solution is to not write 
> "batches" of fields in updates but just have only one field per gen? 
> This removes many-many relationships and would make things easy to understand.



