[ https://issues.apache.org/jira/browse/LUCENE-5578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960525#comment-13960525 ]
Adrien Grand commented on LUCENE-5578:
--------------------------------------
I briefly discussed with Robert a way to catch such issues: check that the
stored fields files are stable across merges (e.g. merge into 1 segment twice
and check that the output is identical every time). We could run this test on
all index formats for which such a property is expected (stored fields, term
vectors, postings, ...).
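
To make the idea concrete, here is a minimal sketch of such a stability check.
It does not use Lucene at all: the "segment file" is a toy layout (payload
bytes followed by an 8-byte CRC32 footer), and every name in it
(MergeStabilityCheck, writeSegment, correctMerge, buggyMerge, isStable) is
made up for illustration. The real test would presumably force-merge an actual
index and compare the resulting files.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.function.UnaryOperator;
import java.util.zip.CRC32;

public class MergeStabilityCheck {
    // Toy footer: a single 8-byte CRC32 value (an assumption, not Lucene's real footer layout).
    public static final int FOOTER_LENGTH = Long.BYTES;

    // Writes payload followed by a checksum footer computed over the payload.
    public static byte[] writeSegment(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + FOOTER_LENGTH)
                .put(payload)
                .putLong(crc.getValue())
                .array();
    }

    // Bulk-copies only the payload, then writes a fresh footer.
    public static byte[] correctMerge(byte[] segment) {
        byte[] payload = Arrays.copyOf(segment, segment.length - FOOTER_LENGTH);
        return writeSegment(payload);
    }

    // Bulk-copies everything up to the end of the file (old footer included),
    // then writes a fresh footer, so footers accumulate on every merge.
    public static byte[] buggyMerge(byte[] segment) {
        return writeSegment(segment);
    }

    // Merges into one segment twice and checks that both outputs are byte-identical.
    public static boolean isStable(byte[] segment, UnaryOperator<byte[]> merge) {
        byte[] once = merge.apply(segment);
        byte[] twice = merge.apply(once);
        return Arrays.equals(once, twice);
    }

    public static void main(String[] args) {
        byte[] segment = writeSegment("doc1doc2".getBytes(StandardCharsets.UTF_8));
        System.out.println("correct merge stable: " + isStable(segment, MergeStabilityCheck::correctMerge));
        System.out.println("buggy merge stable: " + isStable(segment, MergeStabilityCheck::buggyMerge));
    }
}
```

With the buggy merge the file grows by one footer per merge, so the second
output can never equal the first, which is exactly what the check detects.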
> Stored fields might accumulate checksums on merges
> --------------------------------------------------
>
> Key: LUCENE-5578
> URL: https://issues.apache.org/jira/browse/LUCENE-5578
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Priority: Blocker
> Fix For: 4.8
>
> Attachments: LUCENE-5578.patch
>
>
> The bulk merge operation of our stored fields format is optimized to avoid
> decompressing data when not needed. To know the offset of the end of the
> current chunk, it either consults the stored fields index, or uses
> {{fieldsStream.length()}} for the last chunk.
> However, we just added checksums at the end of index files, so
> {{fieldsStream.length()}} now includes the checksum. The merge might
> therefore copy the existing checksum along with the last chunk, and then
> write a new checksum after it.
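
As a rough illustration of the offset problem described above, the sketch
below computes the end of a chunk from a list of chunk start offsets. It is
not the actual Lucene code or the attached patch: LastChunkEnd, both
chunkEnd... methods, the chunkStarts array and the 8-byte FOOTER_LENGTH are
hypothetical stand-ins for the stored fields index and the real footer size.

```java
public class LastChunkEnd {
    // Hypothetical footer size in bytes; the real value depends on the file format.
    public static final int FOOTER_LENGTH = 8;

    // Pre-checksum logic: the last chunk is assumed to run to the end of the file.
    public static long chunkEndIncludingFooter(long[] chunkStarts, int chunk, long fileLength) {
        return chunk + 1 < chunkStarts.length ? chunkStarts[chunk + 1] : fileLength;
    }

    // Adjusted logic: the last chunk stops where the footer begins.
    public static long chunkEndExcludingFooter(long[] chunkStarts, int chunk, long fileLength) {
        return chunk + 1 < chunkStarts.length ? chunkStarts[chunk + 1] : fileLength - FOOTER_LENGTH;
    }

    public static void main(String[] args) {
        long[] chunkStarts = {0, 100, 250};      // three chunks
        long fileLength = 400 + FOOTER_LENGTH;   // 400 bytes of chunk data plus the footer
        int last = chunkStarts.length - 1;

        // The old computation overshoots by FOOTER_LENGTH bytes for the last chunk only.
        System.out.println("including footer: " + chunkEndIncludingFooter(chunkStarts, last, fileLength));
        System.out.println("excluding footer: " + chunkEndExcludingFooter(chunkStarts, last, fileLength));
    }
}
```

Only the last chunk is affected, since every other chunk end comes from the
index rather than from the file length.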