[
https://issues.apache.org/jira/browse/LUCENE-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740186#comment-16740186
]
Simon Willnauer commented on LUCENE-8525:
-----------------------------------------
I do agree with [~rcmuir] here. There is not much to do in terms of detecting
this particular problem on DataInput and friends. One way to improve this would
certainly be the wording on the java doc. We can just clarify that detecting
_CorruptIndexException_ is best effort.
Another idea is to checksum the entire file before we read the commit we can
either do this on the Elasticsearch end or improve _SegmentInfos#readCommit_ .
Reading this file twice isn't a big deal I guess.
> throw more specific exception on data corruption
> ------------------------------------------------
>
> Key: LUCENE-8525
> URL: https://issues.apache.org/jira/browse/LUCENE-8525
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Vladimir Dolzhenko
> Priority: Major
>
> DataInput throws generic IOException if data looks odd
> [DataInput:141|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L141]
> there are other examples like
> [BufferedIndexInput:219|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/BufferedIndexInput.java#L219],
>
> [CompressionMode:226|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/codecs/compressing/CompressionMode.java#L226]
> and maybe
> [DocIdsWriter:81|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/util/bkd/DocIdsWriter.java#L81]
> That leads to some difficulties - see [elasticsearch
> #34322|https://github.com/elastic/elasticsearch/issues/34322]
> It would be better if it throws more specific exception.
> As a consequence
> [SegmentInfos.readCommit|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java#L281]
> violates its own contract
> {code:java}
> /**
> * @throws CorruptIndexException if the index is corrupt
> * @throws IOException if there is a low-level IO error
> */
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]