[ 
https://issues.apache.org/jira/browse/LUCENE-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740186#comment-16740186
 ] 

Simon Willnauer commented on LUCENE-8525:
-----------------------------------------

I do agree with [~rcmuir] here. There is not much we can do to detect this 
particular problem in DataInput and friends. One way to improve this would 
certainly be the wording of the Javadoc: we can just clarify that detecting 
corruption and throwing _CorruptIndexException_ is best effort. 
Another idea is to checksum the entire file before we read the commit. We can 
either do this on the Elasticsearch end or improve _SegmentInfos#readCommit_. 
Reading this file twice isn't a big deal, I guess.
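
A rough sketch of the "checksum first, then read" idea, in plain Java rather 
than Lucene code. The file layout (payload followed by an 8-byte CRC32 
footer), the class _ChecksumBeforeRead_ and the exception 
_CorruptFileException_ are assumptions made up for illustration; they are not 
Lucene's actual file format or API. In Lucene itself the building block for 
the first pass would presumably be _CodecUtil#checksumEntireFile_.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.zip.CRC32;

final class ChecksumBeforeRead {

  /** Hypothetical stand-in for a specific "data is corrupt" exception. */
  public static class CorruptFileException extends IOException {
    public CorruptFileException(String message) {
      super(message);
    }
  }

  // Assumed layout: payload bytes, then the payload's CRC32 as a big-endian long.
  static final int FOOTER_BYTES = Long.BYTES;

  /** Builds file contents in the assumed layout (payload + CRC32 footer). */
  public static byte[] withFooter(byte[] payload) {
    CRC32 crc = new CRC32();
    crc.update(payload, 0, payload.length);
    return ByteBuffer.allocate(payload.length + FOOTER_BYTES)
        .put(payload)
        .putLong(crc.getValue())
        .array();
  }

  /** Pass 1: stream the whole file and compare its CRC32 with the footer. */
  public static void checksumEntireFile(Path file) throws IOException {
    long payloadLength = Files.size(file) - FOOTER_BYTES;
    if (payloadLength < 0) {
      throw new CorruptFileException("file too short for a footer: " + file);
    }
    CRC32 crc = new CRC32();
    try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
      byte[] buffer = new byte[8192];
      long remaining = payloadLength;
      while (remaining > 0) {
        int n = in.read(buffer, 0, (int) Math.min(buffer.length, remaining));
        if (n < 0) {
          throw new CorruptFileException("unexpected end of file: " + file);
        }
        crc.update(buffer, 0, n);
        remaining -= n;
      }
      long expected = in.readLong(); // footer, big-endian like ByteBuffer's default
      if (expected != crc.getValue()) {
        throw new CorruptFileException("checksum mismatch in " + file);
      }
    }
  }

  /** Pass 2: only parse the payload after the checksum has been verified. */
  public static byte[] readVerified(Path file) throws IOException {
    checksumEntireFile(file);
    byte[] all = Files.readAllBytes(file);
    return Arrays.copyOf(all, all.length - FOOTER_BYTES);
  }
}
```

With this shape, a flipped bit anywhere in the payload surfaces as the specific 
exception from the first pass instead of as a generic IOException from 
whatever decoding step happens to trip over it. The cost is the second read of 
the file.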

> throw more specific exception on data corruption
> ------------------------------------------------
>
>                 Key: LUCENE-8525
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8525
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Vladimir Dolzhenko
>            Priority: Major
>
> DataInput throws a generic IOException if the data looks odd
> [DataInput:141|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L141]
> there are other examples like 
> [BufferedIndexInput:219|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/BufferedIndexInput.java#L219],
>  
> [CompressionMode:226|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/codecs/compressing/CompressionMode.java#L226]
>  and maybe 
> [DocIdsWriter:81|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/util/bkd/DocIdsWriter.java#L81]
> That leads to some difficulties - see [elasticsearch 
> #34322|https://github.com/elastic/elasticsearch/issues/34322]
> It would be better if it threw a more specific exception.
> As a consequence 
> [SegmentInfos.readCommit|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java#L281]
>  violates its own contract
> {code:java}
> /**
>    * @throws CorruptIndexException if the index is corrupt
>    * @throws IOException if there is a low-level IO error
>    */
> {code}



