[
https://issues.apache.org/jira/browse/LUCENE-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16736342#comment-16736342
]
Armin Braun commented on LUCENE-8525:
-------------------------------------
{quote}There is a reason why this isn't a separate exception type in Java too –
very hard to predict and react to each and every type of filesystem problem.
{quote}
I'm not sure this is a valid analogy. Java throws a plain IOException for
things like running out of file descriptors or disk space, i.e. problems that
are specific to the filesystem.
But it also throws FileNotFoundException, EOFException, etc. when the
content of the filesystem isn't what was expected.
So from that angle it seems somewhat off that Lucene would throw a plain
IOException when it can't deserialize some bytes, e.g. in
[https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L141]
which literally leaves the caller to interpret the exception message.
This is especially true when, as was argued above for this example, the data
could be read from memory, which puts the exception in a very different
category from file system issues.
I would agree that handling EOFException in Lucene is a little closer, since
that could be a generic file system issue or simply corrupt data. But for
unexpected data during deserialization, I think throwing a plain
IOException makes the user's life needlessly hard.
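To illustrate the distinction, here is a minimal sketch rather than Lucene's actual code. CorruptDataException, readVInt as written here, and classify are hypothetical names made up for this example; the only assumption about Lucene is that a dedicated IOException subclass (such as the CorruptIndexException named in the readCommit contract) could be thrown at this kind of site.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

class CorruptDataSketch {

    /** Hypothetical exception type for "the bytes are not what was expected". */
    static class CorruptDataException extends IOException {
        CorruptDataException(String message) {
            super(message);
        }
    }

    /** Simplified variable-length int decoder (7 payload bits per byte, low bits first). */
    static int readVInt(DataInputStream in) throws IOException {
        int value = 0;
        for (int shift = 0; shift <= 28; shift += 7) {
            byte b = in.readByte(); // throws EOFException if the input ends early
            if (shift == 28 && (b & 0xF0) != 0) {
                // A fifth byte may only carry 4 payload bits and no continuation bit.
                throw new CorruptDataException("invalid vInt: too many bits");
            }
            value |= (b & 0x7F) << shift;
            if (b >= 0) { // continuation bit is clear, so this was the last byte
                return value;
            }
        }
        throw new AssertionError("unreachable: the fifth byte always returns or throws");
    }

    /** The caller can branch on the exception type instead of parsing the message. */
    static String classify(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            readVInt(in);
            return "ok";
        } catch (CorruptDataException e) {
            return "corrupt";
        } catch (EOFException e) {
            return "truncated";
        } catch (IOException e) {
            return "io";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(new byte[] {0x05}));
        System.out.println(classify(new byte[] {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF}));
        System.out.println(classify(new byte[] {(byte) 0x80}));
    }
}
```

With a dedicated type, the caller's catch blocks separate bad data from truncated input and from genuine I/O failures, even when the bytes come from memory and no filesystem is involved.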
> throw more specific exception on data corruption
> ------------------------------------------------
>
> Key: LUCENE-8525
> URL: https://issues.apache.org/jira/browse/LUCENE-8525
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Vladimir Dolzhenko
> Priority: Major
>
> DataInput throws generic IOException if data looks odd
> [DataInput:141|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L141]
> there are other examples like
> [BufferedIndexInput:219|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/BufferedIndexInput.java#L219],
>
> [CompressionMode:226|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/codecs/compressing/CompressionMode.java#L226]
> and maybe
> [DocIdsWriter:81|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/util/bkd/DocIdsWriter.java#L81]
> That leads to some difficulties - see [elasticsearch
> #34322|https://github.com/elastic/elasticsearch/issues/34322]
> It would be better if it threw a more specific exception.
> As a consequence
> [SegmentInfos.readCommit|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java#L281]
> violates its own contract
> {code:java}
> /**
> * @throws CorruptIndexException if the index is corrupt
> * @throws IOException if there is a low-level IO error
> */
> {code}