Márcio Furlani Carmona created SPARK-23308:
----------------------------------------------

             Summary: ignoreCorruptFiles should not ignore retryable IOException
                 Key: SPARK-23308
                 URL: https://issues.apache.org/jira/browse/SPARK-23308
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.2.1, 2.3.1
            Reporter: Márcio Furlani Carmona


When `spark.sql.files.ignoreCorruptFiles` is enabled, Spark ignores every RuntimeException and IOException raised while reading a file. Some IOExceptions, however, can occur even when the file is not corrupted. One example is SocketTimeoutException: the read can be retried and may succeed, so the exception does not mean the data is corrupted.

See: https://github.com/apache/spark/blob/e30e2698a2193f0bbdcd4edb884710819ab6397c/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala#L163
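
To make the requested distinction concrete, here is a minimal Java sketch of the idea. It is not Spark's code and not a proposed patch: the class `CorruptFileHandling` and the methods `isRetryable` and `readOrSkip` are hypothetical names, and treating only SocketTimeoutException as retryable is an assumption taken from the single example in this report.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Spark: separates retryable I/O failures
// from failures that plausibly indicate a corrupt file.
final class CorruptFileHandling {

    // Assumption: only SocketTimeoutException counts as retryable here,
    // because it is the one example named in the report.
    static boolean isRetryable(Throwable t) {
        return t instanceof SocketTimeoutException;
    }

    // Runs `read`. If it fails with an IOException or RuntimeException,
    // the flag is on, and the failure is not retryable, the file is
    // skipped by returning `fallback`. Otherwise the exception is rethrown
    // so the caller (for example a task retry mechanism) can try again.
    static <T> T readOrSkip(Callable<T> read, boolean ignoreCorruptFiles, T fallback)
            throws Exception {
        try {
            return read.call();
        } catch (IOException | RuntimeException e) {
            if (ignoreCorruptFiles && !isRetryable(e)) {
                return fallback; // treat as corrupt and skip
            }
            throw e; // retryable, or the flag is off
        }
    }
}
```

With this shape, a timeout still fails the read and can be retried, while other read failures are skipped only when the flag is on. Which further exception types should count as retryable is left open here.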