Hi Reynold Xin,

I tried setting spark.sql.files.ignoreCorruptFiles = true using the following commands:

val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

sqlContext.setConf("spark.sql.files.ignoreCorruptFiles", "true")
// or, equivalently:
sqlContext.sql("set spark.sql.files.ignoreCorruptFiles=true")

but I still get an error when reading the Parquet files with:

val newDataDF = sqlContext.read.parquet(
  "/data/tempparquetdata/corruptblock.0",
  "/data/tempparquetdata/data1.parquet")

Error: ERROR executor.Executor: Exception in task 0.0 in stage 4.0 (TID 4)
java.io.IOException: Could not read footer: java.lang.RuntimeException:
hdfs://192.168.1.53:9000/data/tempparquetdata/corruptblock.0 is not a
Parquet file. expected magic number at tail [80, 65, 82, 49] but found [65,
82, 49, 10]
        at
org.apache.parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:248)
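For completeness, here is the end-to-end snippet I am running in spark-shell (sc is the pre-created SparkContext; the HDFS paths are from my test cluster, and corruptblock.0 is the deliberately corrupted file):

```scala
// Repro sketch for the failure above; requires a Spark build with Hive support.
import org.apache.spark.sql.hive.HiveContext

val sqlContext = new HiveContext(sc)

// Ask Spark to skip unreadable files instead of failing the job.
sqlContext.setConf("spark.sql.files.ignoreCorruptFiles", "true")

// One good Parquet file plus one file with a bad footer.
val newDataDF = sqlContext.read.parquet(
  "/data/tempparquetdata/corruptblock.0",
  "/data/tempparquetdata/data1.parquet")

// Any action triggers the footer read and the IOException above.
newDataDF.count()
```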


Please let me know if I am missing anything.




--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Skip-Corrupted-Parquet-blocks-footer-tp20418p20433.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
