Khyati,

Are you using Spark 2.1? The usual entry point for Spark 2.x is spark
rather than sqlContext.
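
For reference, a minimal sketch of setting that flag through the Spark 2.x
SparkSession in spark-shell (the path below is a placeholder, not from your
cluster):

```scala
// Spark 2.x: the SparkSession (`spark`) replaces SQLContext/HiveContext
spark.conf.set("spark.sql.files.ignoreCorruptFiles", "true")

// with the flag set, files that are not valid Parquet should be skipped
val df = spark.read.parquet("/data/tempparquetdata")  // placeholder path
```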

rb

On Tue, Jan 3, 2017 at 11:03 AM, khyati <khyati.s...@guavus.com> wrote:

> Hi Reynold Xin,
>
> I tried setting spark.sql.files.ignoreCorruptFiles = true by using
> commands,
>
> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
>
> sqlContext.setConf("spark.sql.files.ignoreCorruptFiles","true") /
> sqlContext.sql("set spark.sql.files.ignoreCorruptFiles=true")
>
> but still getting error while reading parquet files using
> val newDataDF =
> sqlContext.read.parquet("/data/tempparquetdata/corruptblock.0","/data/tempparquetdata/data1.parquet")
>
> Error: ERROR executor.Executor: Exception in task 0.0 in stage 4.0 (TID 4)
> java.io.IOException: Could not read footer: java.lang.RuntimeException:
> hdfs://192.168.1.53:9000/data/tempparquetdata/corruptblock.0 is not a
> Parquet file. expected magic number at tail [80, 65, 82, 49] but found [65,
> 82, 49, 10]
>         at
> org.apache.parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:248)
>
>
> Please let me know if I am missing anything.
>
>
>
>
> --
> View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Skip-Corrupted-Parquet-blocks-footer-tp20418p20433.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


-- 
Ryan Blue
Software Engineer
Netflix
