Hi,
I commented on the following ticket:
https://issues.apache.org/jira/browse/SPARK-17647
I believe the backslash-escaping issue is still broken on 2.2.0.
Can someone take a look?
Thanks,
Dong
--
Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
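For context, SPARK-17647 concerns backslash handling in SQL LIKE patterns. The intended escaping semantics can be illustrated outside Spark; this is a hypothetical sketch of a LIKE-to-regex translation, not Spark's actual implementation:

```python
import re

def sql_like_to_regex(pattern: str) -> str:
    # Translate a SQL LIKE pattern to a regex, honoring backslash escapes:
    # '\%' and '\_' match literal '%' and '_'; '\\' matches one backslash.
    out = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "\\" and i + 1 < len(pattern):
            out.append(re.escape(pattern[i + 1]))  # escaped char is literal
            i += 2
        elif c == "%":
            out.append(".*")   # '%' matches any sequence of characters
            i += 1
        elif c == "_":
            out.append(".")    # '_' matches exactly one character
            i += 1
        else:
            out.append(re.escape(c))
            i += 1
    return "^" + "".join(out) + "$"

# LIKE '%\\%' should match any string containing a literal backslash.
print(bool(re.match(sql_like_to_regex(r"%\\%"), "a\\b")))  # True
```

The subtlety the ticket touches on is that the backslash is consumed as an escape character first, so `\\` in the pattern must match a single literal backslash in the input.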
--
Hi,
We are running Spark 2.2.1 and generating Parquet files with code like the following
pseudo code:
df.write.parquet(...)
We have recently noticed Parquet file corruption when reading the files back in
Spark or Presto, with errors like the following:
Caused by: org.apache.parquet.io.ParquetDecodingException: Can not r
a recurrence? Can you share your experience?
Thanks,
Dong
From: Ryan Blue
Reply-To: "rb...@netflix.com"
Date: Monday, February 5, 2018 at 12:38 PM
To: Dong Jiang
Cc: Spark Dev List
Subject: Re: Corrupt parquet file
Dong,
We see this from time to time as well. In my experience, it
From: Ryan Blue
Reply-To: "rb...@netflix.com"
Date: Monday, February 5, 2018 at 1:34 PM
To: Dong Jiang
Cc: Spark Dev List
Subject: Re: Corrupt parquet file
We ensure the bad node is removed from our cluster and reprocess to replace the
data. We only see this once or twice a year, so it isn't a significant problem.
We've d
before, what do you do to
prevent a recurrence?
Thanks,
Dong
From: Ryan Blue
Reply-To: "rb...@netflix.com"
Date: Monday, February 5, 2018 at 12:46 PM
To: Dong Jiang
Cc: Spark Dev List
Subject: Re: Corrupt parquet file
If you can still access the logs, then you should be able to
back the entire
data set, and then copy from HDFS to S3. Any other thoughts?
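When copying regenerated data from HDFS to S3, one way to catch silent corruption is to compare checksums on both sides before trusting the copy. A minimal, hypothetical helper (not from this thread; note that S3 ETags equal the MD5 only for single-part uploads, not multipart ones):

```python
import hashlib

def file_md5(path: str, chunk_size: int = 8 * 1024 * 1024) -> str:
    """Hex MD5 digest of a file, streamed so large Parquet files
    do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare this against the checksum recorded for the S3 copy; a mismatch
# means the object should be re-uploaded before the source is deleted.
```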
From: Steve Loughran
Date: Monday, February 12, 2018 at 2:27 PM
To: "rb...@netflix.com"
Cc: Dong Jiang , Apache Spark Dev
Subject: Re: Corrupt parquet file
What failure mode is likely here?
As the uploads
Hi,
I opened a JIRA ticket: https://issues.apache.org/jira/browse/SPARK-23549.
Could anyone take a look?
Spark SQL unexpected behavior when comparing timestamp to date
scala> spark.version
res1: String = 2.2.1
scala> spark.sql("select cast('2017-03-01 00:00:00' as timestamp) betw
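For anyone triaging: the symptom reported in SPARK-23549 is consistent with the timestamp and date being compared as strings rather than as time values (an assumption based on the symptom, not a confirmed root cause). A plain-Python sketch of why a lexicographic comparison gives the wrong answer:

```python
from datetime import datetime, date

ts = "2017-03-01 00:00:00"   # timestamp rendered as a string
lo = "2017-02-28"            # lower-bound date as a string
hi = "2017-03-01"            # upper-bound date as a string

# Lexicographically, the timestamp string is *greater* than the upper-bound
# date string: they share the prefix "2017-03-01", and the longer string wins.
print(lo <= ts <= hi)  # False

# Semantically, midnight on 2017-03-01 is within [2017-02-28, 2017-03-01].
ts_val = datetime(2017, 3, 1, 0, 0, 0)
print(date(2017, 2, 28) <= ts_val.date() <= date(2017, 3, 1))  # True
```

If string promotion is indeed what the BETWEEN in the repro compiles to, that would explain a row silently failing a range check it should pass.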