Re: CSV write to S3 failing silently with partial completion

2017-09-27 Thread Mcclintic, Abbi
requirement for our data and wouldn’t solve the problem for our data used outside of Redshift. Hope that helps someone else out if you hit the same issue. -Abbi From: Gourav Sengupta Date: Monday, September 11, 2017 at 6:32 AM To: "Mcclintic, Abbi" Cc: user Subject: Re: CSV write to

Re: CSV write to S3 failing silently with partial completion

2017-09-11 Thread Gourav Sengupta
Hi, Can you please let me know the following: 1. Why are you using Java? 2. The way you are creating the Spark cluster 3. The way you are initiating the Spark session or context 4. Are you able to query the data that is written to S3 using a Spark DataFrame and validate that the number of rows in the

Re: CSV write to S3 failing silently with partial completion

2017-09-08 Thread Steve Loughran
On 7 Sep 2017, at 18:36, Mcclintic, Abbi <ab...@amazon.com> wrote: Thanks all – couple notes below. Generally all our partitions are of equal size (i.e., on a normal day in this particular case I see 10 equally sized partitions of 2.8 GB). We see the problem with repartitioning and withou

Re: CSV write to S3 failing silently with partial completion

2017-09-07 Thread Mcclintic, Abbi
y not use the JDBC driver? -Original Message- From: abbim [mailto:ab...@amazon.com] Sent: Thursday, September 07, 2017 1:02 AM To: user@spark.apache.org Subject: CSV write to S3 failing silently with partial completion Hi all, My te

Re: CSV write to S3 failing silently with partial completion

2017-09-07 Thread Patrick Alwell
BC driver? -Original Message- From: abbim [mailto:ab...@amazon.com] Sent: Thursday, September 07, 2017 1:02 AM To: user@spark.apache.org Subject: CSV write to S3 failing silently with partial completion Hi all, My team has been experiencing a

RE: CSV write to S3 failing silently with partial completion

2017-09-07 Thread JG Perrin
...@amazon.com] Sent: Thursday, September 07, 2017 1:02 AM To: user@spark.apache.org Subject: CSV write to S3 failing silently with partial completion Hi all, My team has been experiencing a recurring unpredictable bug where only a partial write to CSV in S3 on one partition of our Dataset is performed. For

CSV write to S3 failing silently with partial completion

2017-09-06 Thread abbim
Hi all, My team has been experiencing a recurring unpredictable bug where only a partial write to CSV in S3 on one partition of our Dataset is performed. For example, in a Dataset of 10 partitions written to CSV in S3, we might see 9 of the partitions as 2.8 GB in size, but one of them as 1.6 GB. H
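
For anyone hitting the same symptom, a rough way to catch a short partition after the job finishes is to compare each part file's size against the median. The sketch below is only an illustration, not something posted in the thread. It assumes partitions are normally close to equal in size, as described above. The file names, the `find_undersized_parts` helper and the 25% tolerance are all hypothetical. The sizes would come from an S3 listing, which is not shown here.

```python
from statistics import median


def find_undersized_parts(part_sizes, tolerance=0.25):
    """Return the names of part files that are suspiciously small.

    part_sizes: dict mapping part-file name -> size in bytes
                (hypothetical input; in practice taken from an S3 listing).
    tolerance:  fraction below the median size at which a file is flagged
                (0.25 is an arbitrary illustrative value).
    """
    if not part_sizes:
        return []
    typical = median(part_sizes.values())
    threshold = typical * (1 - tolerance)
    # Flag every file strictly smaller than the threshold.
    return sorted(name for name, size in part_sizes.items() if size < threshold)


GB = 1024 ** 3

# Mirror the example from the post: nine parts of 2.8 GB and one of 1.6 GB.
sizes = {f"part-{i:05d}.csv": int(2.8 * GB) for i in range(9)}
sizes["part-00009.csv"] = int(1.6 * GB)

suspects = find_undersized_parts(sizes)
print(suspects)  # ['part-00009.csv']
```

A check like this only detects the short write. It says nothing about the cause, and it would give false alarms on data that is genuinely skewed across partitions.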