Re: Dataframe, Spark SQL - Drops First 8 Characters of String on Amazon EMR

2016-01-29 Thread Daniel Darabos

Hi Andrew, If you still see this with Spark 1.6.0, it would be very helpful if you could file a bug about it at https://issues.apache.org/jira/browse/SPARK with as much detail as you can. This issue could be a nasty source of silent data corruption in a case where some intermediate data loses 8 ch

Re: Dataframe, Spark SQL - Drops First 8 Characters of String on Amazon EMR

2016-01-28 Thread Jonathan Kelly

Just FYI, Spark 1.6 was released on emr-4.3.0 a couple days ago: https://aws.amazon.com/blogs/aws/emr-4-3-0-new-updated-applications-command-line-export/

On Thu, Jan 28, 2016 at 7:30 PM Andrew Zurn wrote:
> Hey Daniel,
>
> Thanks for the response.
>
> After playing around for a bit, it looks like

Re: Dataframe, Spark SQL - Drops First 8 Characters of String on Amazon EMR

2016-01-28 Thread Andrew Zurn

Hey Daniel, Thanks for the response. After playing around for a bit, it looks like it's probably something similar to the first situation you mentioned, with the Parquet format causing issues. Both a programmatically created dataset and a dataset pulled off the internet (rather than out of S3 a

Re: Dataframe, Spark SQL - Drops First 8 Characters of String on Amazon EMR

2016-01-26 Thread Daniel Darabos

Have you tried setting spark.emr.dropCharacters to a lower value? (It defaults to 8.) :) Just joking, sorry! Fantastic bug. What data source do you have for this DataFrame? I could imagine for example that it's a Parquet file and on EMR you are running with the wrong version of the Parquet librar

Re: Dataframe, Spark SQL - Drops First 8 Characters of String on Amazon EMR

2016-01-25 Thread awzurn

Sorry for the bump, but wondering if anyone else has seen this before. We're hoping to either resolve this soon, or move on with further steps to move this into an issue. Thanks in advance, Andrew Zurn
