Spark 2.0 issue

2016-09-29 Thread Ashish Shrowty
If I try to inner-join two dataframes that originated from the same initial dataframe, which was loaded using a spark.sql() call, it results in an error - // reading from Hive .. the data is stored in Parquet format in Amazon S3 val d1 = spark.sql("select * from ") val df1 = d1.groupBy("
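The code in this snippet is cut off by the archive. As a rough spark-shell sketch of the pattern being described (the table and column names below are hypothetical placeholders, not taken from the original post), the failure mode is two aggregates derived from the same parent DataFrame being inner-joined back together:

  // Hive-backed table stored as Parquet in S3; names are illustrative only.
  import org.apache.spark.sql.functions._

  val d1  = spark.sql("select * from sales")
  val df1 = d1.groupBy("companyid").agg(avg("loyaltyscore").as("avg_score"))
  val df2 = d1.groupBy("companyid").agg(count("*").as("cnt"))

  // Joining the two derived DataFrames is the step reported to fail in Spark 2.0.
  val joined = df1.join(df2, Seq("companyid"))
  joined.show()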

Re: Spark 1.5.1+Hadoop2.6 .. unable to write to S3 (HADOOP-12420)

2015-10-22 Thread Ashish Shrowty
Thanks Steve. I built it from source. On Thu, Oct 22, 2015 at 4:01 PM Steve Loughran wrote: > On 22 Oct 2015, at 15:12, Ashish Shrowty wrote: >> I understand that there is some incompatibility with the API between Hadoop 2.6/2.7 and Am

Spark 1.5.1+Hadoop2.6 .. unable to write to S3 (HADOOP-12420)

2015-10-22 Thread Ashish Shrowty
I understand that there is some incompatibility with the API between Hadoop 2.6/2.7 and the Amazon AWS SDK, where the signature of com.amazonaws.services.s3.transfer.TransferManagerConfiguration.setMultipartUploadThreshold was changed. The JIRA indicates that this would be fixed in Hadoop 2.8. (https://
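For context, a workaround often suggested for this mismatch (sketched here rather than quoted from the thread) is to launch Spark with the AWS SDK version that hadoop-aws 2.6/2.7 was compiled against and write through the s3a filesystem; the bucket paths below are hypothetical:

  // spark-shell --packages org.apache.hadoop:hadoop-aws:2.7.1,com.amazonaws:aws-java-sdk:1.7.4
  sc.hadoopConfiguration.set("fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
  sc.hadoopConfiguration.set("fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))

  val df = sqlContext.read.parquet("s3a://some-bucket/input/")  // hypothetical path
  df.write.parquet("s3a://some-bucket/output/")                 // the kind of write reported to fail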

Re: Spark shell and StackOverFlowError

2015-08-31 Thread Ashish Shrowty
he above step. > Cheers > On Mon, Aug 31, 2015 at 9:42 AM, Ashish Shrowty wrote: >> Sure .. here it is (scroll below to see the NotSerializableException). >> Note that upstream, I do load up the (user,item,ratings) data from a file using ObjectI

Re: Spark shell and StackOverFlowError

2015-08-31 Thread Ashish Shrowty
PM Ted Yu wrote: > Ashish: > Can you post the complete stack trace for NotSerializableException? > Cheers > On Mon, Aug 31, 2015 at 8:49 AM, Ashish Shrowty wrote: >> bcItemsIdx is just a broadcast variable constructed out of Array[(String)] .. it

Re: Spark shell and StackOverFlowError

2015-08-31 Thread Ashish Shrowty
re serializing a stream somewhere. I'd look at what's inside bcItemsIdx as that is not shown here. > On Mon, Aug 31, 2015 at 3:34 PM, Ashish Shrowty wrote: >> Sean, >> Thanks for your comments. What I was really trying to do was to transform a

Re: Spark shell and StackOverFlowError

2015-08-30 Thread Ashish Shrowty
Do you think I should create a JIRA? On Sun, Aug 30, 2015 at 12:56 PM Ted Yu wrote: > I got StackOverFlowError as well :-( > On Sun, Aug 30, 2015 at 9:47 AM, Ashish Shrowty wrote: >> Yep .. I tried that too earlier. Doesn't make a difference. Are you able

Re: Spark shell and StackOverFlowError

2015-08-30 Thread Ashish Shrowty
guide.html#broadcast-variables > Cheers > On Sun, Aug 30, 2015 at 8:54 AM, Ashish Shrowty wrote: >> @Sean - Agree that there is no action, but I still get the stackoverflowerror, it's very weird >> @Ted - Variable a is just an int - val a = 10 .
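Since the thread is heavily truncated in this archive, the following spark-shell sketch only illustrates the pattern under discussion: an Array[String] item index wrapped in a broadcast variable (bcItemsIdx) and a plain Int (val a = 10) referenced inside a transformation. The ratings data and its shape are hypothetical stand-ins.

  // Broadcast the item index so the closure only captures the broadcast handle.
  val itemsIdx: Array[String] = Array("item-1", "item-2", "item-3")
  val bcItemsIdx = sc.broadcast(itemsIdx)
  val a = 10

  // (user, item, rating) triples; in the thread these were loaded from a file.
  val ratings = sc.parallelize(Seq(("u1", "item-1", 4.0), ("u2", "item-3", 2.5)))

  // Only locally defined vals and the broadcast value are used inside the closure;
  // capturing other objects defined earlier in the shell is what tends to pull
  // extra REPL state into the serialized task.
  val indexed = ratings.map { case (user, item, rating) =>
    (user, bcItemsIdx.value.indexOf(item), rating * a)
  }
  indexed.collect().foreach(println)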