If I inner-join two DataFrames that were both derived from the same initial
DataFrame, itself loaded with a spark.sql() call, I get an error:
// reading from Hive .. the data is stored in Parquet format in Amazon S3
val d1 = spark.sql("select * from ")
val df1 =
d1.groupBy("
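The error itself is cut off in the snippet above, so what follows is only a sketch of one commonly suggested workaround for joins between DataFrames derived from the same parent: give each side an explicit alias and refer to the join columns through those aliases. The column names (key, value), the two aggregations, and the local SparkSession are assumptions made for illustration; none of them come from the original message.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, count, sum}

object SelfJoinSketch {
  // Builds two aggregates from the same parent DataFrame and joins them.
  // "key" and "value" are hypothetical column names.
  def joinAggregates(d1: DataFrame): DataFrame = {
    // Alias each derived DataFrame so the join condition is unambiguous.
    val totals = d1.groupBy("key").agg(sum("value").as("total")).as("t")
    val counts = d1.groupBy("key").agg(count("value").as("n")).as("c")

    totals
      .join(counts, col("t.key") === col("c.key"), "inner")
      .select(col("t.key"), col("total"), col("n"))
  }

  def main(args: Array[String]): Unit = {
    // Local session for the sketch only; the original reads from Hive.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("self-join-sketch")
      .getOrCreate()
    import spark.implicits._

    val d1 = Seq(("a", 1), ("a", 2), ("b", 3)).toDF("key", "value")
    joinAggregates(d1).show()
    spark.stop()
  }
}
```

If the join columns share a name, passing the name itself, as in df1.join(df2, Seq("key")), is another option that avoids referring to either side's column object.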
Thanks Steve. I built it from source.
On Thu, Oct 22, 2015 at 4:01 PM Steve Loughran
wrote:
>
> > On 22 Oct 2015, at 15:12, Ashish Shrowty
> wrote:
> >
> > I understand that there is some incompatibility with the API between
> Hadoop
> > 2.6/2.7 and Am
I understand that there is an API incompatibility between Hadoop 2.6/2.7 and
the AWS SDK: the SDK changed the signature of
com.amazonaws.services.s3.transfer.TransferManagerConfiguration.setMultipartUploadThreshold.
The JIRA indicates that this would be fixed in Hadoop 2.8.
(https://
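One way to see which variant of the method is actually on the classpath is to ask the JVM through reflection. The sketch below is an illustration only: the helper name signatures is hypothetical, and the code makes no claim about which parameter type any particular SDK release uses. It simply prints whatever it finds, or reports that the class is missing.

```scala
object SdkSignatureCheck {
  /** Parameter lists of every public method called `methodName` on `className`.
    * Returns an empty list if the class is not on the classpath. */
  def signatures(className: String, methodName: String): List[String] =
    try {
      Class.forName(className).getMethods.toList
        .filter(_.getName == methodName)
        .map(_.getParameterTypes.map(_.getName).mkString("(", ", ", ")"))
    } catch {
      case _: ClassNotFoundException => Nil
    }

  def main(args: Array[String]): Unit = {
    val found = signatures(
      "com.amazonaws.services.s3.transfer.TransferManagerConfiguration",
      "setMultipartUploadThreshold")

    if (found.isEmpty) println("class or method not found on the classpath")
    else found.foreach(sig => println(s"setMultipartUploadThreshold$sig"))
  }
}
```

Running this with the same classpath as the Spark job shows the signature the job will actually link against, which can then be compared with the one named in a NoSuchMethodError.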
the above step.
>
> Cheers
>
> On Mon, Aug 31, 2015 at 9:42 AM, Ashish Shrowty
> wrote:
>
>> Sure .. here it is (scroll below to see the NotSerializableException).
>> Note that upstream, I do load up the (user,item,ratings) data from a file
>> using ObjectI
PM Ted Yu wrote:
> Ashish:
> Can you post the complete stack trace for NotSerializableException ?
>
> Cheers
>
> On Mon, Aug 31, 2015 at 8:49 AM, Ashish Shrowty
> wrote:
>
>> bcItemsIdx is just a broadcast variable constructed out of
>> Array[(String)] .. it
re serializing a stream somewhere. I'd look at what's inside
> bcItemsIdx as that is not shown here.
>
> On Mon, Aug 31, 2015 at 3:34 PM, Ashish Shrowty
> wrote:
> > Sean,
> >
> > Thanks for your comments. What I was really trying to do was to
> transform a
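The NotSerializableException discussed in this thread can be reproduced outside Spark, which helps narrow down what an object graph is dragging along. The sketch below assumes plain Java serialization (Spark's default serializer). The WithStream class and the helper name firstSerializationFailure are hypothetical and only stand in for "a value that holds a stream somewhere"; they are not taken from the original code.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream,
  NotSerializableException, ObjectOutputStream}

object SerializationCheck {
  // Hypothetical holder that keeps a live stream next to the data.
  final case class WithStream(items: Array[String], in: InputStream)

  /** Tries to Java-serialize `value`. Returns None on success, or the
    * exception message (the offending class name) on failure. */
  def firstSerializationFailure(value: AnyRef): Option[String] = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try {
      out.writeObject(value)
      None
    } catch {
      case e: NotSerializableException => Some(e.getMessage)
    } finally {
      out.close()
    }
  }

  def main(args: Array[String]): Unit = {
    val items = Array("item1", "item2")

    // A plain Array[String] serializes without trouble.
    println(firstSerializationFailure(items))

    // The same data wrapped together with a stream does not.
    val wrapped = WithStream(items, new ByteArrayInputStream(Array.emptyByteArray))
    println(firstSerializationFailure(wrapped))
  }
}
```

Applying a check like this to the value behind a broadcast variable, and to anything a closure captures, points at the class that needs to be dropped or marked @transient.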
Do you think I should create a JIRA?
On Sun, Aug 30, 2015 at 12:56 PM Ted Yu wrote:
> I got StackOverflowError as well :-(
>
> On Sun, Aug 30, 2015 at 9:47 AM, Ashish Shrowty
> wrote:
>
>> Yep .. I tried that too earlier. Doesn't make a difference. Are you able
>>
guide.html#broadcast-variables
>
> Cheers
>
> On Sun, Aug 30, 2015 at 8:54 AM, Ashish Shrowty
> wrote:
>
>> @Sean - Agreed that there is no action, but I still get the
>> StackOverflowError, it's very weird
>>
>> @Ted - Variable a is just an int - val a = 10 .