I think we can do it using the greatest function.
Closing this ticket!
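For reference, a minimal sketch of that approach (assuming the columns are
named Marks1, Marks2, Marks3 as in the question below):

    import org.apache.spark.sql.functions.{col, greatest}

    // greatest() returns the row-wise maximum of the given columns,
    // skipping nulls; here it yields the desired Max column.
    val withMax = df.withColumn("Max",
      greatest(col("Marks1"), col("Marks2"), col("Marks3")))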
On Mon, Jun 7, 2021 at 2:43 AM kushagra deep wrote:
> Hi Guys,
>
> I have a problem where I have a df as below:
>
> Marks1 | Marks2 | Marks3
> 10     | 30     | 40
>
> and I want to add a Max column holding the row-wise maximum:
>
> Marks1 | Marks2 | Marks3 | Max
> 10     | 30     | 40     | 40
>
> Thanks in advance
>
> Reg,
> Kushagra Deep
Thanks a lot Mich, this works, though I have to test it for scalability.
I have one question though: if we don't specify any column in partitionBy,
will it shuffle all the records onto one executor? Because this is what
seems to be happening.
Thanks once again !
Regards
Kushagra Deep
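For reference, this matches the documented behavior of window functions: a
window spec with no partitionBy moves the entire dataset into a single
partition (Spark logs a "No Partition Defined for Window operation" warning).
A minimal sketch, with hypothetical column names:

    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions.row_number

    // No partitionBy: Spark pulls all rows into one partition to
    // evaluate the window, which serializes the work on one task.
    val globalWindow = Window.orderBy("id")
    val numbered = df.withColumn("rn", row_number().over(globalWindow))

    // With partitionBy, each partition's window is computed
    // independently, so the work spreads across executors.
    val perGroup = Window.partitionBy("group").orderBy("id")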
On Tue, May 18, 2021, Mich Talebzadeh wrote:
> So you have two DataFrames, each with one column, and you want to UNION them
> in a certain way, but the correlation is not known. In other words, this
> UNION is as is?
>
> amount_6m | amount_9m
> 100       | 500
> 200       | 600
>
> HTH
>
>
> On Wed, 12 May 2021 at 13:51, kushagra deep wrote:
logical Spark partitions with the same cardinality for each partition?
Reg,
Kushagra Deep
On Wed, May 12, 2021, 21:00 Raghavendra Ganesh wrote:
> You can add an extra id column and perform an inner join.
>
> val df1_with_id = df1.withColumn("id", monotonically_increasing_id())
amount_6m | amount_9m
100       | 500
200       | 600
300       | 700
400       | 800
500       | 900
Thanks in advance
Reg,
Kushagra Deep
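Filling out the suggestion above into a runnable sketch (df1 and df2 stand
for the single-column amount_6m and amount_9m DataFrames; the caveat in the
comments is worth noting):

    import org.apache.spark.sql.functions.monotonically_increasing_id

    // Tag each row of both DataFrames with a generated id, then join.
    // Caveat: monotonically_increasing_id() only lines up row-for-row
    // across the two DataFrames when both have identical partitioning,
    // so the pairing is positional only under that assumption.
    val df1WithId = df1.withColumn("id", monotonically_increasing_id())
    val df2WithId = df2.withColumn("id", monotonically_increasing_id())
    val combined  = df1WithId.join(df2WithId, Seq("id")).drop("id")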
Hi all,
I just wanted to know: when we create a 'createOrReplaceTempView' on a
Spark dataset, where does the view reside? Does all the data come to the
driver and the view is created there? Or do individual executors hold
parts of the view (based on the data each executor has), so that when
Kushagra Deep
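For what it's worth, createOrReplaceTempView only registers the Dataset's
logical plan under a name in the session catalog; it is metadata, so no data
is collected to the driver and nothing is materialized until a query over
the view runs. A minimal sketch:

    // Registers the logical plan under the name "people" in the
    // session-local catalog; no rows move anywhere at this point.
    df.createOrReplaceTempView("people")

    // Execution happens only when a query uses the view, and it runs
    // distributed across the executors like any other Dataset query.
    spark.sql("SELECT COUNT(*) FROM people").show()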
From: Mich Talebzadeh
Date: Monday, 12 October 2020 at 11:23 PM
To: Santosh74
Cc: "user @spark"
Subject: Re: Spark as computing engine vs spark cluster
Hi Santosh,
Generally speaking, there are two ways of making a process faster:
1. Do more intelligent work by creati
Hi,
I have a use case where I have to cogroup two streams using cogroup in
streaming. However, when I do so I get an exception that “Cogrouping in
streaming is not supported in DataFrame/Dataset”. Please clarify.
Regards,
Kushagra Deep
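A hypothetical sketch reproducing the limitation (source names and the key
function are illustrative): Spark's unsupported-operations check rejects
cogroup when the inputs are streaming Datasets, while the same code works on
static (batch) Datasets.

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder.appName("cogroup-demo").getOrCreate()
    import spark.implicits._

    // Two streaming Datasets from the built-in rate source.
    val s1 = spark.readStream.format("rate").load().select($"value").as[Long]
    val s2 = spark.readStream.format("rate").load().select($"value").as[Long]

    val grouped1 = s1.groupByKey(v => v % 10)
    val grouped2 = s2.groupByKey(v => v % 10)

    // For streaming inputs this query is rejected when it is analyzed at
    // query start, as reported above; on batch Datasets it runs fine.
    val cogrouped = grouped1.cogroup(grouped2) { (key, it1, it2) =>
      Iterator(key -> (it1.size, it2.size))
    }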