that DStreams are some sort of different type of RDDs
From: Tathagata Das [mailto:t...@databricks.com]
Sent: Wednesday, April 15, 2015 11:11 PM
To: Evo Eftimov
Cc: user
Subject: Re: RAM management during cogroup and join
Well, DStream joins are nothing but RDD joins at their core. However
> *From:* Tathagata Das [mailto:t...@databricks.com]
> *Sent:* Wednesday, April 15, 2015 9:48 PM
>
> *To:* Evo Eftimov
> *Cc:* user
> *Subject:* Re: RAM management during cogroup and join
>
>
>
> Agreed.
>
>
>
> On Wed, Apr 15, 2015 at 1:29 PM, Evo Eftimov
Subject: Re: RAM management during cogroup and join
Agreed.
On Wed, Apr 15, 2015 at 1:29 PM, Evo Eftimov wrote:
That has been done, Sir, and represents further optimizations – the objective
here was to confirm whether cogroup always results in the previously described
“greedy” explosion of
5 9:25 PM
> *To:* Evo Eftimov
> *Cc:* user
> *Subject:* Re: RAM management during cogroup and join
>
>
>
> Significant optimizations can be made by doing the joining/cogroup in a
> smart way. If you have to join streaming RDDs with the same batch RDD, then
> you can first par
change the total number of elements
included in the result RDD and RAM allocated – right?
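
To make the point about result size concrete, here is a minimal sketch in plain Python (not the Spark API) of a join built on top of a cogroup, run with and without pre-partitioning by key. The function names cogroup, join, partition_by_key and partitioned_join, and the sample data, are hypothetical and exist only for illustration. Under these assumptions, partitioning changes where the work happens, while the number of elements in the joined result stays the same: for each key, the number of left values times the number of right values.

```python
from collections import defaultdict
from itertools import product


def cogroup(left, right):
    """Group two lists of (key, value) pairs by key.

    Returns {key: (left_values, right_values)}. All values for a key are
    held in memory together, which is the cost discussed in the thread.
    """
    grouped = defaultdict(lambda: ([], []))
    for key, value in left:
        grouped[key][0].append(value)
    for key, value in right:
        grouped[key][1].append(value)
    return dict(grouped)


def join(left, right):
    """Inner join expressed on top of cogroup: a per-key cross product."""
    return [
        (key, pair)
        for key, (left_values, right_values) in cogroup(left, right).items()
        for pair in product(left_values, right_values)
    ]


def partition_by_key(pairs, num_partitions):
    """Hash-partition (key, value) pairs so equal keys share a partition."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        partitions[hash(key) % num_partitions].append((key, value))
    return partitions


def partitioned_join(left, right, num_partitions):
    """Join partition i of left only with partition i of right."""
    result = []
    left_parts = partition_by_key(left, num_partitions)
    right_parts = partition_by_key(right, num_partitions)
    for left_part, right_part in zip(left_parts, right_parts):
        result.extend(join(left_part, right_part))
    return result


# Hypothetical sample data: one streaming batch and one static batch.
stream_batch = [("a", 1), ("a", 2), ("b", 3)]
static_batch = [("a", "x"), ("a", "y"), ("c", "z")]

plain = join(stream_batch, static_batch)
partitioned = partitioned_join(stream_batch, static_batch, 3)
print(len(plain), len(partitioned))
```

Both joins yield the same pairs; only key "a" appears on both sides, so keys "b" and "c" contribute nothing to the inner join even though cogroup still materialises their value lists.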
From: Tathagata Das [mailto:t...@databricks.com]
Sent: Wednesday, April 15, 2015 9:25 PM
To: Evo Eftimov
Cc: user
Subject: Re: RAM management during cogroup and join
Significant optimizations can be made
oins are
> based on cogroup then even though they can present a prettier picture in
> terms of the end result visible to the end user does that mean that under
> the hood there is still the same atrocious RAM consumption going on
>
>
>
>
> --
> View this message in context:
le.com/RAM-management-during-cogroup-and-join-tp22505.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.