subject:"RE\: RAM management during cogroup and join"

RE: RAM management during cogroup and join

2015-04-15 Thread Evo Eftimov

that DStreams are some sort of different type of RDDs

From: Tathagata Das [mailto:t...@databricks.com]
Sent: Wednesday, April 15, 2015 11:11 PM
To: Evo Eftimov
Cc: user
Subject: Re: RAM management during cogroup and join

Well, DStream joins are nothing but RDD joins at its core. However

Re: RAM management during cogroup and join

2015-04-15 Thread Tathagata Das

> *From:* Tathagata Das [mailto:t...@databricks.com]
> *Sent:* Wednesday, April 15, 2015 9:48 PM
> *To:* Evo Eftimov
> *Cc:* user
> *Subject:* Re: RAM management during cogroup and join
>
> Agreed.
>
> On Wed, Apr 15, 2015 at 1:29 PM, Evo Eftimov

RE: RAM management during cogroup and join

2015-04-15 Thread Evo Eftimov

Subject: Re: RAM management during cogroup and join

Agreed.

On Wed, Apr 15, 2015 at 1:29 PM, Evo Eftimov wrote:

That has been done Sir and represents further optimizations – the objective here was to confirm whether cogroup always results in the previously described “greedy” explosion of

Re: RAM management during cogroup and join

2015-04-15 Thread Tathagata Das

5 9:25 PM
> *To:* Evo Eftimov
> *Cc:* user
> *Subject:* Re: RAM management during cogroup and join
>
> Significant optimizations can be made by doing the joining/cogroup in a smart way. If you have to join streaming RDDs with the same batch RDD, then you can first par

RE: RAM management during cogroup and join

2015-04-15 Thread Evo Eftimov

change the total number of elements included in the result RDD and RAM allocated – right?

From: Tathagata Das [mailto:t...@databricks.com]
Sent: Wednesday, April 15, 2015 9:25 PM
To: Evo Eftimov
Cc: user
Subject: Re: RAM management during cogroup and join

Significant optimizations can be made

Re: RAM management during cogroup and join

2015-04-15 Thread Tathagata Das

Significant optimizations can be made by doing the joining/cogroup in a smart way. If you have to join streaming RDDs with the same batch RDD, then you can first partition the batch RDDs using a partitioner and cache it, and then use the same partitioner on the streaming RDDs. That would make sure t
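
The idea in the message above (partition the batch side once, keep it cached, and apply the same partitioning function to every streaming batch) can be illustrated with a small sketch. This is plain Python and not the Spark API: `partition_of`, `partition_pairs`, `join_with_cached`, the partition count, and the sample data are all hypothetical names and values chosen for illustration. It only shows why reusing one partitioning function means the batch data never has to be regrouped per micro-batch.

```python
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count, chosen for illustration


def partition_of(key, num_partitions=NUM_PARTITIONS):
    """Hypothetical stand-in for a hash partitioner: key -> partition index."""
    return hash(key) % num_partitions


def partition_pairs(pairs, num_partitions=NUM_PARTITIONS):
    """Group (key, value) pairs into per-partition dicts of key -> [values]."""
    partitions = [defaultdict(list) for _ in range(num_partitions)]
    for key, value in pairs:
        partitions[partition_of(key, num_partitions)][key].append(value)
    return partitions


def join_with_cached(cached_batch, stream_pairs):
    """Join one streaming micro-batch against the pre-partitioned batch data.

    Only the streaming pairs are partitioned here; the batch side is reused
    exactly as it was laid out when it was first partitioned.
    """
    joined = []
    stream_parts = partition_pairs(stream_pairs, len(cached_batch))
    # Matching keys always land in the same partition index on both sides,
    # so each partition can be joined independently of the others.
    for batch_part, stream_part in zip(cached_batch, stream_parts):
        for key, stream_values in stream_part.items():
            for batch_value in batch_part.get(key, []):
                for stream_value in stream_values:
                    joined.append((key, (stream_value, batch_value)))
    return joined


# Partition the batch data once and keep it around (the "cache" step).
batch_pairs = [("a", 1), ("b", 2), ("c", 3)]
cached_batch = partition_pairs(batch_pairs)

# Every micro-batch reuses the same partitioning function.
micro_batch_1 = [("a", "x"), ("c", "y"), ("d", "z")]
micro_batch_2 = [("b", "w")]
result_1 = sorted(join_with_cached(cached_batch, micro_batch_1))
result_2 = sorted(join_with_cached(cached_batch, micro_batch_2))
```

Here `cached_batch` is built once and read by both joins, while only the small micro-batches are regrouped each time. In Spark the corresponding steps would use the library's own partitioner and caching facilities rather than these helper functions.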

6 matches
