... receiver and worker nodes should discard the old data?
> On Mon, May 22, 2017 at 5:20 PM, Manish Malhotra <
> manish.malhotra.w...@gmail.com> wrote:
>
>> Thanks Alonso,
>>
>> Sorry, but there are some security reservations.
>>
>> But ...
> 2017-05-20 7:54 GMT+02:00 Manish Malhotra:
>
>> Hello,
>>
Hello,

I have implemented a Java-based custom receiver, which consumes from a
messaging system, say JMS. Once a message is received, I call
store(object) ... I'm storing a Spark Row object.

It runs for around 8 hours and then goes OOM, and the OOM is happening on
the receiver nodes.

I also tried running multiple receivers, but I'm facing the same problem.
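
For context, a receiver of this shape extends Spark's Receiver class and
hands each message to Spark with store(). Below is a minimal Scala sketch
of that pattern (the original is Java-based; the class name and the
pollJmsAndConvertToRow helper are hypothetical, and the JMS plumbing is
omitted):

    import org.apache.spark.sql.Row
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.receiver.Receiver

    // Minimal sketch of a custom receiver that stores Row objects.
    // MEMORY_AND_DISK_SER_2 lets Spark spill received blocks to disk,
    // which can ease memory pressure on the receiver node.
    class JmsRowReceiver extends Receiver[Row](StorageLevel.MEMORY_AND_DISK_SER_2) {

      def onStart(): Unit = {
        new Thread("jms-receiver") {
          override def run(): Unit = receive()
        }.start()
      }

      def onStop(): Unit = {
        // receive() checks isStopped(), so nothing extra is needed here.
      }

      private def receive(): Unit = {
        while (!isStopped()) {
          val row = pollJmsAndConvertToRow() // hypothetical helper
          if (row != null) {
            store(row) // hands the Row to Spark's block manager
          }
        }
      }

      // JMS consumption and Row-conversion details omitted.
      private def pollJmsAndConvertToRow(): Row = null
    }

If the receiver was created with a memory-only storage level, switching to
a serialized, disk-spilling level like the one above is one common first
step when the receiver node itself runs out of memory.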
It's a pretty nice question!

I'll try to understand the problem and see whether I can help further.

When you say CustomRDD, I believe you will be using it in the
transformation stage, once the data is read from a source like HDFS,
Cassandra, or Kafka.

Now, RDD.getPartitions() should return the partitions ...
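
For reference, a custom RDD typically overrides getPartitions to define
its partitions and getPreferredLocations to tell the scheduler which node
holds each partition's data. A minimal Scala sketch, with hypothetical
EdgeRDD/EdgePartition names:

    import org.apache.spark.{Partition, SparkContext, TaskContext}
    import org.apache.spark.rdd.RDD

    // Hypothetical partition that records the host holding its slice of data.
    case class EdgePartition(index: Int, host: String) extends Partition

    class EdgeRDD(sc: SparkContext, hosts: Seq[String])
        extends RDD[String](sc, Nil) {

      // One partition per host; getPartitions defines the parallelism.
      override def getPartitions: Array[Partition] =
        hosts.zipWithIndex.map { case (h, i) =>
          EdgePartition(i, h): Partition
        }.toArray

      // Read this partition's data here (e.g. from HDFS, Cassandra, or Kafka).
      override def compute(split: Partition, context: TaskContext): Iterator[String] =
        Iterator.empty

      // Hint to the scheduler to run each task on the node holding its data.
      override protected def getPreferredLocations(split: Partition): Seq[String] =
        Seq(split.asInstanceOf[EdgePartition].host)
    }

Note that getPreferredLocations is only a scheduling hint; a task can
still run elsewhere if the preferred executor is busy.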
Thanks for sharing the numbers as well!

Nowadays even the network can have very high throughput and might
outperform the disk, but as Sean mentioned, data over the network has
other dependencies, such as network hops: if it goes across racks, there
can be a switch in between.

But yes, people are discussing ...
> ... the partition where the edge data are.
>
> // maropu
>
>
> On Tue, Nov 15, 2016 at 5:19 AM, Manish Malhotra <
> manish.malhotra.w...@gmail.com> wrote:
>
Sending again. Any help is appreciated!

Thanks in advance.
On Thu, Nov 10, 2016 at 8:42 AM, Manish Malhotra <
manish.malhotra.w...@gmail.com> wrote:
Hello Spark Devs/Users,

I'm trying to solve a use case with Spark Streaming 1.6.2 where, for every
batch (say 2 mins), data needs to go to the same reducer node after
grouping by key.

The underlying storage is Cassandra, not HDFS.

This is a map-reduce job, where I'm also trying to use the partitioner ...
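
One way to route a key to the same partition on every batch is a custom
Partitioner passed to groupByKey or partitionBy. A minimal sketch, with a
hypothetical class name:

    import org.apache.spark.Partitioner

    // Maps each key to a fixed partition id, so the same key lands in the
    // same reducer partition on every batch.
    class StickyKeyPartitioner(parts: Int) extends Partitioner {

      override def numPartitions: Int = parts

      override def getPartition(key: Any): Int = {
        if (key == null) 0
        else {
          val h = key.hashCode % parts
          if (h < 0) h + parts else h // keep the id non-negative
        }
      }

      // Equal partitioners let Spark skip redundant shuffles.
      override def equals(other: Any): Boolean = other match {
        case p: StickyKeyPartitioner => p.numPartitions == parts
        case _                       => false
      }

      override def hashCode: Int = parts
    }

Used, for example, as rdd.groupByKey(new StickyKeyPartitioner(16)) inside
each batch. This makes the partition id stable across batches; whether
that partition is processed on the same physical node additionally depends
on scheduling and data locality (for Cassandra, whatever preferred
locations the connector provides).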