Re: Spark streaming to kafka exactly once

2017-03-23 Thread Maurin Lenglart
OK, thanks for your answers. On 3/22/17, 1:34 PM, "Cody Koeninger" wrote: If you're talking about reading the same message multiple times in a failure situation, see https://github.com/koeninger/kafka-exactly-once If you're talking about producing the same message multip

Re: Spark streaming to kafka exactly once

2017-03-22 Thread Cody Koeninger
If you're talking about reading the same message multiple times in a failure situation, see https://github.com/koeninger/kafka-exactly-once If you're talking about producing the same message multiple times in a failure situation, keep an eye on https://cwiki.apache.org/confluence/display/KAFKA/K
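
For the first case (re-reading messages after a failure), one common approach is to store the results and the consumed offsets in the same transaction, so a replayed batch cannot be applied twice. The following is a minimal sketch under stated assumptions, not code from the linked repository: it uses SQLite as the store, and the table names and the functions `open_store` and `process_batch` are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create the (hypothetical) results and offsets tables."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results "
        "(topic TEXT, part INTEGER, off INTEGER, value TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS offsets "
        "(topic TEXT, part INTEGER, next_offset INTEGER, "
        "PRIMARY KEY (topic, part))"
    )
    conn.commit()
    return conn


def process_batch(conn, topic, partition, messages):
    """Store a batch of (offset, value) pairs and the new offset atomically.

    Messages whose offset is below the stored next_offset were already
    written by an earlier attempt and are skipped.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT next_offset FROM offsets WHERE topic = ? AND part = ?",
            (topic, partition),
        ).fetchone()
        next_offset = row[0] if row else 0
        for offset, value in messages:
            if offset < next_offset:
                continue  # replayed message, already stored
            conn.execute(
                "INSERT INTO results (topic, part, off, value) "
                "VALUES (?, ?, ?, ?)",
                (topic, partition, offset, value),
            )
            next_offset = offset + 1
        conn.execute(
            "INSERT OR REPLACE INTO offsets (topic, part, next_offset) "
            "VALUES (?, ?, ?)",
            (topic, partition, next_offset),
        )
```

On restart, the job would read `next_offset` from the store and resume from there, rather than relying on offsets committed elsewhere.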

Re: Spark streaming to kafka exactly once

2017-03-22 Thread Matt Deaver
You have to handle de-duplication upstream or downstream. It might technically be possible to handle this in Spark but you'll probably have a better time handling duplicates in the service that reads from Kafka. On Wed, Mar 22, 2017 at 1:49 PM, Maurin Lenglart wrote: > Hi, > we are trying to bui
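
Downstream de-duplication, as suggested above, could look roughly like the following in the service that reads from Kafka. This is only a sketch: it assumes the producer attaches a unique ID to every message, and the `Deduplicator` class and `consume` function are hypothetical names, not part of Spark or Kafka.

```python
from collections import OrderedDict


class Deduplicator:
    """Remembers the most recently seen message IDs, up to max_ids."""

    def __init__(self, max_ids=100_000):
        self._seen = OrderedDict()
        self._max_ids = max_ids

    def is_new(self, message_id):
        """Return True the first time an ID is seen, False for a repeat."""
        if message_id in self._seen:
            self._seen.move_to_end(message_id)  # refresh recency
            return False
        self._seen[message_id] = None
        if len(self._seen) > self._max_ids:
            self._seen.popitem(last=False)  # evict the oldest ID
        return True


def consume(messages, dedup):
    """Return the payloads of (message_id, payload) pairs not seen before."""
    return [payload for message_id, payload in messages if dedup.is_new(message_id)]
```

Because the set of remembered IDs is bounded, a duplicate that arrives after its ID has been evicted would pass through; the window size has to cover the longest replay the producer can cause.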

Spark streaming to kafka exactly once

2017-03-22 Thread Maurin Lenglart
Hi, we are trying to build a Spark Streaming solution that subscribes to Kafka and pushes to Kafka, but we are running into the problem of duplicate events. Right now, I am doing a “foreachRDD”, looping over the messages of each partition, and sending those messages to Kafka. Is there any good way of solving tha

Re: Spark Streaming to Kafka

2015-05-19 Thread twinkle sachdeva
Thanks Saisai. On Wed, May 20, 2015 at 11:23 AM, Saisai Shao wrote: > I think here is the PR https://github.com/apache/spark/pull/2994 you > could refer to. > > 2015-05-20 13:41 GMT+08:00 twinkle sachdeva : > >> Hi, >> >> As Spark streaming is being nicely integrated with consuming messages >> f

Re: Spark Streaming to Kafka

2015-05-19 Thread Saisai Shao
I think here is the PR https://github.com/apache/spark/pull/2994 you could refer to. 2015-05-20 13:41 GMT+08:00 twinkle sachdeva : > Hi, > > As Spark streaming is being nicely integrated with consuming messages from > Kafka, so I thought of asking the forum, that is there any implementation > ava

Spark Streaming to Kafka

2015-05-19 Thread twinkle sachdeva
Hi, As Spark Streaming is nicely integrated with consuming messages from Kafka, I thought of asking the forum: is there any implementation available for pushing data to Kafka from Spark Streaming too? Any link(s) will be helpful. Thanks and Regards, Twinkle