If you care about the semantics of those writes to Kafka, you should be aware of two things:

1. There are no transactional writes to Kafka.
2. So, when tasks get re-executed due to any failure, your mapping function will also be re-executed, and the writes to Kafka can happen multiple times.

So you can only get an at-least-once guarantee for those Kafka writes.
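To make the duplicate-write point concrete, here is a rough sketch in plain Python. It is only a simulation: `FakeTopic`, `record_key`, `mapping_function` and `consume_deduplicated` are hypothetical names I made up for illustration, not Spark or Kafka APIs. It assumes each record can be given a deterministic key (here user id plus batch time), so that a downstream reader can drop replays caused by task re-execution.

```python
class FakeTopic:
    """In-memory stand-in for a Kafka topic (NOT a real Kafka client)."""

    def __init__(self):
        self.records = []  # every send is appended, duplicates included

    def send(self, key, value):
        self.records.append((key, value))


def record_key(user_id, batch_time):
    """Deterministic key: the same input always produces the same key."""
    return f"{user_id}:{batch_time}"


def mapping_function(topic, user_id, batch_time, minutes):
    """Hypothetical stand-in for the function passed to mapWithState.

    The write is a side effect, so it happens again on every re-execution.
    """
    topic.send(record_key(user_id, batch_time), minutes)


def consume_deduplicated(topic):
    """Downstream reader: keep the first value per key, drop replays."""
    seen = {}
    for key, value in topic.records:
        seen.setdefault(key, value)
    return seen


topic = FakeTopic()
mapping_function(topic, "user-1", 1000, 12)
mapping_function(topic, "user-1", 1000, 12)  # same task re-executed after a failure
mapping_function(topic, "user-2", 1000, 7)

deduped = consume_deduplicated(topic)
print(len(topic.records))  # 3 records written: at-least-once
print(len(deduped))        # 2 distinct records after deduplication
```

The write itself is still at-least-once. The deduplication only works if the key is derived purely from the input, so a replayed task produces exactly the same key.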

On Mon, Jan 30, 2017 at 10:02 AM, shyla deshpande <deshpandesh...@gmail.com> wrote:

> Hello,
>
> TD, your suggestion works great. Thanks
>
> I have 1 more question, I need to write to Kafka from within the
> mapWithState function. Just wanted to check if this is a bad pattern in any
> way.
>
> Thank you.
>
> On Sat, Jan 28, 2017 at 9:14 AM, shyla deshpande <deshpandesh...@gmail.com>
> wrote:
>
>> That's a great idea. I will try that. Thanks.
>>
>> On Sat, Jan 28, 2017 at 2:35 AM, Tathagata Das <
>> tathagata.das1...@gmail.com> wrote:
>>
>>> 1 state object for each user.
>>> union both streams into a single DStream, and apply mapWithState on it
>>> to update the user state.
>>>
>>> On Sat, Jan 28, 2017 at 12:30 AM, shyla deshpande <
>>> deshpandesh...@gmail.com> wrote:
>>>
>>>> Can multiple DStreams manipulate a state? I have a stream that gives
>>>> me total minutes the user spent on a course material. I have another
>>>> stream that gives me chapters completed and lessons completed by the user.
>>>> I want to keep track for each user total_minutes, chapters_completed and
>>>> lessons_completed. I am not sure if I should have 1 state or 2 states. Can
>>>> I look up the state for a given key just like a map outside the map function?
>>>>
>>>> Appreciate your input. Thanks