People may be calling commit from listeners or who knows where; the point is that it's not thread-safe. If it's really important to you, it should be pretty straightforward to hack on it and commit immediately at your own risk. There is a check for concurrent access in the consumer, so in the worst case you should get an exception.
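To make the deferral concrete, here is a rough hand-written sketch of the pattern (not the actual Spark source; the class and names below are illustrative): the user-facing commitAsync() only enqueues offset ranges, and the queue is drained on the consumer's own thread when compute() runs for the next batch, so the non-thread-safe KafkaConsumer is only ever touched from one thread.

import java.util.concurrent.ConcurrentLinkedQueue
import java.util.{HashMap => JHashMap}

import org.apache.kafka.clients.consumer.{KafkaConsumer, OffsetAndMetadata}
import org.apache.kafka.common.TopicPartition

// Illustrative stand-in for what the direct stream does internally.
class DeferredCommitter(consumer: KafkaConsumer[String, String]) {

  // User-facing commitAsync only enqueues; safe to call from any thread.
  private val queue = new ConcurrentLinkedQueue[(TopicPartition, Long)]

  def commitAsync(tp: TopicPartition, untilOffset: Long): Unit =
    queue.add((tp, untilOffset))

  // Called from compute(), i.e. the same thread that does poll()/seek(),
  // so the non-thread-safe consumer is only touched from one thread.
  def commitAll(): Unit = {
    val toCommit = new JHashMap[TopicPartition, OffsetAndMetadata]()
    var head = queue.poll()
    while (head != null) {
      val (tp, off) = head
      val prev = toCommit.get(tp)
      // Keep the highest requested offset per partition.
      val next = if (prev == null) off else math.max(prev.offset, off)
      toCommit.put(tp, new OffsetAndMetadata(next))
      head = queue.poll()
    }
    if (!toCommit.isEmpty) consumer.commitAsync(toCommit, null)
  }
}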
On Sat, Oct 8, 2016 at 9:18 PM, Srikanth <srikanth...@gmail.com> wrote:
> If I call commit in foreachRDD at the end of a batch, is there still a
> possibility of another thread using the same consumer? Assuming I've not
> configured the scheduler to run parallel jobs.
>
> On Oct 8, 2016 8:39 PM, "Cody Koeninger" <c...@koeninger.org> wrote:
>>
>> The underlying Kafka consumer isn't thread-safe. Calling the actual
>> commit in compute() means it's called on the same thread as the other
>> consumer calls.
>>
>> Using Kafka as an offset store only works correctly with idempotent
>> datastore writes anyway, so the question of when the commit happens
>> shouldn't be an issue.
>>
>> On Sat, Oct 8, 2016 at 7:25 PM, Srikanth <srikanth...@gmail.com> wrote:
>> > Hello,
>> >
>> > The Spark Streaming Kafka 0.10 integration provides an option to commit
>> > offsets to Kafka using the commitAsync() API.
>> > This only records the offset commit request. The actual commit is
>> > performed in compute() after the RDD for the next batch is created.
>> > Why is this so? Why not do the commit right when the API is called?
>> > The commit process itself is async anyway, with an option to provide a
>> > callback handler.
>> >
>> > This adds a window where the application has requested a commit but it
>> > is not yet recorded in Kafka's internal topic.
>> > Any failure during that window will cause the last batch to be
>> > recomputed.
>> >
>> > My app sinks to an external store that can't be idempotent, so the
>> > operations are assumed to be at least once.
>> > This seems to be one place where duplicates can be reduced.
>> >
>> > Srikanth
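For anyone reading this thread later, the usage pattern under discussion (committing at the end of each batch from foreachRDD) looks roughly like the sketch below. Here `stream` is assumed to be the InputDStream returned by KafkaUtils.createDirectStream, and writeToSink is a placeholder for the non-idempotent output step.

import org.apache.kafka.clients.consumer.{OffsetAndMetadata, OffsetCommitCallback}
import org.apache.kafka.common.TopicPartition
import org.apache.spark.streaming.kafka010.{CanCommitOffsets, HasOffsetRanges}

stream.foreachRDD { rdd =>
  // Capture the offset ranges for this batch before any shuffle.
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges

  writeToSink(rdd)  // placeholder; at-least-once, may be replayed on failure

  // Only enqueues the ranges; the actual Kafka commit happens later, on the
  // consumer's thread, when the RDD for the next batch is created.
  stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges,
    new OffsetCommitCallback {
      override def onComplete(
          offsets: java.util.Map[TopicPartition, OffsetAndMetadata],
          e: Exception): Unit = {
        if (e != null) {
          // Commit didn't land; on restart this batch will be reprocessed.
          System.err.println(s"offset commit failed: $e")
        }
      }
    })
}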