Re: Kafka 0.10 integ offset commit

2016-10-09 Thread Cody Koeninger
That's cool, just be aware that all you're affecting is the time between commits, not overall correctness. Good call on the iterator not draining the queue, I'll fix that. On Sun, Oct 9, 2016 at 12:22 PM, Srikanth wrote: > I'll probably add this behavior. It's a good balance between not having t

Re: Kafka 0.10 integ offset commit

2016-10-09 Thread Srikanth
I'll probably add this behavior. It's a good balance between not having to rely on another external system just for offset management and reducing duplicates. I was more worried about the underlying framework using the consumer in parallel. Will watch out for ConcurrentModificationException. BTW, the commitQue

Re: Kafka 0.10 integ offset commit

2016-10-08 Thread Cody Koeninger
People may be calling commit from listeners or who knows where. Point is it's not thread safe. If it's really important to you, it should be pretty straightforward for you to hack on it to allow it at your own risk. There is a check for concurrent access in the consumer, so worst case scenario y

Re: Kafka 0.10 integ offset commit

2016-10-08 Thread Srikanth
If I call commit in foreachRDD at the end of a batch, is there still a possibility of another thread using the same consumer? Assuming I've not configured the scheduler to run parallel jobs. On Oct 8, 2016 8:39 PM, "Cody Koeninger" wrote: > The underlying kafka consumer isn't thread safe. Calling t

Re: Kafka 0.10 integ offset commit

2016-10-08 Thread Cody Koeninger
The underlying kafka consumer isn't thread safe. Calling the actual commit in compute means it's called in the same thread as the other consumer calls. Using kafka as an offset store only works correctly with idempotent datastore writes anyway, so the question of when the commit happens shou
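
To illustrate the pattern discussed in this thread (commit requests may arrive from any thread, but the actual commit runs on the one thread that owns the consumer), here is a minimal Python sketch. It is not the Spark or Kafka code: `StubConsumer`, `DeferredCommitter`, `commit_async` and `drain_and_commit` are hypothetical names made up for this example, and the ownership check only imitates the idea of a concurrent-access check in the consumer. The drain loop empties the whole queue, which is the behavior the "iterator not draining the queue" remark is about.

```python
import queue
import threading


class StubConsumer:
    """Hypothetical stand-in for a non-thread-safe consumer (not the Kafka API)."""

    def __init__(self):
        # Remember which thread created the consumer; only it may call commit().
        self._owner = threading.get_ident()
        self.committed = {}

    def commit(self, offsets):
        # Imitates a concurrent-access check: reject calls from other threads.
        if threading.get_ident() != self._owner:
            raise RuntimeError("consumer accessed from a non-owner thread")
        self.committed.update(offsets)


class DeferredCommitter:
    """Queues commit requests; the owning thread applies them later."""

    def __init__(self, consumer):
        self._consumer = consumer
        self._pending = queue.Queue()  # thread-safe FIFO

    def commit_async(self, offsets):
        # Safe from any thread: nothing here touches the consumer.
        self._pending.put(dict(offsets))

    def drain_and_commit(self):
        # Must run on the consumer's owning thread.
        # Drain the queue completely, keeping the highest offset per partition.
        latest = {}
        while True:
            try:
                offsets = self._pending.get_nowait()
            except queue.Empty:
                break
            for partition, offset in offsets.items():
                latest[partition] = max(offset, latest.get(partition, offset))
        if latest:
            self._consumer.commit(latest)
        return latest
```

With this shape, delaying or batching the commit only changes how long it takes before offsets are stored. It does not change correctness, which still depends on the downstream writes being idempotent.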