Will this work if we are using a TopicFilter, which can map to multiple topics? Can I create multiple connectors and have each use the same regex for its TopicFilter? Will the connectors then share the set of available topics? Is this safe to do?
Or is it necessary to create mutually non-intersecting regexes for each connector?

It seems I have a similar issue. I have been using auto commit mode, but it doesn't guarantee that every message whose offset has been committed was successfully processed. A change to the connector itself might expose a way to keep auto offset commit while never committing a message until it has been processed, but that would be a change to the ZookeeperConsumerConnector. Essentially, it would be great if, after processing each message, we could mark it as 'processed', and then use that status as the maximum offset to commit each time the auto offset commit background thread wakes up.

Jason

On Thu, Aug 29, 2013 at 11:58 AM, Yu, Libo <libo...@citi.com> wrote:

> Thanks, Neha. That is a great answer.
>
> Regards,
>
> Libo
>
>
> -----Original Message-----
> From: Neha Narkhede [mailto:neha.narkh...@gmail.com]
> Sent: Thursday, August 29, 2013 1:55 PM
> To: users@kafka.apache.org
> Subject: Re: is it possible to commit offsets on a per stream basis?
>
> 1 We can create multiple connectors. From each connector create only one
> stream.
> 2 Use a single thread for a stream. In this case, the connector in each
> thread can commit freely without any dependence on the other threads. Is
> this the right way to go? Will it introduce any dead lock when multiple
> connectors commit at the same time?
>
> This is a better approach as there is no complex locking involved.
>
> Thanks,
> Neha
>
>
> On Thu, Aug 29, 2013 at 10:28 AM, Yu, Libo <libo...@citi.com> wrote:
>
> > Hi team,
> >
> > This is our current use case:
> > Assume there is a topic with multiple partitions.
> > 1 Create a connector first and create multiple streams from the
> > connector for a topic.
> > 2 Create multiple threads, one for each stream. You can assume the
> > thread's job is to save the message into the database.
> > 3 When it is time to commit offsets, all threads have to synchronize
> > on a barrier before committing the offsets. This is to ensure no
> > message loss in case of process crash.
> >
> > As all threads need to synchronize before committing, it is not efficient.
> > This is a workaround:
> >
> > 1 We can create multiple connectors. From each connector create only
> > one stream.
> > 2 Use a single thread for a stream. In this case, the connector in
> > each thread can commit freely without any dependence on the other
> > threads. Is this the right way to go? Will it introduce any dead lock
> > when multiple connectors commit at the same time?
> >
> > It would be great to allow committing on a per stream basis.
> >
> > Regards,
> >
> > Libo
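
For illustration only, here is a rough sketch of the 'mark as processed' idea from the top of this message. ProcessedOffsetTracker, markProcessed and committableOffset are hypothetical names, not part of Kafka or the ZookeeperConsumerConnector, and the sketch assumes one tracker per partition and offsets that increase by one per message. The worker calls markProcessed after it finishes a message; a commit thread would read committableOffset and never commit past the first message that is still outstanding.

```java
import java.util.TreeSet;

/** Hypothetical helper (not a Kafka class): tracks processed offsets for one partition. */
class ProcessedOffsetTracker {
    // Lowest offset that has NOT been processed yet; every offset below it is done.
    private long nextUnprocessed;
    // Offsets finished out of order, waiting for the gap below them to close.
    private final TreeSet<Long> pending = new TreeSet<>();

    ProcessedOffsetTracker(long startOffset) {
        this.nextUnprocessed = startOffset;
    }

    /** Called by the worker once a message has been fully handled. */
    synchronized void markProcessed(long offset) {
        if (offset < nextUnprocessed) {
            return; // already accounted for, e.g. a redelivered message
        }
        pending.add(offset);
        // Advance over every contiguous offset that is now complete.
        while (!pending.isEmpty() && pending.first() == nextUnprocessed) {
            pending.pollFirst();
            nextUnprocessed++;
        }
    }

    /** Safe upper bound for a commit: all offsets below this value were processed. */
    synchronized long committableOffset() {
        return nextUnprocessed;
    }

    public static void main(String[] args) {
        ProcessedOffsetTracker tracker = new ProcessedOffsetTracker(100);
        tracker.markProcessed(101);
        System.out.println(tracker.committableOffset()); // 100: offset 100 is still outstanding
        tracker.markProcessed(100);
        System.out.println(tracker.committableOffset()); // 102: offsets 100 and 101 are both done
    }
}
```

How the commit thread would hand this value to the connector is left open here, since that is exactly the part that would need a change to the connector.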