also 4. some apps may do their own offset bookkeeping

On Tue, Jan 3, 2017 at 5:29 PM, radai <radai.rosenbl...@gmail.com> wrote:

> the issue with tracking committed offsets is whose offsets do you track?
>
> 1. some topics have multiple groups
> 2. some "groups" are really one-offs like developers spinning up console
>    consumer "just to see if there's data"
> 3. there are use cases where you want to deliberately "wipe" data EVEN IF
>    it's still being consumed
>
> #1 is a configuration mess, since there are multiple possible strategies.
> #2 is problematic without a definition of "liveliness" or special handling
> for console consumer? and #3 is flat out impossible with committed-offset
> tracking
>
> On Tue, Jan 3, 2017 at 3:56 PM, Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
>> Dong,
>>
>> Looks like that's an internal link,
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-107%3A+Add+purgeDataBefore%28%29+API+in+AdminClient
>> is the right one.
>>
>> I have a question about one of the rejected alternatives:
>>
>> > Using committed offset instead of an extra API to trigger data purge
>> > operation.
>>
>> The KIP says this would be more complicated to implement. Why is that? I
>> think brokers would have to consume the entire offsets topic, but the data
>> stored in memory doesn't seem to change and applying this when updated
>> offsets are seen seems basically the same. It might also be possible to
>> make it work even with multiple consumer groups if that was desired
>> (although that'd require tracking more data in memory) as a generalization
>> without requiring coordination between the consumer groups. Given the
>> motivation, I'm assuming this was considered unnecessary since this
>> specifically targets intermediate stream processing topics.
>>
>> Another question is why expose this via AdminClient (which isn't public
>> API afaik)? Why not, for example, expose it on the Consumer, which is
>> presumably where you'd want access to it since the functionality depends
>> on the consumer actually having consumed the data?
>>
>> -Ewen
>>
>> On Tue, Jan 3, 2017 at 2:45 PM, Dong Lin <lindon...@gmail.com> wrote:
>>
>> > Hi all,
>> >
>> > We created KIP-107 to propose addition of purgeDataBefore() API in
>> > AdminClient.
>> >
>> > Please find the KIP wiki in the link
>> > https://iwww.corp.linkedin.com/wiki/cf/display/ENGS/Kafka+purgeDataBefore%28%29+API+design+proposal.
>> > We would love to hear your comments and suggestions.
>> >
>> > Thanks,
>> > Dong