Hi All,

Thanks for the inputs. I have updated the KIP based on the discussion so far.

Few of the changes:
1. I have updated the KIP to use ISO-8601 format for defining duration.
2. Add support for the earliest-local option, which will reset to
local-log-start-offset.
2. Changes to ConsumerGroupCommand to support reset to earliest-local
3. Add new config option support to Kafka Streams

I have adopted the changes suggested by Andrew to add a new config to
set the duration (auto.offset.reset.by.duration).
Let me know your thoughts.

Thanks

On Tue, Nov 5, 2024 at 7:30 PM Andrew Schofield
<andrew_schofield_j...@outlook.com> wrote:
>
> Hi Manikumar,
> Thanks for the KIP.
>
> AS1: As Apoorv points out, the values of auto.offset.reset correspond to the 
> enum
> OffsetResetStrategy. I would say that's a useful thing to maintain and I 
> suggest
> adding a fourth value "by.duration" (BY_DURATION), and then another 
> configuration
> for the duration "auto.offset.reset.by.duration". This would correspond with
> kafka-consumer-groups.sh --reset-offsets --by-duration.
>
> I don't think that --reset-offsets --shift-by or --to-datetime really work as 
> well
> for a config-based solution, and are much easier for interactive use with
> kafka-consumer-groups.sh.
>
> AS2: Typically, durations in client configs are specified as milliseconds,
> but this is unlikely to be that useable in this case where I expect the time 
> period
> is more likely in the scale of hours or days. So, I'd like to reject your 
> suggestion
> of a rejected alternative for ISO-8601. How about this?
>
> "auto.offset.reset.by.duration=P1D" is 1 day.
> "auto.offset.reset.by.duration=PT3H" is 3 hours.
> "auto.offset.reset.by.duration=PT1M" is 1 minute.
>
> Not that hard. The validation for this config could parse it into a Duration
> to make sure it is well formed.
>
> AS3: I like Matthias's suggestion about "latest-local". That seems very 
> sensible
> in the context of tiered storage.
>
> Thanks,
> Andrew
> ________________________________________
> From: Matthias J. Sax <mj...@apache.org>
> Sent: 05 November 2024 01:55
> To: dev@kafka.apache.org <dev@kafka.apache.org>
> Subject: Re: [DISCUSS] KIP-1106: Add duration based offset reset option for 
> consumer clients
>
> Thanks for the KIP.
>
> A somewhat orthogonal question I have is, if we should try to merge this
> KIP with the existing KIP-842:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-842%3A+Add+richer+group+offset+reset+mechanisms
>
> While KIP-842 has a different intention, ie, separating the case of "no
> offset during initial startup" vs "offset out of bounds during
> processing", it might still be worth it? My motivation for this question
> is, that `auto.offset.reset` is one of the most central configs we have,
> and thus, doing a single larger change (ie, both KIPs together) might be
> better (less noise) than doing two independent changes? Thoughts? I know
> it might be a little bit of a stretch, but asking cannot hurt :)
>
>
> About the KIP itself, did we consider to add something like
> "latest-local" to say reset to oldest local, but not fetch from tiered
> storage?
>
> For the format of the config, did we consider what we do in
> `bin/kafka-consumer-group.sh`? It has multiple different options like
>
>    --by-duration <duration: format `PnDTnHnMnS`>
>    --shift-by <number-of-records>
>
>    --to-datetime <fixed point in time: format `YYYY-MM-DDTHH:mm:SS.sss`>
>    --to-offset <absolute offset>
>
> As it also uses ISO format, it might be good to also use them (even if I
> can see the appeal and simplicity of what you proposed).
>
>
> I also want to add to (AM1), as there is also `Topology.AutoOffsetReset`
> enum in Kafka Streams. I think we would need to convert this into a
> class. While it could be done is a follow up KIP, too, it seems best to
> do this holistically in a single KIP, because KS is not something on-top
> of Kafka, but it's part of Kafka. I am happy to help with the design and
> even PRs if necessary, but would strongly prefer to do it all in a
> single KIP.
>
>
> Btw: if we add something like "latest-local", it might also be good to
> extend `bin/kafka-consumer-group.sh` accordingly (even if the tool works
> slightly different, as it does commit an offset and there could be some
> race condition between committing "latest-local", tiering, and when the
> consumer is actually started?
>
>
> -Matthias
>
>
> On 11/4/24 7:38 AM, Apoorv Mittal wrote:
> > Hi Manikumar,
> > Thanks for the KIP, this new strategy would be helpful in specifying fetch
> > behaviour.
> >
> > AM1: The config `auto.offset.reset` is currently applied as per the enum
> > class OffsetResetStarategy which is part of kafka-clients javadoc
> > <https://javadoc.io/doc/org.apache.kafka/kafka-clients/latest/index.html>.
> > Are we also proposing to somehow add new definitions in the same class?
> > However as new configurations will be a string representation hence are we
> > moving away from OffsetResetStartegy enum class altogether? Should we
> > include the change in the KIP as OffsetResetStarategy is part of public
> > javadoc?
> >
> > AM2: While I can see the ISO-8601 format is in the rejected alternative,
> > should we not follow some standard of defining duration which has already
> > been adopted in other systems?
> >
> > AM3: We've introduced new config values using the format minus-n-hours,
> > minus-n-days, minus-n-months, and minus-n-years. Should we explicitly
> > define the "minus" prefix, or is it implied?
> >
> > AM4: When supporting duration-based resets, should we also allow users to
> > specify a specific checkpoint time? For example, if a checkpoint occurs 2
> > days, 5 hours and 30 minutes earlier, the current four formats
> > (minus-n-hours, minus-n-days, minus-n-months, minus-n-years) might not be
> > sufficient. Should we consider adding a format to accommodate specific
> > checkpoint times, or is there a reason to limit the supported formats?
> >
> >
> > Regards,
> > Apoorv Mittal
> >
> >
> > On Mon, Nov 4, 2024 at 9:23 AM Manikumar <manikumar.re...@gmail.com> wrote:
> >
> >> Hi everyone,
> >> I would like to start a discussion on KIP-1106:
> >>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1106%3A+Add+duration+based+offset+reset+option+for+consumer+clients
> >>
> >> This KIP proposes to add an additional auto offset reset strategy for
> >> consumer clients.
> >>
> >> Regards,
> >> Manikumar
> >>
> >

Reply via email to