>> Would it be worth returning transactional.id.expiration.ms in the
DescribeProducersResponse?

> That's an interesting thought as well. Are you trying to avoid the need to
specify it through the command line? The tool could also query the value
with DescribeConfigs I suppose.

Basically. I'm not sure how useful this will be in practice, though it
might help when debugging.

Lucas

On Thu, Aug 27, 2020 at 11:00 AM Jason Gustafson <ja...@confluent.io> wrote:

> Hey Lucas,
>
> Thanks for the comments. Responses below:
>
> > Given that it's possible for replica producer states to diverge from each
> other, it would be very useful if DescribeProducers(Request,Response) and
> tooling is able to query all partition replicas for their producers
>
> Yes, it makes sense to me to let DescribeProducers work on both followers
> and leaders. In fact, I'm encouraged that there are use cases for this work
> other than detecting hanging transactions. That was indeed the hope, but I
> didn't have anything specific in mind. I will update the proposal.
>
> > Would it be worth returning transactional.id.expiration.ms in the
> DescribeProducersResponse?
>
> That's an interesting thought as well. Are you trying to avoid the need to
> specify it through the command line? The tool could also query the value
> with DescribeConfigs I suppose.
>
> Thanks,
> Jason
>
> On Thu, Aug 27, 2020 at 10:48 AM Lucas Bradstreet <lu...@confluent.io>
> wrote:
>
> > Hi Jason,
> >
> > This looks like a very useful tool, thanks for writing it up.
> >
> > Given that it's possible for replica producer states to diverge from each
> > other, it would be very useful if DescribeProducers(Request,Response) and
> > tooling is able to query all partition replicas for their producers. One
> > way I can see this being used immediately is in kafka's system tests,
> > especially the ones that inject failures. At the end of the test we can
> > query all replicas and make sure that their states have not diverged. I
> can
> > also see it being useful when debugging production clusters too.
> >
> > Would it be worth returning transactional.id.expiration.ms in the
> > DescribeProducersResponse?
> >
> > Cheers,
> >
> > Lucas
> >
> >
> >
> > On Wed, Aug 26, 2020 at 12:12 PM Ron Dagostino <rndg...@gmail.com>
> wrote:
> >
> > > Yes, that definitely sounds reasonable.  Thanks, Jason!
> > >
> > > Ron
> > >
> > > On Wed, Aug 26, 2020 at 3:03 PM Jason Gustafson <ja...@confluent.io>
> > > wrote:
> > >
> > > > Hey Ron,
> > > >
> > > > We do not typically backport new APIs to older versions. I think we
> can
> > > > however make the --abort command compatible with older versions. It
> > would
> > > > require a user to do some analysis on their own to identify a hanging
> > > > transaction, but then they can use the tool from a new release to
> > > recover.
> > > > For example, users could detect a hanging transaction through the
> > > existing
> > > > "LastStableOffsetLag" metric and then collect the needed information
> > > from a
> > > > dump of the log (or producer snapshot). It's more work, but at least
> > it's
> > > > possible. Does that sound fair?
> > > >
> > > > Thanks,
> > > > Jason
> > > >
> > > > On Wed, Aug 26, 2020 at 11:51 AM Ron Dagostino <rndg...@gmail.com>
> > > wrote:
> > > >
> > > > > Hi Jason.  Thanks for the excellently-written KIP.
> > > > >
> > > > > Will the implementation be backported to prior Kafka versions?  The
> > > > reason
> > > > > I ask is because if it is not backported and similar functionality
> is
> > > not
> > > > > otherwise made available for older versions, then the only recourse
> > > > (aside
> > > > > from deleting and recreating the topic as you pointed out) may be
> to
> > > > > upgrade to 2.7 (or whatever version ends up getting this
> > > functionality).
> > > > > Such an upgrade may not be desirable, especially if the number of
> > > > > intermediate versions is considerable. I understand the mantra of
> > > "never
> > > > > fall too many versions behind" but the reality of it is that it
> isn't
> > > > > always the case.  Even if the version is relatively recent, an
> > upgrade
> > > > may
> > > > > still not be possible for some time, and a quicker resolution may
> be
> > > > > necessary.
> > > > >
> > > > > Ron
> > > > >
> > > > > On Wed, Aug 26, 2020 at 2:33 PM Jason Gustafson <
> ja...@confluent.io>
> > > > > wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > I've added a proposal to handle the problem of hanging
> > transactions:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-664%3A+Provide+tooling+to+detect+and+abort+hanging+transactions
> > > > > > .
> > > > > > In theory, this should never happen. In practice, we have hit one
> > bug
> > > > > where
> > > > > > it was possible and there are few good options today to recover.
> > > Take a
> > > > > > look and let me know what you think.
> > > > > >
> > > > > > Thanks,
> > > > > > Jason
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to