Hi Bob, Thanks for the comment.
> I'm not sure how much value the MaxActiveTransactionDuration metric adds, given that we have the --find-hanging option in the tool. As you mention, instances of these transactions are expected to be rare, and a partition-level metric, which can generate a lot of data, seems very heavyweight for such a rare occurrence. I think "alert on PartitionsWithLateTransactionsCount" followed by "run kafka-transactions --find-hanging on the relevant broker" is a reasonable process for cluster operators to follow. Yeah, I see your point. My original thinking was that this metric might help an operator identify partitions specifically, but that was before I added the --find-hanging command. I guess we can remove it. -Jason On Mon, Aug 31, 2020 at 12:28 PM Jason Gustafson <ja...@confluent.io> wrote: > Hey Guozhang, > > Thanks for the detailed comments. Responses inline: > > > 1. I'd like to clarify how we can make "--abort" work with old brokers, > since without the additional field "Partitions" the tool needs to set the > coordinator epoch correctly instead of "-1"? Arguably that's still doable > but would require different call paths, and it's not clear whether that's > worth doing for old versions. > > That's a good question. What I had in mind was to write the marker using > the last coordinator epoch that was used by the respective ProducerId. I > realized that I left the coordinator epoch out of the `DescribeProducers` > response, so I have updated the KIP to include it. It is possible that > there is no coordinator epoch associated with a given ProducerId (e.g. if > it is the first transaction from that producer), but in this case we can > use 0. > > As for whether this is worth doing, I guess I would be more inclined to > leave it out if users had a reasonable alternative today to address this > problem. > > > 2. Why do we have to enforce "DescribeProducers" to be sent to only > leaders > while ListTransactions can be sent to any brokers? Or is it really > "ListTransactions to be sent to coordinators only"? From the workflow > you've described, based on the results back from DescribeProducers, we > should just immediately send ListTransactions to the > corresponding coordinators based on the collected producer ids, instead of > trying to send to any brokers right? > > I'm going to change `DescribeProducers` so that it can be handled by any > replica of a topic partition. This was suggested by Lucas in order to allow > this API to be used for replica consistency testing. As far as > `ListTransactions`, I was treating this similarly to `ListGroups`. Although > we know that the coordinators are the leaders of the __transaction_state > partitions, this is more of an implementation detail. From an API > perspective, we say that any broker could be a transaction coordinator. > > > 3. One thing I'm a bit hesitant about is that, is `Describe` permission > on > the associated topic sufficient to allow any users to get all producer > information writing to the specific topic-partitions including last > timestamp, txn-start-timestamp etc, which may be considered sensitive? > Should we require "ClusterAction" to only allow operators only? > > That's a fair point. Do you think `Read` permission would be reasonable? > This is all information that could be obtained by reading the topic. > > > 4. From the example it seems "TxnStartOffset" should be included in the > DescribeTransaction response schema? Otherwise the user would not get it in > the following WriteTxnMarker request. > > The `DescribeTransaction` API is sent to the transaction coordinator, > which does not know the start offset of a transaction in each topic > partition. That is why we need `DescribeProducers`. > > > 5. It is a bit easier for readers to highlight the added fields in the > existing WriteTxnMarkerRequest (btw I read is that we are only adding > "Partitions" with the starting offset, right?). Also as for its response it > seems we do not make any schema changes except adding one more potential > error code "INVALID_TXN_STATE" to it, right? If that's the case we can just > state that explicitly. > > I highlighted the new field in the request. For the response, the KIP > states the following: "There are no changes to the response schema, but it > will be bumped. Note that we are also enabling flexible version support." > > > 6. It is not clear to me for the overloaded function that the following > option classes are not specified, what should be the default options? > ... > > I was just trying to stick with existing conventions, but I will add some > more detail here. I think we should probably still include > `AbortTransactionOptions`. The `Options` classes are how users override > timeouts. > > > 7.1 Is "--broker" a required or optional (in that case I presume we would > just query all brokers iteratively) in "--find-hanging"? > > I think it should be required as a reasonable way to limit the scope of > the search. This is meant to be guided by metrics after all. If we do not > limit the scope to a single broker, then the behavior might get worse as > the cluster grows. I will clarify this. > > > 7.2 Seems "list-producers" is not exposed as a standalone feature in the > cmd but only used in the wrapping "--find-hanging", is that intentional? > Personally I feel exposing a "--list-producers" may be useful too: if we > believe the user has the right ACL, it is legitimate to return the producer > information to her anyways. But that is debatable in the meta point 3) > above. > > Yeah, I was planning to add this to support the use case that Lucas > mentioned. There is some awkwardness since it is a little difficult to > convey different sources of information through the same command. I guess > we can do `--list producers` and `--list transactions` and explain in the > documentation. Maybe that is good enough. > > > 7.3 "Describing Transactions": we should also explain how that would be > executed, e.g. at least we should clarify that we would first find the > coordinator based on the transactional.id and hence users do not need to > specify one. > > Sure, makes sense. > > > 7.4. In "Aborting Transactions", should we also specify the "--broker" > node > as a required option? Otherwise we would not know which broker to send to. > > The --topic and --partition arguments are required, so the target is > always the leader of that partition. > > > Thanks, > Jason > > > > On Fri, Aug 28, 2020 at 8:13 AM Robert Barrett <bob.barr...@confluent.io> > wrote: > >> Hi Jason, >> >> Thanks for this KIP, I think this will be a huge operational improvement >> and overall it looks great to me. >> >> I'm not sure how much value the MaxActiveTransactionDuration metric adds, >> given that we have the --find-hanging option in the tool. As you mention, >> instances of these transactions are expected to be rare, and a >> partition-level metric, which can generate a lot of data, seems very >> heavyweight for such a rare occurrence. I think "alert on >> PartitionsWithLateTransactionsCount" followed by "run kafka-transactions >> --find-hanging on the relevant broker" is a reasonable process for cluster >> operators to follow. >> >> Thanks, >> Bob >> >> On Thu, Aug 27, 2020 at 9:23 PM Guozhang Wang <wangg...@gmail.com> wrote: >> >> > Hi Jason, >> > >> > Thanks for the written KIP. I think this is going to be a very useful >> tool >> > for operational improvements since with eos in its current stage, we >> cannot >> > confidently assert that we are bug-free, and even in the future when we >> are >> > confident this is still going to be leveraged by older versioned >> brokers. >> > Regarding the solution, I've also debated myself whether Kafka should >> > "self-heal" automatically when detected in such situations, or should we >> > instead build into ecosystem tooling to let operators do it. And I've >> also >> > convinced myself that the latter should be a better solution to keep >> Kafka >> > software itself simpler. >> > >> > Regarding the KIP itself, I have a few meta comments below: >> > >> > 1. I'd like to clarify how we can make "--abort" work with old brokers, >> > since without the additional field "Partitions" the tool needs to set >> the >> > coordinator epoch correctly instead of "-1"? Arguably that's still >> doable >> > but would require different call paths, and it's not clear whether >> that's >> > worth doing for old versions. >> > >> > 2. Why do we have to enforce "DescribeProducers" to be sent to only >> leaders >> > while ListTransactions can be sent to any brokers? Or is it really >> > "ListTransactions to be sent to coordinators only"? From the workflow >> > you've described, based on the results back from DescribeProducers, we >> > should just immediately send ListTransactions to the >> > corresponding coordinators based on the collected producer ids, instead >> of >> > trying to send to any brokers right? >> > >> > Also I'm a bit concerned if "ListTransactions" could potentially return >> too >> > much data with "StateFilters" set to all states, including completed >> ones. >> > Do we expect users ever want to know transactions that are not pending? >> On >> > the other hand, maybe we can just require users to specify the "pids[]" >> in >> > this request too to further filter those un-interested transactions. >> This >> > also works well with the workflow: we know exactly from >> "DescribeProducers" >> > which pids are we diagnosing right now, so in the follow-up >> > "ListTransactions" we should also only care for those partitions only. >> > >> > 3. One thing I'm a bit hesitant about is that, is `Describe` permission >> on >> > the associated topic sufficient to allow any users to get all producer >> > information writing to the specific topic-partitions including last >> > timestamp, txn-start-timestamp etc, which may be considered sensitive? >> > Should we require "ClusterAction" to only allow operators only? >> > >> > Below are more detailed comments: >> > >> > 4. From the example it seems "TxnStartOffset" should be included in the >> > DescribeTransaction response schema? Otherwise the user would not get >> it in >> > the following WriteTxnMarker request. >> > >> > 5. It is a bit easier for readers to highlight the added fields in the >> > existing WriteTxnMarkerRequest (btw I read is that we are only adding >> > "Partitions" with the starting offset, right?). Also as for its >> response it >> > seems we do not make any schema changes except adding one more potential >> > error code "INVALID_TXN_STATE" to it, right? If that's the case we can >> just >> > state that explicitly. >> > >> > 6. It is not clear to me for the overloaded function that the following >> > option classes are not specified, what should be the default options? >> > >> > * ListTransactionsOptions >> > * DescribeTransactionsOptions >> > * DescribeProducersOptions >> > >> > Also, it seems AbortTransactionOptions would just be empty? If yes do we >> > really need this option class for now? >> > >> > 7. A couple questions from the cmd tool examples: >> > 7.1 Is "--broker" a required or optional (in that case I presume we >> would >> > just query all brokers iteratively) in "--find-hanging"? >> > 7.2 Seems "list-producers" is not exposed as a standalone feature in the >> > cmd but only used in the wrapping "--find-hanging", is that intentional? >> > Personally I feel exposing a "--list-producers" may be useful too: if we >> > believe the user has the right ACL, it is legitimate to return the >> producer >> > information to her anyways. But that is debatable in the meta point 3) >> > above. >> > 7.3 "Describing Transactions": we should also explain how that would be >> > executed, e.g. at least we should clarify that we would first find the >> > coordinator based on the transactional.id and hence users do not need >> to >> > specify one. >> > 7.4. In "Aborting Transactions", should we also specify the "--broker" >> node >> > as a required option? Otherwise we would not know which broker to send >> to. >> > >> > >> > Overall, nice written one, thanks Jason. >> > >> > Guozhang >> > >> > >> > On Thu, Aug 27, 2020 at 11:44 AM Lucas Bradstreet <lu...@confluent.io> >> > wrote: >> > >> > > >> Would it be worth returning transactional.id.expiration.ms in the >> > > DescribeProducersResponse? >> > > >> > > > That's an interesting thought as well. Are you trying to avoid the >> need >> > > to >> > > specify it through the command line? The tool could also query the >> value >> > > with DescribeConfigs I suppose. >> > > >> > > Basically. I'm not sure how useful this will be in practice, though it >> > > might help when debugging. >> > > >> > > Lucas >> > > >> > > On Thu, Aug 27, 2020 at 11:00 AM Jason Gustafson <ja...@confluent.io> >> > > wrote: >> > > >> > > > Hey Lucas, >> > > > >> > > > Thanks for the comments. Responses below: >> > > > >> > > > > Given that it's possible for replica producer states to diverge >> from >> > > each >> > > > other, it would be very useful if >> DescribeProducers(Request,Response) >> > and >> > > > tooling is able to query all partition replicas for their producers >> > > > >> > > > Yes, it makes sense to me to let DescribeProducers work on both >> > followers >> > > > and leaders. In fact, I'm encouraged that there are use cases for >> this >> > > work >> > > > other than detecting hanging transactions. That was indeed the hope, >> > but >> > > I >> > > > didn't have anything specific in mind. I will update the proposal. >> > > > >> > > > > Would it be worth returning transactional.id.expiration.ms in the >> > > > DescribeProducersResponse? >> > > > >> > > > That's an interesting thought as well. Are you trying to avoid the >> need >> > > to >> > > > specify it through the command line? The tool could also query the >> > value >> > > > with DescribeConfigs I suppose. >> > > > >> > > > Thanks, >> > > > Jason >> > > > >> > > > On Thu, Aug 27, 2020 at 10:48 AM Lucas Bradstreet < >> lu...@confluent.io> >> > > > wrote: >> > > > >> > > > > Hi Jason, >> > > > > >> > > > > This looks like a very useful tool, thanks for writing it up. >> > > > > >> > > > > Given that it's possible for replica producer states to diverge >> from >> > > each >> > > > > other, it would be very useful if >> DescribeProducers(Request,Response) >> > > and >> > > > > tooling is able to query all partition replicas for their >> producers. >> > > One >> > > > > way I can see this being used immediately is in kafka's system >> tests, >> > > > > especially the ones that inject failures. At the end of the test >> we >> > can >> > > > > query all replicas and make sure that their states have not >> > diverged. I >> > > > can >> > > > > also see it being useful when debugging production clusters too. >> > > > > >> > > > > Would it be worth returning transactional.id.expiration.ms in the >> > > > > DescribeProducersResponse? >> > > > > >> > > > > Cheers, >> > > > > >> > > > > Lucas >> > > > > >> > > > > >> > > > > >> > > > > On Wed, Aug 26, 2020 at 12:12 PM Ron Dagostino <rndg...@gmail.com >> > >> > > > wrote: >> > > > > >> > > > > > Yes, that definitely sounds reasonable. Thanks, Jason! >> > > > > > >> > > > > > Ron >> > > > > > >> > > > > > On Wed, Aug 26, 2020 at 3:03 PM Jason Gustafson < >> > ja...@confluent.io> >> > > > > > wrote: >> > > > > > >> > > > > > > Hey Ron, >> > > > > > > >> > > > > > > We do not typically backport new APIs to older versions. I >> think >> > we >> > > > can >> > > > > > > however make the --abort command compatible with older >> versions. >> > It >> > > > > would >> > > > > > > require a user to do some analysis on their own to identify a >> > > hanging >> > > > > > > transaction, but then they can use the tool from a new >> release to >> > > > > > recover. >> > > > > > > For example, users could detect a hanging transaction through >> the >> > > > > > existing >> > > > > > > "LastStableOffsetLag" metric and then collect the needed >> > > information >> > > > > > from a >> > > > > > > dump of the log (or producer snapshot). It's more work, but at >> > > least >> > > > > it's >> > > > > > > possible. Does that sound fair? >> > > > > > > >> > > > > > > Thanks, >> > > > > > > Jason >> > > > > > > >> > > > > > > On Wed, Aug 26, 2020 at 11:51 AM Ron Dagostino < >> > rndg...@gmail.com> >> > > > > > wrote: >> > > > > > > >> > > > > > > > Hi Jason. Thanks for the excellently-written KIP. >> > > > > > > > >> > > > > > > > Will the implementation be backported to prior Kafka >> versions? >> > > The >> > > > > > > reason >> > > > > > > > I ask is because if it is not backported and similar >> > > functionality >> > > > is >> > > > > > not >> > > > > > > > otherwise made available for older versions, then the only >> > > recourse >> > > > > > > (aside >> > > > > > > > from deleting and recreating the topic as you pointed out) >> may >> > be >> > > > to >> > > > > > > > upgrade to 2.7 (or whatever version ends up getting this >> > > > > > functionality). >> > > > > > > > Such an upgrade may not be desirable, especially if the >> number >> > of >> > > > > > > > intermediate versions is considerable. I understand the >> mantra >> > of >> > > > > > "never >> > > > > > > > fall too many versions behind" but the reality of it is >> that it >> > > > isn't >> > > > > > > > always the case. Even if the version is relatively recent, >> an >> > > > > upgrade >> > > > > > > may >> > > > > > > > still not be possible for some time, and a quicker >> resolution >> > may >> > > > be >> > > > > > > > necessary. >> > > > > > > > >> > > > > > > > Ron >> > > > > > > > >> > > > > > > > On Wed, Aug 26, 2020 at 2:33 PM Jason Gustafson < >> > > > ja...@confluent.io> >> > > > > > > > wrote: >> > > > > > > > >> > > > > > > > > Hi All, >> > > > > > > > > >> > > > > > > > > I've added a proposal to handle the problem of hanging >> > > > > transactions: >> > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-664%3A+Provide+tooling+to+detect+and+abort+hanging+transactions >> > > > > > > > > . >> > > > > > > > > In theory, this should never happen. In practice, we have >> hit >> > > one >> > > > > bug >> > > > > > > > where >> > > > > > > > > it was possible and there are few good options today to >> > > recover. >> > > > > > Take a >> > > > > > > > > look and let me know what you think. >> > > > > > > > > >> > > > > > > > > Thanks, >> > > > > > > > > Jason >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> > >> > -- >> > -- Guozhang >> > >> >