Thanks for the KIP Randall! Really happy to see this feature coming along finally. Will indeed resolve an unintuitive aspect of Connect's REST API.
I'm in favor of the current approach overall. Just a few comments below that could use a brief discussion: 1) Is there a specific reason why you choose the config topic to persist the restart requests? I know it doesn't really matter which internal topic is used for backwards compatibility, because the worker ignores records that does not understand. But are there any implications upon worker restart? I read you specifically mention that a worker will read restart requests upon its own restart. But is this really desirable? The config topic offers a way to persist configurations of the latest running connectors and upon cluster restart to bring these connectors back up. However, I'm not sure we want to persist recent restart attempts, the same way we persist configurations. Additionally configurations might get additional backups in order to allow for the config topic to be recovered for disaster recovery reasons. To that respect, I'm not sure that the restart requests require the same guarantees. Given the above, would the status topic offer a better alternative in order to broadcast the restart requests across the workers? Haven't looked closely what would be the implications of writing these new records to the status topic, but this topic tends to be more transient. Or is it that you've preferred the config topic for the workers to be aware of the interleaving between restart requests and requests for reconfigurations? If that's the case an explicit call out might be worthwhile in the KIP itself. 2) You mention in the KIP the status STOPPED. But this is not currently part of the task or connector states that get persisted to the status topic, as these are defined in AbstractStatus#State enum. Should we remove any reference to a STOPPED state to avoid confusion? The only new state seems to be the RESTARTING state. And a few minor comments. a) The sentence "We need to keep the existing behavior of the method to just restart the Connector object must be retained for backward compatibility," doesn't read well for me. b) Subsection title: "Use REST API for Worker-to-Worker Communication and" is possibly an unfinished sentence. c) In the paragraph right below, the last sentence "especially when the number of workers " might also be missing something. Best, Konstantine On Thu, Jun 3, 2021 at 8:42 AM Kalpesh Patel <kpa...@confluent.io.invalid> wrote: > Agreed, this would be a great enhancement. > > On Thu, Jun 3, 2021 at 9:26 AM Nikita Kretov <kretov...@gmail.com> wrote: > > > Hello! It's really not intuitive that you need to restart connect and > > tasks separately! I'd like to see this KIP landed in 3.0.0 release! > > > > On 6/2/21 7:40 PM, Randall Hauch wrote: > > > Hello all, > > > > > > Many users struggle with the connector restart REST API only restarting > > the > > > Connector instance rather than everything they associated with a > "named" > > > connector. I've created a KIP that attempts to solve this problem via > new > > > query parameters on an existing REST API method, though by default it > > > remains backward compatible with older behavior: > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-745%3A+Connect+API+to+restart+connector+and+tasks > > > > > > Please take a look and respond here with questions. I'd love to get > this > > > into AK 3.0.0, and the KIP freeze is coming up soon. > > > > > > Best regards, > > > > > > Randall > > > > > >