Chris, all, I've just read KIP-618, and let me congratulate you first of all for this impressive piece of work! Here's a few small suggestions and questions I had while reading:
* TransactionContext: What's the use case for the methods accepting a source record (commitTransaction(SourceRecord record), abortTransaction(SourceRecord record))? * SourceTaskContext: Typo in "when the sink connector is deployed" -> source task * SourceTaskContext: Instead of guarding against NSME, is there a way for a connector to query the KC version and thus derive its capabilities? Going forward, a generic API for querying capabilities could be nice, so a connector can query for capabilities of the runtime in a safe and compatible way. * SourceConnector: exactlyOnceSupport() -> false return value doesn't match * SourceConnector: Would it make sense to merge the two methods perhaps and return one enum of { SUPPORTED, NOT_SUPPORTED, SUPPORTED_WITH_BOUNDARIES }? Or, alternatively return an enum from canDefineTransactionBoundaries(), too; even if it only has two values now, that'd allow for extension in the future And one general question: in Debezium, we have some connectors that produce records "out-of-bands" to a schema history topic via their own custom producer. Is there any way envisionable where such a producer would participate in the transaction managed by the KC runtime environment? Thanks a lot, --Gunnar Am Mo., 24. Mai 2021 um 18:45 Uhr schrieb Chris Egerton <chr...@confluent.io.invalid>: > Hi all, > > Wanted to note here that I've updated the KIP document to include the > changes discussed recently. They're mostly located in the "Public > Interfaces" section. I suspect discussion hasn't concluded yet and there > will probably be a few more changes to come, but wanted to take the > opportunity to provide a snapshot of what the current design looks like. > > Cheers, > > Chris > > On Fri, May 21, 2021 at 4:32 PM Chris Egerton <chr...@confluent.io> wrote: > > > Hi Tom, > > > > Wow, I was way off base! I was thinking that the intent of the fencible > > producer was to employ it by default with 3.0, as opposed to only after > the > > worker-level > > "exactly.once.source.enabled" property was flipped on. You are correct > > that with the case you were actually describing, there would be no > > heightened ACL requirements, and that it would leave room in the future > for > > exactly-once to be disabled on a per-connector basis (as long as all the > > workers in the cluster already had "exactly.once.source.enabled" set to > > "true") with no worries about breaking changes. > > > > I agree that this is something for another KIP; even if we could squeeze > > it in in time for this release, it might be a bit much for new users to > > take in all at once. But I can add it to the doc as "future work" since > > it's a promising idea that could prove valuable to someone who might need > > per-connector granularity in the future. > > > > Thanks for clearing things up; in retrospect your comments make a lot > more > > sense now, and I hope I've sufficiently addressed them by now. > > > > PSA for you and everyone else--I plan on updating the doc next week with > > the new APIs for connector-defined transaction boundaries, > > user-configurable transaction boundaries (i.e., poll vs. interval vs. > > connectors), and preflight checks for exactly-once validation (required > vs. > > requested). > > > > Cheers, > > > > Chris > > > > On Fri, May 21, 2021 at 7:14 AM Tom Bentley <tbent...@redhat.com> wrote: > > > >> Hi Chris, > >> > >> Thanks for continuing to entertain some of these ideas. > >> > >> On Fri, May 14, 2021 at 5:06 PM Chris Egerton > <chr...@confluent.io.invalid > >> > > >> wrote: > >> > >> > [...] > >> > > >> That's true, but we do go from three static ACLs (write/describe on a > >> fixed > >> > transactional ID, and idempotent write on a fixed cluster) to a > dynamic > >> > collection of ACLs. > >> > > >> > >> I'm not quite sure I follow, maybe I've lost track. To be clear, I was > >> suggesting the use of a 'fencing producer' only in clusters with > >> exactly.once.source.enabled=true where I imagined the key difference > >> between the exactly once and fencing cases was how the producer was > >> configured/used (transactional vs this new fencing semantic). I think > the > >> ACL requirements for connector producer principals would therefore be > the > >> same as currently described in the KIP. The same is true for the worker > >> principals (which is the only breaking change you give in the KIP). So I > >> don't think the fencing idea changes the backwards compatibility story > >> that's already in the KIP, just allows a safe per-connector > >> exactly.once=disabled option to be supported (with required as requested > >> as > >> we already discussed). > >> > >> But I'm wondering whether I've overlooked something. > >> > >> Ultimately I think it may behoove us to err on the side of reducing the > >> > breaking changes here for now and saving them for 4.0 (or some later > >> major > >> > release), but would be interested in thoughts from you and others. > >> > > >> > >> Difficult to answer (given I think I might be missing something). > >> If there are breaking changes then I don't disagree. It's difficult to > >> reason about big changes like this without some practical experience. > >> If there are not, then I think we could also implement the whole > >> exactly.once=disabled thing in a later KIP without additional breaking > >> changes (i.e. some time in 3.x), right? > >> > >> > >> > > Gouzhang also has a (possible) use case for a fencing-only producer > ( > >> > https://issues.apache.org/jira/browse/KAFKA-12693), and as he points > >> out > >> > there, you should be able to get these semantics today by calling > >> > initTransactions() and then just using the producer as normal (no > >> > beginTransaction()/abortTransaction()/endTransaction()). > >> > > >> > I tested this locally and was not met with success; transactional > >> producers > >> > do a check right now to ensure that any calls to "KafkaProducer::send" > >> > occur within a transaction (see > >> > > >> > > >> > https://github.com/apache/kafka/blob/29c55fdbbc331bbede17908ccc878953a1b15d87/clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java#L957-L959 > >> > and > >> > > >> > > >> > https://github.com/apache/kafka/blob/29c55fdbbc331bbede17908ccc878953a1b15d87/clients/src/main/java/org/apache/kafka/clients/producer/internals/TransactionManager.java#L450-L451 > >> > ). > >> > Not a blocker, just noting that we'd have to do some legwork to make > >> this > >> > workable with the producer API. > >> > > >> > >> Ah, sorry, I should have actually tried it rather than just taking a > quick > >> look at the code. > >> > >> Rather than remove those safety checks I suppose we'd need a way of > >> distinguishing, in the config, the difference in semantics. E.g. > Something > >> like a fencing.id config, which was mutually exclusive with > >> transactional.id. > >> Likewise perhaps initFencing() alongside initTransactions() in the API. > >> But > >> I think at this point it's something for another KIP. > >> > >> Kind regards, > >> > >> Tom > >> > > >