Hello all, I've updated a KIP again to add a few sentences about the general problem we were facing in the motivation section. Please let me know if there is any further feedback.
On Tue, Apr 3, 2018 at 1:46 PM, Matt Farmer <m...@frmr.me> wrote: > Hey Randall, > > Devil's advocate sparring is always a fun game so I'm down. =) > > Rebalance caused by connectivity failure is the case that caused us to > notice the issue. But the issue > is really more about giving connectors the tools to facilitate rebalances > or a Kafka connect shutdown > cleanly. Perhaps that wasn't clear in the KIP. > > In our case timeouts were *not* uniformly affecting tasks. But every time > a timeout occurred in one task, > all tasks lost whatever forward progress they had made. So, yes, in the > specific case of timeouts a > backoff jitter in the connector is a solution for that particular issue. > However, this KIP *also* gives connectors > enough information to behave intelligently during any kind of rebalance or > shutdown because they're able > to discover that preCommit is being invoked for that specific reason. (As > opposed to being invoked > during normal operation.) > > On Tue, Apr 3, 2018 at 12:36 PM, Randall Hauch <rha...@gmail.com> wrote: > >> Matt, >> >> Let me play devil's advocate. Do we need this additional complexity? The >> motivation section talked about needing to deal with task failures due to >> connectivity problems. Generally speaking, isn't it possible that if one >> task has connectivity problems with either the broker or the external >> system that other tasks would as well? And in the particular case of S3, >> is >> it possible to try and prevent the task shutdown in the first place, >> perhaps by improving how the S3 connector retries? (We did this in the >> Elasticsearch connector using backoff with jitter; see >> https://github.com/confluentinc/kafka-connect-elasticsearch/pull/116 for >> details.) >> >> Best regards, >> >> Randall >> >> On Tue, Apr 3, 2018 at 8:39 AM, Matt Farmer <m...@frmr.me> wrote: >> >> > I have made the requested updates to the KIP! :) >> > >> > On Mon, Apr 2, 2018 at 11:02 AM, Matt Farmer <m...@frmr.me> wrote: >> > >> > > Ugh >> > > >> > > * I can update >> > > >> > > I need more caffeine... >> > > >> > > On Mon, Apr 2, 2018 at 11:01 AM, Matt Farmer <m...@frmr.me> wrote: >> > > >> > >> I'm can update the rejected alternatives section as you describe. >> > >> >> > >> Also, adding a paragraph to the preCommit javadoc also seems like a >> > >> Very Very Good Idea™ so I'll make that update to the KIP as well. >> > >> >> > >> On Mon, Apr 2, 2018 at 10:48 AM, Randall Hauch <rha...@gmail.com> >> > wrote: >> > >> >> > >>> Thanks for the KIP proposal, Matt. >> > >>> >> > >>> You mention in the "Rejected Alternatives" section that you >> considered >> > >>> changing the signature of the `preCommit` method but rejected it >> > because >> > >>> it >> > >>> would break backward compatibility. Strictly speaking, it is >> possible >> > to >> > >>> do >> > >>> this without breaking compatibility by introducing a new `preCommit` >> > >>> method, deprecating the old one, and having the new implementation >> call >> > >>> the >> > >>> old one. Such an approach would be complicated, and I'm not sure it >> > adds >> > >>> any value. In fact, one of the benefits of having a context object >> is >> > >>> that >> > >>> we can add methods like the one you're proposing without causing any >> > >>> compatibility issues. Anyway, it probably is worth updating this >> > rejected >> > >>> alternative to be a bit more precise. >> > >>> >> > >>> Otherwise, I think this is a good approach, though I'd request that >> we >> > >>> update the `preCommit` JavaDoc to add a paragraph that explains this >> > >>> scenario. Thoughts? >> > >>> >> > >>> Randall >> > >>> >> > >>> On Wed, Mar 28, 2018 at 9:29 PM, Ted Yu <yuzhih...@gmail.com> >> wrote: >> > >>> >> > >>> > I looked at WorkerSinkTask and it seems using a boolean for >> KIP-275 >> > >>> should >> > >>> > suffice for now. >> > >>> > >> > >>> > Thanks >> > >>> > >> > >>> > On Wed, Mar 28, 2018 at 7:20 PM, Matt Farmer <m...@frmr.me> >> wrote: >> > >>> > >> > >>> > > Hey Ted, >> > >>> > > >> > >>> > > I have not, actually! >> > >>> > > >> > >>> > > Do you think that we're likely to add multiple states here soon? >> > >>> > > >> > >>> > > My instinct is to keep it simple until there are multiple states >> > >>> that we >> > >>> > > would want >> > >>> > > to consider. I really like the simplicity of just getting a >> boolean >> > >>> and >> > >>> > the >> > >>> > > implementation of WorkerSinkTask already passes around a >> boolean to >> > >>> > > indicate this is happening internally. We're really just >> shuttling >> > >>> that >> > >>> > > value into >> > >>> > > the context at the correct moments. >> > >>> > > >> > >>> > > Once we have multiple states, we could choose to provide a more >> > >>> > > appropriately >> > >>> > > named method (e.g. getState?) and reimplement isClosing by >> checking >> > >>> that >> > >>> > > enum >> > >>> > > without breaking compatibility. >> > >>> > > >> > >>> > > However, if we think multiple states here are imminent for some >> > >>> reason, I >> > >>> > > would >> > >>> > > be pretty easy to convince adding that would be worth the extra >> > >>> > complexity! >> > >>> > > :) >> > >>> > > >> > >>> > > Matt >> > >>> > > >> > >>> > > — >> > >>> > > Matt Farmer | Blog <http://farmdawgnation.com/> | Twitter >> > >>> > > <http://twitter.com/farmdawgnation> >> > >>> > > GPG: CD57 2E26 F60C 0A61 E6D8 FC72 4493 8917 D667 4D07 >> > >>> > > >> > >>> > > On Wed, Mar 28, 2018 at 10:02 PM, Ted Yu <yuzhih...@gmail.com> >> > >>> wrote: >> > >>> > > >> > >>> > > > The enhancement gives SinkTaskContext state information. >> > >>> > > > >> > >>> > > > Have you thought of exposing the state retrieval as an enum >> > >>> (initially >> > >>> > > with >> > >>> > > > two values) ? >> > >>> > > > >> > >>> > > > Thanks >> > >>> > > > >> > >>> > > > On Wed, Mar 28, 2018 at 6:55 PM, Matt Farmer <m...@frmr.me> >> > wrote: >> > >>> > > > >> > >>> > > > > Hello all, >> > >>> > > > > >> > >>> > > > > I am proposing KIP-275 to improve Connect's SinkTaskContext >> so >> > >>> that >> > >>> > > Sinks >> > >>> > > > > can be informed >> > >>> > > > > in their preCommit hook if the hook is being invoked as a >> part >> > >>> of a >> > >>> > > > > rebalance or Connect >> > >>> > > > > shutdown. >> > >>> > > > > >> > >>> > > > > The KIP is here: >> > >>> > > > > https://cwiki.apache.org/confluence/pages/viewpage. >> > >>> > > > action?pageId=75977607 >> > >>> > > > > >> > >>> > > > > Please let me know what feedback y'all have. Thanks! >> > >>> > > > > >> > >>> > > > >> > >>> > > >> > >>> > >> > >>> >> > >> >> > >> >> > > >> > >> > >