Bumping this thread as I'd love to get a bit more feedback on the general approach before proceeding.

On Fri, Feb 10, 2023 at 11:41 AM David Mariassy <david.maria...@gmail.com> wrote:

> Hi Ahmed,
>
> Thanks for taking a look at the KIP, and for your insightful feedback!
>
> I don't disagree with the sentiment that in-band interceptors could be a
> potential source of bugs in a cluster.
>
> Having said that, I don't necessarily think that an in-band interceptor is
> significantly riskier than an out-of-band pre-processor. Let's take the
> example of platform-wide privacy scrubbing. In my opinion it doesn't really
> matter if this feature is deployed as an out-of-band stream processor app
> that consumes from all topics OR if the logic is implemented as an in-band
> interceptor. Either way, a faulty release of the scrubber will result in
> the platform-wide disruption of data flows. Thus, I'd argue that from the
> perspective of the platform's overall health, the level of risk is very
> comparable in both cases. However, in-band interceptors have a couple of
> advantages in my opinion:
> 1. They are significantly cheaper (they don't require duplicating data
>    between raw and sanitized topics, and there are also a lot of potential
>    savings in network costs)
> 2. They are easier to maintain (no need to set up additional
>    infrastructure for out-of-band processing)
> 3. They can provide accurate produce responses to clients (since there is
>    no downstream processing that could render a client's messages invalid
>    async)
>
> Also, in-band interceptors could be as safe or risky as their authors
> design them to be. There's nothing stopping someone from catching all
> exceptions in a `processRecord` method, and letting all unprocessed
> messages go through or sending them to a DLQ. Once the interceptor is
> fixed, those unprocessed messages could get re-ingested into Kafka to
> re-attempt pre-processing.
>
> Thanks and happy Friday,
> David
>
> On Fri, Feb 10, 2023 at 8:23 AM Ahmed Abdalla <eng.a.abda...@gmail.com>
> wrote:
>
>> Hi David,
>>
>> That's a very interesting KIP and I wanted to share my two cents. I
>> believe there's a lot of value and use cases for the ability to
>> intercept, mutate and filter Kafka's messages, however I'm not sure if
>> trying to achieve that via in-band interceptors is the best approach for
>> this.
>>
>> - My mental model around one of Kafka's core values is the brokers'
>>   focus on a single functionality (more or less): highly available and
>>   fault tolerant commit log. I see this in many design decisions such as
>>   off-loading responsibilities to the clients (partitioner, assignor,
>>   consumer groups coordination etc).
>> - And the impact of this KIP on the Kafka server would be adding another
>>   moving part to their "state of the world" that they try to maintain.
>>   What if an interceptor goes bad? What if there's a version mismatch?
>>   etc, a lot of responsibilities that can be managed very efficiently
>>   out-of-band IMHO.
>> - The comparison to NginX and Kubernetes is IMHO comparing apples to
>>   oranges
>>    - NginX
>>       - Doesn't maintain persisted data.
>>       - It's designed as a middleware, it's an interceptor by nature.
>>    - Kubernetes
>>       - CRDs extend the API surface, they don't "augment" existing APIs.
>>         I think admission webhooks
>>         <https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/>
>>         is Kubernetes' solution for providing interceptors.
>>       - The admission webhooks are out-of-band, and in fact they're a
>>         great example of "opening up your cluster for extensibility"
>>         going wrong. Installing a misbehaving webhook can brick the
>>         whole cluster.
>>
>> As I mentioned, I see a value for users being able to intercept and
>> transform Kafka's messages. But I'm worried that having this as a core
>> Kafka feature might not be the best approach for achieving that.
>>
>> Thanks,
>> --
>> Ahmed Abdalla
>> T: @devguyio <https://twitter.com/devguyio>
>>
>> On Thu, Feb 9, 2023 at 8:28 PM David Mariassy <david.maria...@gmail.com>
>> wrote:
>>
>> > Hi everyone,
>> >
>> > I'd like to get a discussion going for KIP-905
>> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-905%3A+Broker+interceptors>,
>> > which proposes the addition of broker interceptors to the stack.
>> >
>> > The KIP contains the motivation, and lists the new public interfaces
>> > that this change would entail. Since my company had its quarterly hack
>> > days this week, I also took the liberty to throw together a first
>> > prototype of the proposed new feature here:
>> > https://github.com/apache/kafka/pull/13224.
>> >
>> > Looking forward to the group's feedback!
>> >
>> > Thanks,
>> > David
>>
>> --
>> *Ahmed Abdalla*
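
To make the fail-open idea from the Feb 10 reply concrete (catch every exception in a `processRecord` method, then either let the record through or send it to a DLQ), here is a minimal Java sketch. Only the method name `processRecord` comes from the thread. The `RecordInterceptor` interface, the `OnFailure` enum, the `FailOpenInterceptor` wrapper and the in-memory dead-letter list are all hypothetical names invented for illustration; they are not the interfaces proposed in KIP-905, and a real DLQ would presumably be a Kafka topic rather than a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class FailOpenDemo {

    /** Hypothetical contract; only the name processRecord comes from the thread. */
    interface RecordInterceptor {
        /** Returns the (possibly mutated) value, or empty to keep the record out of the log. */
        Optional<String> processRecord(String value) throws Exception;
    }

    /** What to do with a record whose interceptor call threw (hypothetical). */
    enum OnFailure { PASS_THROUGH, DEAD_LETTER }

    /** Wraps a delegate so that an exception in it never fails the produce path. */
    static final class FailOpenInterceptor implements RecordInterceptor {
        private final RecordInterceptor delegate;
        private final OnFailure onFailure;
        // Stand-in for a real dead-letter queue (assumption: in-memory only).
        private final List<String> deadLetters = new ArrayList<>();

        FailOpenInterceptor(RecordInterceptor delegate, OnFailure onFailure) {
            this.delegate = delegate;
            this.onFailure = onFailure;
        }

        @Override
        public Optional<String> processRecord(String value) {
            try {
                return delegate.processRecord(value);
            } catch (Exception e) {
                // Catches Exception only; JVM Errors still propagate.
                if (onFailure == OnFailure.PASS_THROUGH) {
                    return Optional.of(value); // let the unprocessed record through
                }
                deadLetters.add(value);        // park it for later re-ingestion
                return Optional.empty();
            }
        }

        /** Snapshot of the records parked so far. */
        List<String> deadLetters() {
            return List.copyOf(deadLetters);
        }
    }

    public static void main(String[] args) {
        // A deliberately faulty "scrubber" that throws on some inputs.
        RecordInterceptor buggyScrubber = value -> {
            if (value.contains("ssn")) {
                throw new IllegalStateException("scrubber bug");
            }
            return Optional.of(value.trim());
        };

        FailOpenInterceptor safe =
                new FailOpenInterceptor(buggyScrubber, OnFailure.DEAD_LETTER);
        System.out.println(safe.processRecord(" hello "));
        System.out.println(safe.processRecord("ssn=123"));
        System.out.println(safe.deadLetters());
    }
}
```

In this sketch the wrapper decides the failure policy, so the wrapped interceptor can stay unaware of it. Re-attempting pre-processing after a fix would amount to feeding the parked records back through `processRecord`.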