Hi Maxim,

Server-side interceptors for messages have been discussed several times, but the KIP never got accepted. People often pushed back on what they saw as the risks and performance problems of allowing user-supplied code in the hot path of accepting records.
Your KIP is a bit different because it just specifies accessing the ProduceRequest. But I think you still have similar problems. What happens if the user-supplied code crashes or takes a long time? If we change the format of ProduceRequest, is it acceptable for all these interceptors to start failing? I suspect that if we ever add server-side interceptors, we'll want to give access to records, not to internal data structures like Requests. This could involve a big performance hit, but perhaps people will be willing to pay it.

best,
Colin

On Thu, Aug 29, 2024, at 09:36, Maxim Fortun wrote:
> Oooh! All great questions! Answering inline.
>
>> On Aug 29, 2024, at 12:00 PM, Kirk True <k...@kirktrue.pro> wrote:
>>
>> Hi Maxim,
>>
>> Thanks for the KIP!
>>
>> This is not my area, but I have some questions/comments:
>>
>> 1. CMIIW, but I believe that specifying this custom parser class should be
>> done via configuration, not system properties or environment variables,
>> right?
> There were several considerations involved here:
> 1. ProduceRequest.parse() is currently a static method, and as such has
> access only to static members of the class.
> 2. I found no precedent in Kafka for reading configs in static
> initializers, but I did find precedents for props and envs.
> 3. Static members are initialized at class load time and do not yet have
> access to an instance of the configuration class.
> 4. Kafka can be run in multiple ways and in multiple environments that
> pass configuration differently than plain files; Docker and K8s
> environments, for example, pass configs via env vars.
>
> Ultimately, the above was my reasoning for doing it via props and envs.
> If there is a better suggestion for this scenario, I'll update the code;
> I just do not currently see how else it could be done.
>
>> 2. What happens if the custom parser can’t be loaded? Say the class isn’t
>> found on the classpath, etc.?
> Custom parsers are rather advanced and require quite a bit of knowledge
> of Kafka internals, so only advanced users would be using this feature.
> If such a user specifies a custom parser class that is unavailable or
> broken, Kafka will fail to process produce requests and will report the
> errors in the log file. The user can then troubleshoot and adjust their
> configuration as appropriate.
>
>> 3. What is the behavior around exception handling? Right now the parser
>> handles malformed data, but what if the custom parser wants to enforce a
>> policy requirement?
> The whole purpose of allowing a custom parser is to let users build one
> for their own use cases. Each use case is unique, and we can't enforce a
> particular policy at the broker level. If the custom parser does not
> handle a request but throws an exception instead, the exception will be
> handled the way it is handled today: the error will be reported and the
> request will be dropped.
>
>> 4. Given the above, I do believe there are some additional cases that
>> deserve mentioning in the test plan section.
> I presume you mean tests for a custom class specified via props,
> specified via envs, being absent, throwing exceptions at init, and
> throwing exceptions at parse time?
> I'll look into creating these.
> Thanks,
> Max
>
>>
>> Thanks,
>> Kirk
>>
>>> On Aug 29, 2024, at 7:25 AM, Maxim Fortun <m...@maxf.net> wrote:
>>>
>>> Hi all,
>>> I would like to introduce a minor code change to allow custom produce
>>> request parsers.
>>>
>>> KIP: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=318606528
>>> JIRA: https://issues.apache.org/jira/browse/KAFKA-17348
>>> PR: https://github.com/apache/kafka/pull/16812
>>>
>>> There are many potential benefits to this feature. A custom produce
>>> request parser would make it possible to intercept all incoming messages
>>> before they get into the broker and to apply broker-wide logic to them.
>>> This could be a trace, a filter, a transform (such as lineage),
>>> enforcement of required headers across all messages, compression,
>>> signing, encryption, or any other manipulation applied before a message
>>> is accepted.
>>>
>>> Please take a look.
>>> Any and all feedback is greatly appreciated.
>>> Thanks,
>>> Max
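As an illustration of the broker-wide hooks described in the announcement above, here is a minimal sketch of what such a parser might look like, assuming it wraps the existing static ProduceRequest.parse() entry point mentioned in this thread. The class and its package are hypothetical and are not the interface proposed in the KIP or PR.

package example;

import java.nio.ByteBuffer;

import org.apache.kafka.common.requests.ProduceRequest;

/**
 * Hypothetical custom parser. In practice it would implement whatever hook
 * interface the KIP settles on; only ProduceRequest.parse(...) below is an
 * existing Kafka API.
 */
public class TracingProduceRequestParser {

    public ProduceRequest parse(ByteBuffer buffer, short version) {
        long start = System.nanoTime();
        // Delegate to the standard parser so wire-format handling is unchanged.
        ProduceRequest request = ProduceRequest.parse(buffer, version);
        long elapsedMicros = (System.nanoTime() - start) / 1_000L;

        // Broker-wide logic would go here: tracing, filtering, enforcing
        // required headers, lineage transforms, and so on. Throwing an
        // exception here is handled like any other parse failure, as Maxim
        // notes above: the error is reported and the request is dropped.
        System.out.printf("Parsed ProduceRequest v%d in %d us%n", version, elapsedMicros);
        return request;
    }
}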
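And a minimal sketch of the class-load-time lookup Maxim describes in his answer to question 1, assuming a hypothetical system property (produce.request.parser.class), a hypothetical environment variable (PRODUCE_REQUEST_PARSER_CLASS), and a hypothetical ProduceRequestParser hook interface; the actual names are defined by the KIP and PR linked above.

package example;

import java.nio.ByteBuffer;

import org.apache.kafka.common.requests.ProduceRequest;

public final class ParserLoader {

    /** Hypothetical hook: turns a raw buffer into a parsed ProduceRequest. */
    public interface ProduceRequestParser {
        ProduceRequest parse(ByteBuffer buffer, short version);
    }

    // Resolved once, at class load time. A static initializer cannot see a
    // broker config instance, which is why this sketch falls back to a
    // system property and an environment variable.
    private static final ProduceRequestParser PARSER = loadParser();

    public static ProduceRequestParser parser() {
        return PARSER;
    }

    private static ProduceRequestParser loadParser() {
        String className = System.getProperty("produce.request.parser.class");
        if (className == null || className.isEmpty()) {
            className = System.getenv("PRODUCE_REQUEST_PARSER_CLASS");
        }
        if (className == null || className.isEmpty()) {
            return null; // no custom parser configured; use the default parsing path
        }
        try {
            return (ProduceRequestParser) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            // A missing or broken parser class surfaces in the broker log as
            // soon as this class loads, matching the behavior described in
            // the answer to question 2 above.
            throw new RuntimeException("Failed to load custom produce request parser: " + className, e);
        }
    }
}

Under these assumptions, a deployment would opt in with something like -Dproduce.request.parser.class=example.TracingProduceRequestParser, or by exporting the environment variable in a Docker/Kubernetes manifest (again, hypothetical names, and assuming the class implements whatever hook the KIP defines).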