Hi Maxim,

The issue is not the number of lines of code, but the need to support this indefinitely in the future. For example, if we wanted to move to a new request type that is different from ProduceRequest, or create a new version of ProduceRequest, we wouldn't be able to do that while maintaining the compatibility promised in this KIP. So this would be a very significant constraint on development, which we probably wouldn't adopt unless there was no other way forward.
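To make the constraint concrete, here is a minimal sketch (all names here are hypothetical stand-ins, not identifiers from the KIP, the PR, or the Kafka codebase): a plug-in point typed against ProduceRequest couples every third-party parser to that exact class, so replacing or re-versioning the request type becomes a breaking change for all of them.

```java
import java.nio.ByteBuffer;

class CompatSketch {
    // Stub standing in for the real org.apache.kafka.common.requests.ProduceRequest.
    static class ProduceRequest { }

    // Hypothetical plug-in point of the kind the KIP would expose. Note that
    // the signature hard-codes both the wire input (ByteBuffer + version)
    // and the output type (ProduceRequest).
    interface ProduceRequestParser {
        ProduceRequest parse(ByteBuffer buffer, short version);
    }

    // A third-party implementation compiled against today's broker.
    static class CustomParser implements ProduceRequestParser {
        @Override
        public ProduceRequest parse(ByteBuffer buffer, short version) {
            return new ProduceRequest(); // pretend we decoded the buffer
        }
    }

    public static void main(String[] args) {
        ProduceRequestParser parser = new CustomParser();
        // Works today. But if a future broker replaces ProduceRequest with a
        // different request type, or changes parse()'s inputs, CustomParser
        // (and every externally maintained implementation like it) stops
        // linking -- which is the compatibility promise that becomes hard
        // to keep once the hook is public API.
        System.out.println(parser.parse(ByteBuffer.allocate(0), (short) 11) != null);
    }
}
```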
best,
Colin

On Thu, Aug 29, 2024, at 11:27, Maxim Fortun wrote:
> Hi Colin,
> Thank you for the background. As a new user of Kafka I appreciate being
> caught up. I tried looking for this in PRs and KIPs but did not find
> any references.
> Since my change is tiny, doesn't really change the way the broker
> works, and only adds an ability to customize the broker without
> actually changing how the broker does things, I am hoping I'll be able
> to persuade the community to adopt it.
> This feature is actually critical for a rather large-scale,
> high-throughput, high-volume project that the company I work for is
> currently POCing. We are maintaining a fork with this feature, but I'd
> prefer not to maintain a fork, and I am also hoping the community may
> benefit from this.
> Thanks,
> Max
>
>> On Aug 29, 2024, at 2:02 PM, Colin McCabe <cmcc...@apache.org> wrote:
>>
>> Hi Maxim,
>>
>> Server-side interceptors for messages have been discussed several times,
>> but the KIP never got accepted. People often pushed back on what they saw
>> as the risks and performance problems of allowing user-supplied code in
>> the hot path of accepting records.
>>
>> Your KIP is a bit different because it just specifies accessing the
>> ProduceRequest. But I think you still have similar problems. What happens
>> if the user-supplied code crashes or takes a long time? If we change the
>> format of ProduceRequest, is it acceptable for all these interceptors to
>> start failing?
>>
>> I suspect that if we ever add server-side interceptors, we'll want to give
>> access to records, not to internal data structures like Requests. This
>> could involve a big performance hit, but perhaps people will be willing
>> to pay it.
>>
>> best,
>> Colin
>>
>>
>> On Thu, Aug 29, 2024, at 09:36, Maxim Fortun wrote:
>>> Oooh! All great questions! Answering inline.
>>>
>>>> On Aug 29, 2024, at 12:00 PM, Kirk True <k...@kirktrue.pro> wrote:
>>>>
>>>> Hi Maxim,
>>>>
>>>> Thanks for the KIP!
>>>>
>>>> This is not my area, but I have some questions/comments:
>>>>
>>>> 1. CMIIW, but I believe that specifying this custom parser class should
>>>> be done via configuration, not system properties or environment
>>>> variables, right?
>>> There are several considerations involved here:
>>> 1. Currently ProduceRequest.parse() is a static method, and as such has
>>> access only to static members of the class.
>>> 2. I found no precedent in Kafka for using configs in static
>>> initializers. I did find precedents for props and envs.
>>> 3. Static members are initialized at class load time and do not yet
>>> have access to instances of the configuration class.
>>> 4. Kafka can be run in multiple ways and in multiple environments that
>>> pass configuration differently from just files. Docker and K8s envs
>>> pass configs with env vars.
>>>
>>> Ultimately, the above was my reasoning for doing it via props and envs.
>>> If there is a better suggestion in this scenario, I'll adapt the code. I
>>> just do not currently see how this is possible.
>>>
>>>> 2. What happens if the custom parser can't be loaded? Say the class
>>>> isn't found on the classpath, etc.?
>>> Custom parsers are rather advanced and require quite a bit of knowledge
>>> of Kafka internals. As such, only advanced users would be using this
>>> feature.
>>> If such an advanced user specifies a custom parser class that is either
>>> unavailable or has problems, Kafka will fail to process requests and
>>> report the errors in the log file.
>>> The user will then be able to troubleshoot and adjust their
>>> configuration as appropriate.
>>>
>>>> 3. What is the behavior around exception handling? Right now the parser
>>>> handles malformed data, but what if the custom parser wants to enforce
>>>> a policy requirement?
>>> The whole purpose of allowing a custom parser is to let users build one
>>> for their own use cases.
>>> Each use-case is unique, and we can't enforce it at the broker level.
>>> If the custom parser does not handle a condition but throws an
>>> exception, the exception will be handled the way it is handled
>>> currently: the error will be reported and the request will be dropped.
>>>
>>>> 4. Given the above, I do believe there are some additional cases that
>>>> deserve mentioning in the test plan section.
>>> I presume you mean to test for a custom class specified via props, via
>>> envs, being absent, throwing exceptions at init, and throwing exceptions
>>> at parse time?
>>> I'll look into creating these.
>>> Thanks,
>>> Max
>>>
>>>>
>>>> Thanks,
>>>> Kirk
>>>>
>>>>> On Aug 29, 2024, at 7:25 AM, Maxim Fortun <m...@maxf.net> wrote:
>>>>>
>>>>> Hi all,
>>>>> I would like to introduce a minor code change to allow custom produce
>>>>> request parsers.
>>>>>
>>>>> KIP:
>>>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=318606528
>>>>> JIRA: https://issues.apache.org/jira/browse/KAFKA-17348
>>>>> PR: https://github.com/apache/kafka/pull/16812
>>>>>
>>>>> There are many potential benefits to this feature. A custom produce
>>>>> request parser would make it possible to intercept all incoming
>>>>> messages before they get into the broker and apply broker-wide logic
>>>>> to them. This could be a trace, a filter, a transform (such as
>>>>> lineage), forcing required headers across all messages, compression,
>>>>> signing, encryption, or any other message manipulation before a
>>>>> message gets into the broker.
>>>>>
>>>>> Please take a look.
>>>>> Any and all feedback is greatly appreciated.
>>>>> Thanks,
>>>>> Max
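The props-then-envs loading scheme and the fail-and-log behavior Maxim describes in his answers above can be sketched roughly as follows. This is a hedged illustration only: the property name, env var name, interface, and class names are assumptions for the sketch, not the actual identifiers used in the KIP or PR.

```java
import java.nio.ByteBuffer;

class ParserLoader {

    /** Minimal stand-in for the parsing hook the KIP would expose. */
    interface RequestParser {
        Object parse(ByteBuffer buffer, short version);
    }

    /** Stock behavior used when no custom class is configured. */
    static class DefaultParser implements RequestParser {
        @Override
        public Object parse(ByteBuffer buffer, short version) {
            return "default-parse"; // placeholder for the built-in ProduceRequest.parse()
        }
    }

    // Hypothetical names; the real KIP/PR may use different ones.
    static final String PROP = "kafka.produce.request.parser.class";
    static final String ENV = "KAFKA_PRODUCE_REQUEST_PARSER_CLASS";

    // Static initialization runs once at class load time, before any broker
    // configuration objects exist -- which is why the thread talks about
    // system properties and env vars rather than KafkaConfig.
    static final RequestParser PARSER = load();

    static RequestParser load() {
        // System property wins; fall back to the environment, then to default.
        String className = System.getProperty(PROP, System.getenv(ENV));
        if (className == null) {
            return new DefaultParser();
        }
        try {
            return (RequestParser) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            // Per the thread: a bad class means request processing fails and
            // the error is surfaced in the logs, not a silent fallback.
            throw new IllegalStateException("Cannot load parser " + className, e);
        }
    }

    public static void main(String[] args) {
        // Prints "default-parse" when no custom class is configured.
        System.out.println(PARSER.parse(ByteBuffer.allocate(0), (short) 11));
    }
}
```

Under this sketch, an exception thrown from a custom parser's parse() propagates exactly like a parse failure of the stock parser, matching the answer to question 3 above.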