I totally understand the burden of maintaining features going forward. The backward-compatibility-breaking behavior you describe is a normal part of software evolution, much like deprecation. The possibility of having to break backward compatibility at some point in the future is hardly a reason not to add features today. By that reasoning we'd never add any features :) Any of them may change in the future :)
> On Aug 29, 2024, at 2:36 PM, Colin McCabe <cmcc...@apache.org> wrote:
>
> Hi Maxim,
>
> The issue is not the number of lines of code, but the need to support this indefinitely in the future. For example, if we want to move to a new request type that is different from ProduceRequest, or create a new version of ProduceRequest, we wouldn't be able to do that in the future while maintaining the compatibility promised in this KIP. So this would be a very significant constraint on development, which we probably wouldn't adopt unless there was no other way forward.
>
> best,
> Colin
>
> On Thu, Aug 29, 2024, at 11:27, Maxim Fortun wrote:
>> Hi Colin,
>> Thank you for the background. As a new user of Kafka, I appreciate being caught up. I looked for this in PRs and KIPs but did not find any references.
>> Since my change is tiny, doesn't really change the way the broker works, and only adds the ability to customize the broker without changing how the broker does things, I am hoping I'll be able to persuade the community to adopt it.
>> This feature is actually critical for a rather large-scale, high-throughput, high-volume project that the company I work for is currently POCing. We are maintaining a fork with this feature, but I'd prefer not to maintain a fork, and I'm also hoping the community may benefit from it.
>> Thanks,
>> Max
>>
>>> On Aug 29, 2024, at 2:02 PM, Colin McCabe <cmcc...@apache.org> wrote:
>>>
>>> Hi Maxim,
>>>
>>> Server-side interceptors for messages have been discussed several times, but the KIP never got accepted. People often pushed back on what they saw as the risks and performance problems of allowing user-supplied code in the hot path of accepting records.
>>>
>>> Your KIP is a bit different because it just specifies accessing the ProduceRequest. But I think you still have similar problems. What happens if the user-supplied code crashes or takes a long time? If we change the format of ProduceRequest, is it acceptable for all these interceptors to start failing?
>>>
>>> I suspect that if we ever add server-side interceptors, we'll want to give access to records, not to internal data structures like Requests. This could involve a big performance hit, but perhaps people will be willing to pay it.
>>>
>>> best,
>>> Colin
>>>
>>> On Thu, Aug 29, 2024, at 09:36, Maxim Fortun wrote:
>>>> Oooh! All great questions! Answering inline.
>>>>
>>>>> On Aug 29, 2024, at 12:00 PM, Kirk True <k...@kirktrue.pro> wrote:
>>>>>
>>>>> Hi Maxim,
>>>>>
>>>>> Thanks for the KIP!
>>>>>
>>>>> This is not my area, but I have some questions/comments:
>>>>>
>>>>> 1. CMIIW, but I believe that specifying this custom parser class should be done via configuration, not system properties or environment variables, right?
>>>> Several considerations were involved here:
>>>> 1. Currently, ProduceRequest.parse() is a static method, and as such it has access only to static members of the class.
>>>> 2. I found no precedent in Kafka for using configs in static initializers. I did find precedents for system properties and environment variables.
>>>> 3. Static members are initialized at class-load time and do not yet have access to instances of the configuration class.
>>>> 4. Kafka can be run in multiple ways and in multiple environments that pass configuration differently than just files. Docker and Kubernetes environments pass configs via env vars.
>>>> Ultimately, the above was my reasoning for doing it via props and envs. If there is a better suggestion for this scenario, I'll adapt the code; I just do not currently see how this is possible.
>>>>
>>>>> 2. What happens if the custom parser can't be loaded? Say the class isn't found on the classpath, etc.?
>>>> Custom parsers are rather advanced and require quite a bit of knowledge of Kafka internals, so only advanced users would be using this feature. If such a user specifies a custom parser class that is unavailable or has problems, Kafka will fail to process requests and report the errors in the log file. The user will then be able to troubleshoot and adjust their configuration as appropriate.
>>>>
>>>>> 3. What is the behavior around exception handling? Right now the parser handles malformed data, but what if the custom parser wants to enforce a policy requirement?
>>>> The whole purpose of allowing a custom parser is to let users build one for their own use cases. Each use case is unique, and we can't enforce it at the broker level. If the custom parser doesn't handle an error but throws an exception, the exception will be handled the way it is handled currently: the error will be reported and the request will be dropped.
>>>>
>>>>> 4. Given the above, I do believe there are some additional cases that deserve mentioning in the test plan section.
>>>> I presume you mean tests for a custom class specified via props, via envs, being absent, throwing exceptions at init, and throwing exceptions at parse time? I'll look into creating these.
>>>> Thanks,
>>>> Max
>>>>
>>>>>> On Aug 29, 2024, at 7:25 AM, Maxim Fortun <m...@maxf.net> wrote:
>>>>>>
>>>>>> Hi all,
>>>>>> I would like to introduce a minor code change to allow custom produce request parsers.
>>>>>>
>>>>>> KIP: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=318606528
>>>>>> JIRA: https://issues.apache.org/jira/browse/KAFKA-17348
>>>>>> PR: https://github.com/apache/kafka/pull/16812
>>>>>>
>>>>>> There are many potential benefits to this feature. A custom produce request parser would make it possible to intercept all incoming messages before they get into the broker and apply broker-wide logic to them. This could be a trace, a filter, a transform (such as lineage), enforcing required headers across all messages, compression, signing, encryption, or any other message manipulation before a message gets into the broker.
>>>>>>
>>>>>> Please take a look. Any and all feedback is greatly appreciated.
>>>>>> Thanks, Max
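For readers following along, the mechanism debated in the thread (resolving a parser class name from a system property or environment variable in a static initializer, and letting a policy-enforcing parser signal rejection by throwing) can be sketched roughly as below. This is a hypothetical illustration only: the `ProduceRequestParser` interface, the class names, and the `produce.request.parser.class` / `PRODUCE_REQUEST_PARSER_CLASS` names are invented for the example and are not the actual KIP API.

```java
// Hypothetical sketch of the pluggable-parser idea discussed in the thread.
// All names here are illustrative, not the real Kafka/KIP interfaces.
public class ParserLoaderSketch {

    // A stand-in SPI: turn raw request bytes into a (stub) parsed value.
    public interface ProduceRequestParser {
        String parse(byte[] raw);
    }

    // Default behavior, used when no custom parser is configured.
    public static class DefaultParser implements ProduceRequestParser {
        public String parse(byte[] raw) {
            return new String(raw);
        }
    }

    // Example custom parser enforcing a policy (question 3 in the thread):
    // it rejects messages lacking a required "hdr:" prefix by throwing,
    // which the broker would report and then drop the request.
    public static class HeaderEnforcingParser implements ProduceRequestParser {
        public String parse(byte[] raw) {
            String s = new String(raw);
            if (!s.startsWith("hdr:")) {
                throw new IllegalArgumentException("missing required header");
            }
            return s.substring(4);
        }
    }

    // Static-initializer pattern from the thread: since the hook point is a
    // static method, the class name is resolved at class-load time from a
    // system property first, then an environment variable, then a default.
    static final ProduceRequestParser PARSER = loadParser();

    static ProduceRequestParser loadParser() {
        String cls = System.getProperty("produce.request.parser.class",
                System.getenv().getOrDefault("PRODUCE_REQUEST_PARSER_CLASS", ""));
        if (cls.isEmpty()) {
            return new DefaultParser();
        }
        try {
            return (ProduceRequestParser) Class.forName(cls)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Mirrors the failure mode described for question 2: a missing or
            // broken custom parser surfaces as an error rather than silently
            // falling back.
            throw new IllegalStateException("cannot load parser " + cls, e);
        }
    }

    public static void main(String[] args) {
        // With no property/env set, the default parser is used.
        System.out.println(PARSER.parse("hello".getBytes()));
        // A policy-enforcing parser rejects non-conforming input by throwing.
        ProduceRequestParser strict = new HeaderEnforcingParser();
        try {
            strict.parse("no-header".getBytes());
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The sketch is only meant to make the trade-off concrete: property/env resolution is the one configuration channel available before any broker config object exists, which is the constraint Maxim describes in his answer to question 1.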