Hi Maxim,

The issue is not the number of lines of code, but the need to support this indefinitely in the future. For example, if we wanted to move to a new request type that is different from ProduceRequest, or create a new version of ProduceRequest, we wouldn't be able to do that while maintaining the compatibility promised in this KIP. So this would be a very significant constraint on development, which we probably wouldn't adopt unless there was no other way forward.
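To make the constraint concrete, here is a minimal sketch (all names here are hypothetical stand-ins, not identifiers from the KIP, the PR, or the Kafka codebase): a plug-in point typed against ProduceRequest couples every third-party parser to that exact class, so replacing or re-versioning the request type becomes a breaking change for all of them.

```java
import java.nio.ByteBuffer;

class CompatSketch {
    // Stub standing in for the real org.apache.kafka.common.requests.ProduceRequest.
    static class ProduceRequest { }

    // Hypothetical plug-in point of the kind the KIP would expose. Note that
    // the signature hard-codes both the wire input (ByteBuffer + version)
    // and the output type (ProduceRequest).
    interface ProduceRequestParser {
        ProduceRequest parse(ByteBuffer buffer, short version);
    }

    // A third-party implementation compiled against today's broker.
    static class CustomParser implements ProduceRequestParser {
        @Override
        public ProduceRequest parse(ByteBuffer buffer, short version) {
            return new ProduceRequest(); // pretend we decoded the buffer
        }
    }

    public static void main(String[] args) {
        ProduceRequestParser parser = new CustomParser();
        // Works today. But if a future broker replaces ProduceRequest with a
        // different request type, or changes parse()'s inputs, CustomParser
        // (and every externally maintained implementation like it) stops
        // linking -- which is the compatibility promise that becomes hard
        // to keep once the hook is public API.
        System.out.println(parser.parse(ByteBuffer.allocate(0), (short) 11) != null);
    }
}
```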
best,
Colin

On Thu, Aug 29, 2024, at 11:27, Maxim Fortun wrote:
> Hi Colin,
> Thank you for the background. As a new user of Kafka I appreciate being
> caught up. I tried looking for this in PRs and KIPs but did not find
> any references.
> Since my change is tiny, doesn't really change the way the broker
> works, and only adds an ability to customize the broker without
> actually changing how the broker does things, I am hoping I'll be able
> to persuade the community to adopt it.
> This feature is actually critical for a rather large-scale,
> high-throughput, high-volume project that the company I work for is
> currently POCing. We are maintaining a fork with this feature, but I'd
> prefer not to maintain a fork, and I am also hoping the community may
> benefit from this.
> Thanks,
> Max
>
>> On Aug 29, 2024, at 2:02 PM, Colin McCabe <cmcc...@apache.org> wrote:
>>
>> Hi Maxim,
>>
>> Server-side interceptors for messages have been discussed several times,
>> but the KIP never got accepted. People often pushed back on what they saw
>> as the risks and performance problems of allowing user-supplied code in
>> the hot path of accepting records.
>>
>> Your KIP is a bit different because it just specifies accessing the
>> ProduceRequest. But I think you still have similar problems. What happens
>> if the user-supplied code crashes or takes a long time? If we change the
>> format of ProduceRequest, is it acceptable for all these interceptors to
>> start failing?
>>
>> I suspect that if we ever add server-side interceptors, we'll want to give
>> access to records, not to internal data structures like Requests. This
>> could involve a big performance hit, but perhaps people will be willing
>> to pay it.
>>
>> best,
>> Colin
>>
>>
>> On Thu, Aug 29, 2024, at 09:36, Maxim Fortun wrote:
>>> Oooh! All great questions! Answering inline.
>>>
>>>> On Aug 29, 2024, at 12:00 PM, Kirk True <k...@kirktrue.pro> wrote:
>>>>
>>>> Hi Maxim,
>>>>
>>>> Thanks for the KIP!
>>>>
>>>> This is not my area, but I have some questions/comments:
>>>>
>>>> 1. CMIIW, but I believe that specifying this custom parser class should
>>>> be done via configuration, not system properties or environment
>>>> variables, right?
>>> There are several considerations involved here:
>>> 1. Currently ProduceRequest.parse() is a static method, and as such has
>>> access only to static members of the class.
>>> 2. I found no precedent in Kafka for using configs in static
>>> initializers. I did find precedents for props and envs.
>>> 3. Static members are initialized at class load time and do not yet
>>> have access to instances of the configuration class.
>>> 4. Kafka can be run in multiple ways and in multiple environments that
>>> pass configuration differently from just files. Docker and K8s envs
>>> pass configs with env vars.
>>>
>>> Ultimately, the above was my reasoning for doing it via props and envs.
>>> If there is a better suggestion in this scenario, I'll adapt the code. I
>>> just do not currently see how this is possible.
>>>
>>>> 2. What happens if the custom parser can't be loaded? Say the class
>>>> isn't found on the classpath, etc.?
>>> Custom parsers are rather advanced and require quite a bit of knowledge
>>> of Kafka internals. As such, only advanced users would be using this
>>> feature.
>>> If such an advanced user specifies a custom parser class that is either
>>> unavailable or has problems, Kafka will fail to process requests and
>>> report the errors in the log file.
>>> The user will then be able to troubleshoot and adjust their
>>> configuration as appropriate.
>>>
>>>> 3. What is the behavior around exception handling? Right now the parser
>>>> handles malformed data, but what if the custom parser wants to enforce
>>>> a policy requirement?
>>> The whole purpose of allowing a custom parser is to let users build one
>>> for their own use cases.
>>> Each use-case is unique, and we can't enforce it at the broker level.
>>> If the custom parser does not handle a condition but throws an
>>> exception, the exception will be handled the way it is handled
>>> currently: the error will be reported and the request will be dropped.
>>>
>>>> 4. Given the above, I do believe there are some additional cases that
>>>> deserve mentioning in the test plan section.
>>> I presume you mean to test for a custom class specified via props, via
>>> envs, being absent, throwing exceptions at init, and throwing exceptions
>>> at parse time?
>>> I'll look into creating these.
>>> Thanks,
>>> Max
>>>
>>>>
>>>> Thanks,
>>>> Kirk
>>>>
>>>>> On Aug 29, 2024, at 7:25 AM, Maxim Fortun <m...@maxf.net> wrote:
>>>>>
>>>>> Hi all,
>>>>> I would like to introduce a minor code change to allow custom produce
>>>>> request parsers.
>>>>>
>>>>> KIP:
>>>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=318606528
>>>>> JIRA: https://issues.apache.org/jira/browse/KAFKA-17348
>>>>> PR: https://github.com/apache/kafka/pull/16812
>>>>>
>>>>> There are many potential benefits to this feature. A custom produce
>>>>> request parser would make it possible to intercept all incoming
>>>>> messages before they get into the broker and apply broker-wide logic
>>>>> to them. This could be a trace, a filter, a transform (such as
>>>>> lineage), forcing required headers across all messages, compression,
>>>>> signing, encryption, or any other message manipulation before a
>>>>> message gets into the broker.
>>>>>
>>>>> Please take a look.
>>>>> Any and all feedback is greatly appreciated.
>>>>> Thanks,
>>>>> Max
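The props-then-envs loading scheme and the fail-and-log behavior Maxim describes in his answers above can be sketched roughly as follows. This is a hedged illustration only: the property name, env var name, interface, and class names are assumptions for the sketch, not the actual identifiers used in the KIP or PR.

```java
import java.nio.ByteBuffer;

class ParserLoader {

    /** Minimal stand-in for the parsing hook the KIP would expose. */
    interface RequestParser {
        Object parse(ByteBuffer buffer, short version);
    }

    /** Stock behavior used when no custom class is configured. */
    static class DefaultParser implements RequestParser {
        @Override
        public Object parse(ByteBuffer buffer, short version) {
            return "default-parse"; // placeholder for the built-in ProduceRequest.parse()
        }
    }

    // Hypothetical names; the real KIP/PR may use different ones.
    static final String PROP = "kafka.produce.request.parser.class";
    static final String ENV = "KAFKA_PRODUCE_REQUEST_PARSER_CLASS";

    // Static initialization runs once at class load time, before any broker
    // configuration objects exist -- which is why the thread talks about
    // system properties and env vars rather than KafkaConfig.
    static final RequestParser PARSER = load();

    static RequestParser load() {
        // System property wins; fall back to the environment, then to default.
        String className = System.getProperty(PROP, System.getenv(ENV));
        if (className == null) {
            return new DefaultParser();
        }
        try {
            return (RequestParser) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            // Per the thread: a bad class means request processing fails and
            // the error is surfaced in the logs, not a silent fallback.
            throw new IllegalStateException("Cannot load parser " + className, e);
        }
    }

    public static void main(String[] args) {
        // Prints "default-parse" when no custom class is configured.
        System.out.println(PARSER.parse(ByteBuffer.allocate(0), (short) 11));
    }
}
```

Under this sketch, an exception thrown from a custom parser's parse() propagates exactly like a parse failure of the stock parser, matching the answer to question 3 above.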