Hi Greg,

Thanks for your comments/questions.

1. You raise a good point, and I agree that when compression is enabled with a large batch size, the impact on message size will not be as significant. However, this would still be a useful enhancement for scenarios where the JSON message does not contain a schema (as producers do not always create messages with the schema embedded in the payload).

2. Yes, I did consider this, and I will update the KIP to reflect that along with my reasons for not taking that approach. When I tried to create a custom JsonConverter, I had to duplicate a lot of code from the existing JsonConverter (private methods etc.) for the custom converter to work. It would be much neater to enhance the existing converter through this KIP than to duplicate that code.

3. Thanks for bringing this to my attention. I will update the KIP to incorporate this and ensure that we do not open up a route for unrestricted access to the file system.

Thanks,
Priyanka

From: Greg Harris <greg.har...@aiven.io.INVALID>
Date: Friday, 7 June 2024 at 11:48 PM
To: dev@kafka.apache.org <dev@kafka.apache.org>
Subject: [EXTERNAL] Re: [DISCUSS] KIP-1054: Support external schemas in JSONConverter

Hi Priyanka,

Thanks for the KIP! I think you captured the motivation well: The Converter interface on its own implies a fairly large raw message size, and the ecosystem's strategies for deduplicating schema information are complex.

I did have some questions/concerns.

1. Have you done a comparison between the external schemas approach and the current schemas.enable approach with compression enabled? I think that's currently the best alternative for users without an external schemas service, so I'd like to get an idea of how impactful this feature is in an example use-case.

2. This feature could be implemented without a KIP by creating and distributing a custom JsonConverter. Could you add that as a rejected alternative, with some justification for why it was rejected?

3. Recently we made an effort [1] to secure the FileConfigProvider and DirectoryConfigProvider plugins, which each have the capability of reading from the worker's disk. How do you think this could be made secure, such that a REST API user can't have unrestricted access to the filesystem via this plugin?

And to synthesize points 2 and 3: It is very reasonable for "dangerous" features to be opted-in through installation of a custom Converter by operators that understand the risks. It is unreasonable to deliver a "dangerous" feature to operators without a simple way to opt-out, as the upstream plugins are somewhat difficult to remove.

Thanks,
Greg

[1] https://cwiki.apache.org/confluence/display/KAFKA/KIP-993%3A+Allow+restricting+files+accessed+by+File+and+Directory+ConfigProviders

On Fri, Jun 7, 2024 at 8:40 AM Fan Yang <fan...@hotmail.com> wrote:

> Hi Priyanka,
>
> My suggestion is that we can place the schema file at any location, such
> as a file location, network location, etc. So, the new option could be just
> schema.location
>
> Best,
> Fan
> ________________________________
> From: Priyanka K U <priyanka....@ibm.com.INVALID>
> Sent: Friday, June 7, 2024 16:22
> To: dev@kafka.apache.org <dev@kafka.apache.org>
> Subject: [DISCUSS] KIP-1054: Support external schemas in JSONConverter
>
> Hi everyone,
>
> I'd like to start a discussion of KIP-1054, which aims to support external
> schemas in JSONConverter in Kafka Connect:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1054%3A+Support+external+schemas+in+JSONConverter
>
> Looking forward to your feedback.
>
> Regards,
> Priyanka