Hi Greg

Thanks for your comments/questions.

  1.  You raise a good point, and I agree that when compression is enabled with 
a large batch size, the impact on message size will not be as significant. 
However, this would still be a useful enhancement for scenarios where the JSON 
message does not contain a schema (as producers do not always create messages 
with the schema embedded in the payload).
  2.  Yes, I did consider this, and I will update the KIP to reflect that, along 
with my reasons for not taking that approach. When I tried to create a custom 
JsonConverter, I had to duplicate a lot of code from the existing JsonConverter 
(private methods etc.) for the custom converter to work. It would be much 
neater to enhance the existing converter through this KIP than to duplicate 
that code.
  3.  Thanks for bringing this to my attention. I will update the KIP to 
incorporate this and ensure that we do not open up a route for unrestricted 
access to the file system.
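
One possible shape for such a restriction, in the spirit of KIP-993, is to
check the schema file path against a list of directories configured by the
operator. This is only a sketch: the class name AllowedSchemaPaths and its
methods are hypothetical, and nothing here is part of the KIP or of Kafka.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not a Kafka class): accepts a schema file only if it
// lies under one of the operator-configured directories.
final class AllowedSchemaPaths {
    private final List<Path> allowedDirs;

    AllowedSchemaPaths(List<String> allowedDirs) {
        // Normalize each configured directory once, so ".." segments cannot
        // be used to widen the allowed set.
        this.allowedDirs = allowedDirs.stream()
                .map(d -> Paths.get(d).toAbsolutePath().normalize())
                .collect(Collectors.toList());
    }

    boolean isAllowed(String schemaFile) {
        // Normalize the candidate before comparing, so that
        // "/etc/schemas/../passwd" is treated as "/etc/passwd".
        // Note: this does not resolve symlinks; a real implementation
        // would need to decide how to handle them.
        Path candidate = Paths.get(schemaFile).toAbsolutePath().normalize();
        // Path.startsWith compares whole path elements, so
        // "/etc/schemas-evil" does not match "/etc/schemas".
        return allowedDirs.stream().anyMatch(candidate::startsWith);
    }
}
```

With an empty list of allowed directories, this check rejects every path, so
file-based schemas would be effectively opt-in for operators.
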
Thanks
Priyanka

From: Greg Harris <greg.har...@aiven.io.INVALID>
Date: Friday, 7 June 2024 at 11:48 PM
To: dev@kafka.apache.org <dev@kafka.apache.org>
Subject: [EXTERNAL] Re: [DISCUSS] KIP-1054: Support external schemas in 
JSONConverter
Hi Priyanka,

Thanks for the KIP! I think you captured the motivation well: The Converter
interface on its own implies a fairly large raw message size, and the
ecosystem's strategies for deduplicating schema information are complex. I
did have some questions/concerns.

1. Have you done a comparison between the external schemas approach and the
current schemas.enable approach with compression enabled? I think that's
currently the best alternative for users without an external schemas
service, so I'd like to get an idea of how impactful this feature is in an
example use-case.

2. This feature could be implemented without a KIP by creating and
distributing a custom JsonConverter. Could you add that as a rejected
alternative, with some justification for why it was rejected?

3. Recently we made an effort [1] to secure the FileConfigProvider and
DirectoryConfigProvider plugins, which each have the capability of reading
from the worker's disk. How do you think this could be made secure, such
that a REST API user can't have unrestricted access to the filesystem via
this plugin?

And to synthesize points 2 and 3: It is very reasonable for "dangerous"
features to be opted-in through installation of a custom Converter by
operators that understand the risks. It is unreasonable to deliver a
"dangerous" feature to operators without a simple way to opt-out, as the
upstream plugins are somewhat difficult to remove.

Thanks,
Greg

[1]
https://cwiki.apache.org/confluence/display/KAFKA/KIP-993%3A+Allow+restricting+files+accessed+by+File+and+Directory+ConfigProviders

On Fri, Jun 7, 2024 at 8:40 AM Fan Yang <fan...@hotmail.com> wrote:

> Hi Priyanka,
>
> My suggestion is that we can place the schema file at any location, such
> as a file location or a network location. So, the new option could just be
> schema.location.
>
>
> Best,
> Fan
> ________________________________
> From: Priyanka K U <priyanka....@ibm.com.INVALID>
> Sent: Friday, June 7, 2024 16:22
> To: dev@kafka.apache.org <dev@kafka.apache.org>
> Subject: [DISCUSS] KIP-1054: Support external schemas in JSONConverter
>
> Hi everyone,
>
> I'd like to start a discussion of KIP-1054, which aims to add support for
> external schemas in JSONConverter to Kafka Connect:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1054%3A+Support+external+schemas+in+JSONConverter
>
>
>
> Looking forward to your feedback.
>
> Regards,
> Priyanka
>
