Hi Mario,

Thanks for your responses.

> Do you have some examples for this? In my experience I never had the need
> to add a new type.

Nobody has added a physical type because it's not possible. But we can see
the latent desire through very general types appearing as logical types
[1-2], and in unresolved type issues [3-9]. I picked the Confluent JDBC
connector out of familiarity, but these problems are not particular to that
connector.
Databases cause a lot of friction with our type system because database
type systems are much more mature and expressive. We should learn from the
decades that databases have spent building general data models and improve
our own data model before calling it general and promoting its common use.

> > There are carve-outs in the implementation for framework logical types
> > (java.util.Date etc)
> Yeah, java.util.Date is quite old these days.

My point about java.util.Date is not that it's old, but that the framework
does things for its own logical types which no third-party logical type can
[10].
But it is indeed old, and if someone wished to replace it with
java.time.Instant, they could not do so. Only the framework could add the
required carve-outs. And realistically the framework couldn't even do so,
because introducing it would cause errors and exceptions in plugins which
aren't aware of it.
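
To make the carve-out concrete, here is a minimal, self-contained sketch. It
is a simplified model written for this message, not the actual ConnectSchema
code: the class name, the accepts() method, and the com.example.InstantType
logical name are all hypothetical. It only assumes what [10] shows, namely
that the framework keeps a fixed table from its own logical type names to
the Java classes it accepts for them.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.Date;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of the validation carve-out; NOT the real ConnectSchema code.
final class LogicalTypeCarveOutSketch {

    enum PhysicalType { INT32, INT64, BYTES }

    // Java classes accepted for each physical type.
    static final Map<PhysicalType, List<Class<?>>> PHYSICAL_CLASSES = Map.of(
            PhysicalType.INT32, List.<Class<?>>of(Integer.class),
            PhysicalType.INT64, List.<Class<?>>of(Long.class),
            PhysicalType.BYTES, List.<Class<?>>of(byte[].class));

    // Closed table of framework logical types; a plugin has no way to add an entry here.
    static final Map<String, List<Class<?>>> FRAMEWORK_LOGICAL_CLASSES = Map.of(
            "org.apache.kafka.connect.data.Decimal", List.<Class<?>>of(BigDecimal.class),
            "org.apache.kafka.connect.data.Date", List.<Class<?>>of(Date.class),
            "org.apache.kafka.connect.data.Time", List.<Class<?>>of(Date.class),
            "org.apache.kafka.connect.data.Timestamp", List.<Class<?>>of(Date.class));

    /** Returns true if the value's class is acceptable for the schema name and physical type. */
    static boolean accepts(String schemaName, PhysicalType type, Object value) {
        // Framework logical types override the physical type's class list...
        List<Class<?>> expected =
                schemaName == null ? null : FRAMEWORK_LOGICAL_CLASSES.get(schemaName);
        // ...while any other schema name falls back to the physical type.
        if (expected == null) {
            expected = PHYSICAL_CLASSES.get(type);
        }
        return expected.stream().anyMatch(c -> c.isInstance(value));
    }

    public static void main(String[] args) {
        String timestamp = "org.apache.kafka.connect.data.Timestamp";
        String thirdParty = "com.example.InstantType"; // hypothetical plugin-defined logical type

        // A framework logical type may carry a rich Java object.
        System.out.println(accepts(timestamp, PhysicalType.INT64, new Date(0L)));
        // A third-party logical type over INT64 cannot carry java.time.Instant...
        System.out.println(accepts(thirdParty, PhysicalType.INT64, Instant.EPOCH));
        // ...it can only carry the raw Long.
        System.out.println(accepts(thirdParty, PhysicalType.INT64, 0L));
    }
}
```

In this model the third-party type is limited to the raw Long, which is the
asymmetry I'm describing: only an edit to the framework's own table would let
it carry a richer Java class.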

> Any examples for this?

Whenever you see someone complaining that they got some primitive type
(string or bytes most commonly) instead of their actual type, that's a
failure mode unique to logical types [11-15]. Again, don't take this as an
indictment of the JDBC connector; as a sink connector, it's just
particularly exposed to this problem.
Also, arbitrary logical types being unsupported by the framework
SchemaProjector has been reported multiple times [16]. Users expect the
schema converter to handle logical types which it has no way of knowing
about.
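
As a rough illustration of that failure mode, here is a small self-contained
sketch of a sink choosing a column type. Everything in it is hypothetical and
simplified for this message (the class, the columnTypeFor() method, the
failFast flag, the SQL type names, and the com.example.Uuid logical type);
it is not how the JDBC connector or the framework is implemented. It only
shows how an unrecognized logical name silently degrades to its physical
type, and what a fail-fast alternative could look like.

```java
import java.util.Map;

// Simplified, hypothetical model of a sink mapping Connect-style schemas to column types.
final class LogicalTypeFallbackSketch {

    enum PhysicalType { STRING, BYTES, INT64 }

    // Logical types this particular (imaginary) sink was written to understand.
    static final Map<String, String> KNOWN_LOGICAL_TYPES = Map.of(
            "org.apache.kafka.connect.data.Decimal", "DECIMAL",
            "org.apache.kafka.connect.data.Timestamp", "TIMESTAMP");

    /** Picks a column type; unknown logical names either degrade or fail, depending on failFast. */
    static String columnTypeFor(String logicalName, PhysicalType type, boolean failFast) {
        if (logicalName != null) {
            String known = KNOWN_LOGICAL_TYPES.get(logicalName);
            if (known != null) {
                return known;
            }
            if (failFast) {
                throw new IllegalArgumentException("Unsupported logical type: " + logicalName);
            }
            // Lenient path: the logical name is silently ignored from here on.
        }
        switch (type) {
            case STRING: return "VARCHAR";
            case BYTES:  return "VARBINARY";
            case INT64:  return "BIGINT";
            default: throw new IllegalArgumentException("Unknown physical type: " + type);
        }
    }

    public static void main(String[] args) {
        // The sink understands this framework logical type.
        System.out.println(columnTypeFor(
                "org.apache.kafka.connect.data.Timestamp", PhysicalType.INT64, false));
        // A custom type the sink has never heard of ends up as a plain string column.
        System.out.println(columnTypeFor("com.example.Uuid", PhysicalType.STRING, false));
    }
}
```

The lenient path is where "I got a string instead of my actual type"
complaints come from; with failFast set, the same record would instead be
rejected with an error naming the unsupported type.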

Thanks,
Greg

[1]
https://github.com/confluentinc/schema-registry/blob/e461e9659ef1532a95b3f2dfd38e843d766aa955/schema-converter/src/main/java/io/confluent/connect/schema/ConnectEnum.java
[2]
https://github.com/confluentinc/schema-registry/blob/e461e9659ef1532a95b3f2dfd38e843d766aa955/schema-converter/src/main/java/io/confluent/connect/schema/ConnectUnion.java
[3] https://github.com/confluentinc/kafka-connect-jdbc/issues/1378
[4] https://github.com/confluentinc/kafka-connect-jdbc/issues/1006
[5] https://github.com/confluentinc/kafka-connect-jdbc/issues/1002
[6] https://github.com/confluentinc/kafka-connect-jdbc/issues/651
[7] https://github.com/confluentinc/kafka-connect-jdbc/issues/410
[8] https://github.com/confluentinc/kafka-connect-jdbc/issues/265
[9] https://github.com/confluentinc/kafka-connect-jdbc/issues/81
[10]
https://github.com/apache/kafka/blob/a6faec179a3685d66698a2dc3c6a1823bd0c87f9/connect/api/src/main/java/org/apache/kafka/connect/data/ConnectSchema.java#L68-L71
[11] https://github.com/confluentinc/kafka-connect-jdbc/issues/1155
[12] https://github.com/confluentinc/kafka-connect-jdbc/issues/788
[13] https://github.com/confluentinc/kafka-connect-jdbc/issues/749
[14] https://github.com/confluentinc/kafka-connect-jdbc/issues/563
[15] https://github.com/confluentinc/kafka-connect-jdbc/issues/393
[16] https://issues.apache.org/jira/browse/KAFKA-16257

On Fri, Jan 17, 2025 at 5:57 AM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Hi Mario,
> Thanks for the KIP.
>
> My view is that it would be nice for Kafka to have a data model so that
> Kafka could do natively the things that people use schema registries
> for today. I'm not convinced that the Kafka Connect data classes are
> really up to the job, so I'm not sure that there's a lot of value in making
> a separate module. Having said that, it's a pretty small thing to do
> and it probably doesn't preclude someone coming along with a
> Kafka data model in the future.
>
> Just my 2 cents.
>
> Thanks,
> Andrew
> ________________________________________
> From: Mario Fiore Vitale <mvit...@redhat.com>
> Sent: 17 January 2025 11:20
> To: dev@kafka.apache.org <dev@kafka.apache.org>
> Subject: Re: [DISCUSS] KIP-1122: Create a dedicated data module for Kafka
> Connect data classes
>
> Hi all,
>
> Any other feedback on this? I'll keep the discussion open for another week
> and then start the vote.
>
> Thanks,
> Mario.
>
> On Fri, Jan 10, 2025 at 10:30 AM Mario Fiore Vitale <mvit...@redhat.com>
> wrote:
>
> > Hi Greg, thanks for taking a look.
> >
> > > It's a closed type system which cannot have new "physical" types added
> > by third-party developers
> >
> > Do you have some examples for this? In my experience I never had the need
> > to add a new type.
> >
> > > There are carve-outs in the implementation for framework logical types
> > (java.util.Date etc)
> >
> > Yeah, java.util.Date is quite old these days.
> >
> > > Compatibility between components isn't managed: custom types can be
> > misinterpreted as dumb physical types which is undesirable to users, who
> > may prefer to fail-fast instead.
> >
> > Any examples for this?
> >
> > > I think that we should evolve the Connect data model in the future to
> > resolve some or all of these problems.
> >
> > Yes, this for sure can have a positive impact.
> >
> > > This will be easier and less risky
> > if we change the data model first, before expanding it to non-Connect
> > users.
> >
> > Understandable, but I have the feeling that changing the data model
> > will not be easy or fast, with the risk of blocking the expansion
> > outside Connect.
> > Maybe we could start with the expansion, and if there is more traction
> > outside the Connect community we will have more motivation to improve
> > the data model.
> >
> > > If there was a suitably general data model out there, we could promote
> > it within the Connect ecosystem and deprecate our current model.
> >
> > Honestly, I don't have anything in mind. Do you?
> >
> > Thanks,
> > Mario.
> >
> > On Thu, Jan 9, 2025 at 6:14 PM Greg Harris <greg.har...@aiven.io.invalid
> >
> > wrote:
> >
> >> Hi Mario,
> >>
> >> Thanks for the KIP! I think that using the same data model inside and
> >> outside of Connect is valuable, and is a good motivation for this
> >> effort.
> >>
> >> However, I have seen many situations where the existing data model is
> >> insufficient for Connect plugin developers.
> >> * It's a closed type system which cannot have new "physical" types added
> >>   by third-party developers
> >> * There are carve-outs in the implementation for framework logical types
> >> (java.util.Date etc)
> >> * Compatibility between components isn't managed: custom types can be
> >> misinterpreted as dumb physical types which is undesirable to users, who
> >> may prefer to fail-fast instead.
> >>
> >> I think that we should evolve the Connect data model in the future to
> >> resolve some or all of these problems. This will be easier and less
> >> risky if we change the data model first, before expanding it to
> >> non-Connect users.
> >>
> >> Another way to satisfy the requirements of your KIP would be to adopt an
> >> existing external data model in Connect. If there was a suitably general
> >> data model out there, we could promote it within the Connect ecosystem
> >> and deprecate our current model.
> >>
> >> Thanks,
> >> Greg
> >>
> >>
> >> On Thu, Jan 9, 2025 at 12:19 AM Mario Fiore Vitale <mvit...@redhat.com>
> >> wrote:
> >>
> >> > Hi all,
> >> >
> >> > I just want to bring the discussion up since it was open during the
> >> > Christmas holidays.
> >> >
> >> > Any other considerations?
> >> >
> >> > Thanks,
> >> > Mario.
> >> >
> >> > On Sat, Dec 28, 2024 at 3:04 PM Mario Fiore Vitale <
> mvital...@gmail.com
> >> >
> >> > wrote:
> >> >
> >> > > Hey Hector,
> >> > >
> >> > > the aim of this KIP is to separate the Connect data (Struct, Schema,
> >> > > etc.) so it can be used without having a dependency on connect-api.
> >> > >
> >> > > So the proposal is more related to ergonomics.
> >> > >
> >> > > As Kafka itself became an API over the years, maybe the Connect data
> >> > > can also be used as a general data format, not only for Connect.
> >> > >
> >> > > Is it more clear now?
> >> > >
> >> > > On Fri, Dec 20, 2024 at 10:20 PM Hector Geraldino (BLOOMBERG/ 919 3RD A) <
> >> > > hgerald...@bloomberg.net> wrote:
> >> > >
> >> > > > Hey Mario,
> >> > > >
> >> > > > From a quick read of this KIP, it's not super clear to me what
> >> > > > problem it is trying to solve. The motivation states that "current
> >> > > > implementation restricts the usage of data classes", and I'm not
> >> > > > sure what you mean by that.
> >> > > >
> >> > > > Can you maybe add one or two examples of things that are not
> >> > > > possible to do today and could be achieved by this refactor? If this
> >> > > > proposal is more about improving the ergonomics of users of these
> >> > > > classes, that's also valuable IMO.
> >> > > >
> >> > > > From: dev@kafka.apache.org At: 12/18/24 08:32:28 UTC-5:00 To:
> >> > > > dev@kafka.apache.org
> >> > > > Subject: [DISCUSS] KIP-1122: Create a dedicated data module for
> >> Kafka
> >> > > > Connect data classes
> >> > > >
> >> > > > Hi Everyone,
> >> > > >
> >> > > > I would like to start a discussion on KIP-1122: Create a dedicated
> >> data
> >> > > > module for Kafka Connect data classes [1].
> >> > > >
> >> > > > [1]
> >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1122%3A+Create+a+dedicated+data+module+for+Kafka+Connect+data+classes
> >> > > >
> >> > > > Regards,
> >> > > > --
> >> > > > Mario Fiore Vitale
> >> > > >
> >> > > >
> >> > > >
> >> > >
> >> >
> >> >
> >> > --
> >> >
> >> > Mario Fiore Vitale
> >> >
> >> > Senior Software Engineer
> >> >
> >> > Red Hat <https://www.redhat.com/>
> >> > <https://www.redhat.com/>
> >> >
> >>
> >
> >
> > --
> >
> > Mario Fiore Vitale
> >
> > Senior Software Engineer
> >
> > Red Hat <https://www.redhat.com/>
> > <https://www.redhat.com/>
> >
>
>
> --
>
> Mario Fiore Vitale
>
> Senior Software Engineer
>
> Red Hat <https://www.redhat.com/>
> <https://www.redhat.com/>
>
