Kafka does not have a schema registry to begin with; Confluent does.
On Fri, Nov 11, 2022 at 10:32 AM Dave Fisher <w...@apache.org> wrote:
>
> > On Nov 11, 2022, at 6:56 AM, Elliot West <elliot.w...@streamnative.io.INVALID> wrote:
> >
> > Hey Devin,
> >
> > *"Kafka conforms to the JSON Schema specification"*
> > Only when using Confluent's Schema Registry.
>
> If that is true, then Apache Kafka does NOT conform while Confluent does.
> Can you point to some documentation?
>
> > *"if a producer makes a change or omission, such as in a value used for
> > tracking, it might not surface until way down the line"*
> > So let me understand this: although the producer has a schema, it does not
> > use it to validate the JSON (as would implicitly occur for Avro)? Is this
> > correct?
> >
> > I agree that robust support for schema, certainly at the edges, is a
> > cornerstone of a data system. I also agree that it would be better to
> > adopt existing standards rather than implement them in a bespoke manner.
> >
> > I'd be interested to hear your thoughts on the concrete improvements that
> > you believe would be necessary - for example:
> >
> > * Producer validation of JSON occurs using "JSON Schema"
> > * Evolutions of JSON Schema conform to ...
> > * Users can declare topic schema using a JSON Schema document
> > * Users can query topic schema and have a JSON Schema document returned
> > to them
> >
> > Thanks,
> >
> > Elliot.
> >
> > On Thu, 10 Nov 2022 at 16:51, Devin Bost <devin.b...@gmail.com> wrote:
> >
> >> One of the areas where Kafka has an advantage over Pulsar is around data
> >> quality. Kafka conforms to the JSON Schema specification, which enables
> >> integration with any technology that conforms to the standard, such as
> >> for data validation, discoverability, lineage, versioning, etc.
> >> Pulsar's implementation is non-compliant with the standard, and producers
> >> and consumers have no built-in way in Pulsar to validate that values in
> >> their messages match expectations. As a consequence, if a producer makes
> >> a change or omission, such as in a value used for tracking, it might not
> >> surface until way down the line, and then it can be very difficult to
> >> track down the source of the problem, which kills the agility of teams
> >> responsible for maintaining apps using Pulsar. It's also bad PR, because
> >> incidents are then associated with Pulsar, even though the business might
> >> not understand that the data problem wasn't necessarily caused by Pulsar.
> >>
> >> What's the right way for us to address this problem?
> >>
> >> --
> >> Devin Bost
> >> Sent from mobile
> >> Cell: 801-400-4602
> >
> > --
> > Elliot West
> > Senior Platform Engineer
> > elliot.w...@streamnative.io
> > streamnative.io
> > <https://github.com/streamnative>
> > <https://www.linkedin.com/company/streamnative>
> > <https://twitter.com/streamnativeio>

--
Matteo Merli
<matteo.me...@gmail.com>
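
For concreteness, below is a minimal sketch of the first item on Elliot's list (producer validation of JSON using a JSON Schema). As Devin notes, Pulsar has no built-in way to do this today, so the check happens at the application level before the message is sent. The everit-org json-schema library, the broker URL, the topic name, and the example schema are all assumptions for illustration, not Pulsar APIs or any agreed design.

```java
// Sketch: application-level producer validation of JSON against a JSON Schema.
// Assumptions (not part of Pulsar): the everit-org json-schema library for
// draft-07 validation, a broker at pulsar://localhost:6650, and a
// hypothetical "orders" topic and schema.
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;
import org.everit.json.schema.Schema;
import org.everit.json.schema.ValidationException;
import org.everit.json.schema.loader.SchemaLoader;
import org.json.JSONObject;
import org.json.JSONTokener;

import java.nio.charset.StandardCharsets;

public class ValidatingJsonProducer {

    public static void main(String[] args) throws Exception {
        // A plain JSON Schema document; in the proposal above, this is what a
        // user would declare on (and later query back from) the topic.
        String schemaDoc = """
                {
                  "$schema": "http://json-schema.org/draft-07/schema#",
                  "type": "object",
                  "required": ["orderId", "trackingId"],
                  "properties": {
                    "orderId":    { "type": "string" },
                    "trackingId": { "type": "string" },
                    "amount":     { "type": "number", "minimum": 0 }
                  }
                }""";
        Schema schema = SchemaLoader.load(new JSONObject(new JSONTokener(schemaDoc)));

        try (PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build()) {
            Producer<byte[]> producer = client.newProducer()
                    .topic("persistent://public/default/orders")
                    .create();

            String payload =
                    "{\"orderId\":\"o-123\",\"trackingId\":\"t-456\",\"amount\":19.99}";

            // Reject the message before it reaches the broker if it does not
            // conform to the schema (e.g. a missing "trackingId" field).
            try {
                schema.validate(new JSONObject(payload));
                producer.send(payload.getBytes(StandardCharsets.UTF_8));
            } catch (ValidationException e) {
                // The problem surfaces at the producer instead of "way down the line".
                System.err.println("Rejected invalid message: " + e.getMessage());
            }

            producer.close();
        }
    }
}
```

The same check could just as well live in a producer interceptor or a thin wrapper shared across teams; the point is only to show where producer-side JSON Schema validation would sit relative to the client, not to prescribe an implementation.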