Re: Data quality problem

2023-03-14 Thread Devin Bost
> not in the client?
>
> > The user experience around maintaining types/schemas between apps in
> > Pulsar is not good
>
> What are we comparing this to though? What would the ideal data developer
> workflow look like?
>
> Thanks,
>
> Elliot.

Re: Data quality problem

2023-03-14 Thread Elliot West
Elliot.
On Mon, 13 Mar 2023 at 16:58, Devin Bost wrote:
> > Sorry. I do not fully understand here. Is it also related to the "data
> > quality" problem that we discussed? For the consumer side, we can use
> > the AUTO_CONSUME schema to receive Generi

Re: Data quality problem

2023-03-13 Thread Devin Bost
> Sorry. I do not fully understand here. Is it also related to the "data
> quality" problem that we discussed? For the consumer side, we can use the
> AUTO_CONSUME schema to receive GenericObject (For JSON schema, you can deal
> with JsonObject directly).
> For the pr

Re: Data quality problem

2022-11-20 Thread PengHui Li
rrent implementation of schemas for JSON is the requirement to always have a POCO or some kind of type builder to construct the schema. This requirement can be cumbersome for users who only care about a few fields on the object. Sorry. I do not fully understand here. Is it also related to the "d

Re: Data quality problem

2022-11-16 Thread Devin Bost
I appreciate all the thoughts and questions so far. One of the issues with Pulsar's current implementation of schemas for JSON is the requirement to always have a POCO or some kind of type builder to construct the schema. This requirement can be cumbersome for users who only care about a few field

Re: Data quality problem

2022-11-16 Thread 丛搏
Hi Devin, first, Kafka itself doesn't support schemas; Confluent does. Pulsar schema supports validation and versioning. Are you running into schema versions created by automatic registration, where the source of the data is unclear? I think you can turn off automatic schema registration for producers, and

Re: Data quality problem

2022-11-14 Thread Elliot West
While we can get caught up in the specifics of exactly how JSON Schema is supported in the Kafka ecosystem, it is ultimately possible if desired, and is common, even if not part of open-source Apache Kafka. Devin's assertion is that JSON Schema compliant payload validation and schema evolution are

Re: Data quality problem

2022-11-11 Thread Matteo Merli
Kafka does not have a schema registry to begin with. Confluent does have it.
On Fri, Nov 11, 2022 at 10:32 AM Dave Fisher wrote:
> > On Nov 11, 2022, at 6:56 AM, Elliot West wrote:
> >
> > Hey Devin,
> >
> > *"Kafka conforms to the JSON Schema specification"*
> > Only when using Conf

Re: Data quality problem

2022-11-11 Thread Dave Fisher
> On Nov 11, 2022, at 6:56 AM, Elliot West wrote:
>
> Hey Devin,
>
> *"Kafka conforms to the JSON Schema specification"*
> Only when using Confluent's Schema Registry.
If that is true then Apache Kafka does NOT conform while Confluent does. Can you point to some documentation?
>
> *"if

Re: Data quality problem

2022-11-11 Thread Elliot West
Hey Devin, *"Kafka conforms to the JSON Schema specification"* Only when using Confluent's Schema Registry. *"if a producer makes a change or omission, such as in a value used for tracking, it might not surface until way down the line"* So let me understand this: Although the producer has a schem

Data quality problem

2022-11-10 Thread Devin Bost
One of the areas where Kafka has an advantage over Pulsar is around data quality. Kafka conforms to the JSON Schema specification, which enables integration with any technology that conforms to the standard, such as for data validation, discoverability, lineage, versioning, etc. Pulsar's implementa