I like Bo's suggestion - I haven't realized each schema type
compatibility check is actually a plugin.
It makes sense for any schema type checks to evolve, sometimes in a
non-backward compatible way hence having two plugins like
protobufNativeSchemaCompatabilityCheckV1 and then
protobufNativeSchemaCompatabilityCheckV2 and then
protobufNativeSchemaCompatabilityCheckV3 makes sense to me.



On Thu, Mar 9, 2023 at 5:49 AM 丛搏 <bog...@apache.org> wrote:

>  Hi siNan:
>
> From my point of view, it is just a plug-in. I don't think it is
> necessary to add configuration for the plugin.
> This is meaningless, and it will increase the difficulty of use for users.
>
>
> SiNan Liu <liusinan1...@gmail.com> 于2023年3月8日周三 15:54写道:
> >
> > Hi, bo.
> >
> > 1. I understand what you say, to develop a new
> > `ProtobufNativeAdvancedSchemaCompatibilityCheck`, rather than changing
> > existing `ProtobufNativeSchemaCompatibilityCheck`. But I found a few
> small
> > problems:
> >
> > (1)ProtobufNativeAdvancedSchemaCompatibilityCheck and
> > ProtobufNativeSchemaCompatibilityCheck schemaType is PROTOBUF_NATIVE. It
> > looks like both checkers are PROTOBUF not using AVRO-PROTOBUF's "native"
> > implementation, which leads to some problems or "unreasonable" and gives
> me
> > some extended thinking and questions.
> >
> `CompatibilityCheck ` its only a plugin.
> `ProtobufNativeSchemaCompatibilityCheck` may sooner or later leave the
> stage, when `ProtobufNativeAdvancedSchemaCompatibilityCheck` is
> stable, we can make it the default Checker.
>
> It is just a plug-in, users can change it at will and ensure that it
> is used correctly
> > (2)In broker.conf
> >
> > `schemaRegistryCompatibilityCheckers`. If
> > ProtobufNativeSchemaCompatibilityCheck and
> > ProtobufNativeAdvancedSchemaCompatibilityCheck all set. This is going to
> > overwrite each other. Because this is a map:
> >
> >
> https://github.com/apache/pulsar/blob/af1360fb167c1f9484fda5771df3ea9b21d1440b/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/SchemaRegistryService.java#L36-L44
> >
> > ```java
> >
> > Map<SchemaType, SchemaCompatibilityCheck> checkers = new HashMap<>();
> >
> > for (String className : checkerClasses) {
> >
> > SchemaCompatibilityCheck schemaCompatibilityCheck =
> > Reflections.createInstance(className,
> >
> > SchemaCompatibilityCheck.class,
> > Thread.currentThread().getContextClassLoader());
> >
> > checkers.put(schemaCompatibilityCheck.getSchemaType(),
> > schemaCompatibilityCheck);
> >
> > ```
> >
> > Is this a big problem or a small one? Is it possible or unnecessary?
> Maybe
> > we can write in the documentation that protobufNative checkers can only
> > choose one of the two? Why are there two Checkers for different
> > implementations of the same schemaType? Why not the checker to create
> > different validator, so we don not have to change
> > schemaRegistryCompatibilityCheckers.
>
> users can only use one, not two, which will bring complexity to users
>
> >
> > (3)And after the update to
> ProtobufNativeAdvancedSchemaCompatibilityCheck.
> > Existing topics previously only checked the name of the root message, not
> > the content of protobuf.
> >
> > What if the user wants both Checkers?
> >
> > Set to ProtobufNativeAdvancedSchemaCompatibilityCheck, affect the topic
> of
> > the existing schema?
> >
> > Older topics still use the old checker, and newer topics or certain older
> > topics use the new advancedchecker.
> >
> when `ProtobufNativeAdvancedSchemaCompatibilityCheck` stable,
> users will not choose `ProtobufNativeSchemaCompatibilityCheck`.
> because it not a complete checker.
> > (4)So should we have one schemaType for a checker? protobufNativeChecker
> > can have as many different implementation classes as possible. This
> > classname configuration in PIP, let's see if it can be set at the topic
> > level. In the current PIP design I just load this parameter into the
> > checker when the broker is started and the checkers map is set up. Can I
> do
> > this in the new normal pr if I want to support topic level? Or perfect it
> > here?
> >
> > Add a call PROTOBUF_NATIVE_ADVANCE schemaType corresponding
> > ProtobufNativeAdvancedSchemaCompatibilityCheck? (Seems to be more
> trouble).
> >
> > Sorry I can not use the computer and network in the company, I use my
> > mobile phone to reply to the email, the format may be a bit messy. Please
> > understand.
> >
> > Thanks,
> >
> > sinan
> >
> >
> > 丛搏 <bog...@apache.org> 于 2023年3月7日周二 下午11:39写道:
> >
> > > SiNan Liu <liusinan1...@gmail.com> 于2023年3月7日周二 13:22写道:
> > > >
> > > > Great to see your comment, bo!
> > > >
> > > > 1. The first way. The protobuf website has a description of the
> rules,
> > > but
> > > > no plans to implement them.
> > > > https://protobuf.dev/programming-guides/proto/#updating
> > >
> > > https://groups.google.com/g/protobuf
> > > maybe ask here
> > >
> > > >
> > > > 2. I think this PIP can be divided into two parts.
> > > > (1) Add a flag(`ValidatorClassName`), load it into
> > > > `ProtobufNativeSchemaCompatibilityCheck` when the broker starts.
> > > > ValidatorClassName is empty by default, and the implementation
> continues
> > > as
> > > > before, with no change for the user.
> > >
> > > `ProtobufNativeSchemaCompatibilityCheck` is a plugin in `broker.conf`
> > > ```
> > >
> > >
> schemaRegistryCompatibilityCheckers=org.apache.pulsar.broker.service.schema.JsonSchemaCompatibilityCheck,org.apache.pulsar.broker.service.schema.AvroSchemaCompatibilityCheck,org.apache.pulsar.broker.service.schema.ProtobufNativeSchemaCompatibilityCheck
> > > ```
> > > I do not recommend that we directly modify this plugin and continue to
> > > add configuration items, which will cause trouble for users.
> > > We have a lot of configs and it's getting very unwieldy.
> > > in my opinion, we don't change
> > >
> > >
> `org.apache.pulsar.broker.service.schema.ProtobufNativeSchemaCompatibilityCheck`,
> > > it is a simple implementation, it doesn't go wrong very often, most
> > > users will use it. we can add another ProtobufNativeCheck named
> > > `ProtobufNativeAdvancedSchemaCompatibilityCheck ` or other. in this
> > > way, we don't need to add this flag. There is no need to consider
> > > compatibility, it is just a plug-in and will not affect current logic.
> > > If the user needs it, just change the plugin to the new implementation
> > >
> > > > ```java
> > > >     ProtobufNativeSchemaValidator DEFAULT = (fromDescriptors,
> > > toDescriptor)
> > > > -> {
> > > >         for (Descriptors.Descriptor fromDescriptor :
> fromDescriptors) {
> > > >             // The default implementation only checks if the root
> message
> > > > has changed.
> > > >             if
> > > > (!fromDescriptor.getFullName().equals(toDescriptor.getFullName())) {
> > > >                 throw new ProtoBufCanReadCheckException("Protobuf
> root
> > > > message isn't allow change!");
> > > >             }
> > > >         }
> > > >     };
> > > > ```
> > > > `ValidatorClassName` value also can be set to the current
> implementation
> > > of
> > > > PIP add
> > > >
> > >
> `org.apache.pulsar.broker.service.schema.validator.ProtobufNativeSchemaBreakValidatorImpl`.
> > > >
> > > > (2) Recoding the `ProtobufNativeSchemaCompatibilityCheck`. Through
> the
> > > flag
> > > > (`ValidatorClassName`) to build different
> > > `ProtobufNativeSchemaValidator`.
> > > > Isn't it just a plug-in? The user can develop and choose a different
> > > > `ProtobufNativeSchemaValidator`. I think it didn't change the logic,
> it
> > > > just allowed him to expand it.
> > > >
> > > >
> > > > I think this PIP should be an enhancement and supplement to the
> function,
> > > > and there is no such thing as unnecessary and meaningless.
> > > >
> > > >
> > > > Thanks,
> > > > sinan
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > 丛搏 <bog...@apache.org> 于2023年3月7日周二 11:53写道:
> > > >
> > > > > I think we have two ways to do that.
> > > > >
> > > > > First way: We need to advance the improvement of java in protobuf.
> Ask
> > > > > if they have plans to improve.
> > > > >
> > > > > Second way: the new PROTOBUF_NATIVE `SchemaCompatibilityCheck`
> should
> > > > > be implemented as a plugin, don't change any existing plugin logic
> > > > > (it's simple and already used). I don't recommend adding flags for
> > > > > rollback, it adds configuration and makes little sense.
> > > > >
> > > > > Thanks,
> > > > > Bo
> > > > >
> > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月6日周一 23:00写道:
> > > > >
> > > > > >
> > > > > > Can you convert the code block which is actually a quote in the
> > > > > > beginning of the PIP to something which doesn't require to scroll
> > > > > > horizontally so much?
> > > > > > Use
> > > > > >
> > > > >
> > >
> https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#quoting-text
> > > > > >
> > > > > > Let's improve the clarity of what you wrote:
> > > > > >
> > > > > > "the PROTOBUF uses avro struct to store."
> > > > > > -->
> > > > > > When Schema type PROTOBUF is used, Pulsar Client assumes the
> object
> > > given
> > > > > > to it as message data is an auto-generated POJO containing the
> > > > > annotations
> > > > > > encoding the schema. The client is using a converter, which
> converts
> > > a
> > > > > > Protobuf schema descriptor into an Avro schema and sends that as
> the
> > > > > Schema
> > > > > > of the producer/consumer.
> > > > > >
> > > > > > "On the broker side, protobuf and avro both use SchemaData
> converted
> > > to
> > > > > > org.apache.avro.Schema."
> > > > > > -->
> > > > > > Since the schema is an Avro schema, the implementation of
> > > compatibility
> > > > > > check on the broker side is to simply re-use the compatibility
> check
> > > of
> > > > > the
> > > > > > AVRO schema type.
> > > > > >
> > > > > > "ProtobufSchema is different from ProtobufNativeSchema in schema
> > > > > > compatibility check it uses avro-protobuf.
> > > > > >
> > > > >
> > >
> https://central.sonatype.com/artifact/org.apache.avro/avro-protobuf/1.11.1/overview
> > > > > > But the current implementation of ProtobufNative schema
> compatibility
> > > > > > check only
> > > > > > checked if the root message name is changed."
> > > > > >
> > > > > > -->
> > > > > > PROTOBUF_NATIVE schema type is different.
> > > > > > The client is actually using Protobuf Descriptor as the schema,
> as
> > > > > opposed
> > > > > > to Avro schema of PROTOBUF schema type. In the broker, the
> > > > > PROTOBUF_NATIVE
> > > > > > compatibility check actually hasn't implemented any rule, besides
> > > one:
> > > > > > checking if the root message name has changed.
> > > > > >
> > > > > >
> > > > > >
> > > > > > >    1. For now, there is no official or third-party solution for
> > > > > ProtoBuf
> > > > > > >    compatibility. If in the future have better solutions of a
> third
> > > > > party or
> > > > > > >    the official, we develop new ProtobufNativeSchemaValidator
> and
> > > use,
> > > > > so
> > > > > > >    add a flag.
> > > > > > >
> > > > > > > Who do you need to make that configurable? Once you found a
> third
> > > > > party,
> > > > > > just switch to it? Who knows, maybe you never will. Introduce it
> > > when you
> > > > > > find it, not now.
> > > > > >
> > > > > >
> > > > > > We improve in ProtobufNativeSchemaCompatibilityCheck BACKWARD,
> > > FORWARD
> > > > > > > these strategies. As with the AVRO implementation, protobuf
> > > > > compatibility
> > > > > > > checking need implementing the canRead method. *This will check
> > > that
> > > > > > > the writtenschema can be read by readSchema.*
> > > > > >
> > > > > >
> > > > > > I completely disagree.
> > > > > > Avro implementation is confusing for our use case. Don't copy
> that.
> > > > > >
> > > > > > You have
> > > > > >
> > > > > > public void checkCompatible(SchemaData from, SchemaData to,
> > > > > > SchemaCompatibilityStrategy strategy)
> > > > > >         throws IncompatibleSchemaException {
> > > > > >     Descriptor fromDescriptor =
> > > > > > ProtobufNativeSchemaUtils.deserialize(from.getData());
> > > > > >     Descriptor toDescriptor =
> > > > > > ProtobufNativeSchemaUtils.deserialize(to.getData());
> > > > > >     switch (strategy) {
> > > > > >         case BACKWARD_TRANSITIVE:
> > > > > >         case BACKWARD:
> > > > > >         case FORWARD_TRANSITIVE:
> > > > > >         case FORWARD:
> > > > > >         case FULL_TRANSITIVE:
> > > > > >         case FULL:
> > > > > >             checkRootMessageChange(fromDescriptor, toDescriptor,
> > > > > strategy);
> > > > > >             return;
> > > > > >         case ALWAYS_COMPATIBLE:
> > > > > >             return;
> > > > > >         default:
> > > > > >             throw new IncompatibleSchemaException("Unknown
> > > > > > SchemaCompatibilityStrategy.");
> > > > > >     }
> > > > > > }
> > > > > >
> > > > > > I would rename :
> > > > > > from --> currentSchema
> > > > > > to --> newSchema
> > > > > >
> > > > > > Use that switch case and have a method for each like:
> > > > > > validateBackwardsCompatibility(currentSchema, newSchema)
> > > > > >
> > > > > > I dislike canRead and usage of writtenSchema, since you have two
> > > > > completely
> > > > > > different use cases: from the producing side and the consumer
> side.
> > > > > >
> > > > > > schemaValidatorBuilder
> > > > > > >
> > > > > > > I dislike this proposal. IMO Avro implementation is way too
> > > > > complicated.
> > > > > > Why not have a simple function for validation for each switch
> case
> > > above?
> > > > > > Why do we need strategy and builder, and all this complexity?
> > > > > >
> > > > > >
> > > > > > *Here are the basic compatibility rules we've defined:*
> > > > > >
> > > > > >
> > > > > > IMO it's impossible to read the validation rules as you described
> > > them.
> > > > > > I wrote how they should be structured numerous times above.
> > > > > > I can't validate them.
> > > > > >
> > > > > >
> > > > > > IMO, the current design is very hard to read.
> > > > > > Please try to avoid jumping into code sections.
> > > > > > Write a high level design section, in which you describe in words
> > > what
> > > > > you
> > > > > > plan to do.
> > > > > > Write the validation rules in the structure that is easy to
> > > understand:
> > > > > > rules per each compatibility check, and use proper words (current
> > > schema,
> > > > > > new schema), since new schema can be once used for read and once
> > > used for
> > > > > > write.
> > > > > >
> > > > > > In its current form it takes too much time to understand the
> design,
> > > and
> > > > > it
> > > > > > shouldn't be the case.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Asaf
> > > > > >
> > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Sun, Mar 5, 2023 at 3:58 PM SiNan Liu <liusinan1...@gmail.com
> >
> > > wrote:
> > > > > >
> > > > > > > Hi! I updated the explanation of some things in the PIP issue.
> And
> > > also
> > > > > > > added a new “flag” in the conf is used as the different
> > > > > > > ProtobufNativeSchemaValidator implementation, also set
> > > > > > > ProtobufNativeSchemaValidator default only check whether the
> name
> > > of
> > > > > the
> > > > > > > root message is the same.
> > > > > > >
> > > > > > >
> > > > > > > Thanks,
> > > > > > > sinan
> > > > > > >
> > > > > > >
> > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月5日周日 20:21写道:
> > > > > > >
> > > > > > > > On Wed, Mar 1, 2023 at 4:33 PM SiNan Liu <
> liusinan1...@gmail.com
> > > >
> > > > > wrote:
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Can you please explain how a Protobuf Schema descriptor
> can
> > > be
> > > > > > > > validated
> > > > > > > > > > for backward compatibility check using Avro based
> > > compatibility
> > > > > > > rules?
> > > > > > > > > > Doesn't it expect the schema to be Avro, but it is
> actually a
> > > > > > > Protobuf
> > > > > > > > > > descriptor?
> > > > > > > > > > Is there some translation happening?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 1. *You can take a quick look at the previous design, the
> > > PROTOBUF
> > > > > uses
> > > > > > > > > avro struct to store.*
> > > > > > > > > https://github.com/apache/pulsar/pull/1954
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L59-L61
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L110-L115
> > > > > > > >
> > > > > > > >
> > > > > > > > Ok. So to summarize your code (easier to write it than send
> > > links):
> > > > > > > > * Pulsar Client, when used with Protobuf Schema, actually
> > > converts
> > > > > the
> > > > > > > > Protobuf descriptor into an Avro Schema (using code found
> inside
> > > Avro
> > > > > > > > library) and saves that Avro schema as the schema. It's not
> > > saving
> > > > > the
> > > > > > > > protobuf descriptor at all. Very confusing I have to add -
> never
> > > > > expected
> > > > > > > > that.
> > > > > > > > This explains why In the ProtobufSchemaCompatibilityCheck
> they
> > > just
> > > > > > > extend
> > > > > > > > the Avro without doing any translation.
> > > > > > > >
> > > > > > > > Thanks for that.
> > > > > > > >
> > > > > > > > Now thatI finally understand this, I can say that: you *must*
> > > explain
> > > > > > > that
> > > > > > > > in the motivation part in your PIP.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 2. *On the broker side, protobuf and avro both use
> `SchemaData`
> > > > > > > converted
> > > > > > > > > to `org.apache.avro.Schema`.*
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L1280-L1293
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/ProtobufSchemaCompatibilityCheck.java#L26-L31
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/AvroSchemaBasedCompatibilityCheck.java#L47-L70
> > > > > > > >
> > > > > > > >
> > > > > > > > Actually those links don't really help.
> > > > > > > > The main link that helps is:
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> https://github.com/apache/pulsar/blob/ec102fb024a6ea2b195826778300f20e330dff06/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L102-L122
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > I'm sorry - I don't understand.
> > > > > > > > > > I understand the different compatibility check
> strategies.
> > > > > > > > > > If you just spell them out here, then as you say, just
> > > translate
> > > > > the
> > > > > > > > > > Protobuf Descriptor into an Avro schema and run the Avro
> > > > > > > > > > compatibility validation, no?
> > > > > > > > > > I believe the answer is no, since you may want to verify
> > > > > different
> > > > > > > > things
> > > > > > > > > > when it comes to Protobuf, which are different then Avro.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 1.
> > > > > > > > > *ProtobufSchema is different from ProtobufNativeSchema in
> that
> > > it
> > > > > uses
> > > > > > > > > avro-protobuf.*
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> https://central.sonatype.com/artifact/org.apache.avro/avro-protobuf/1.11.1/overview
> > > > > > > > > *ProtobufNativeSchema needs a native compatibility check,
> but
> > > > > there is
> > > > > > > no
> > > > > > > > > official or third party implementation. So this PIP does
> not
> > > use
> > > > > > > > > avro-protobuf for protobuf compatibility checking.*
> > > > > > > > >
> > > > > > > > > 2. *By the way, this is implemented in much the same way
> that
> > > > > Apache
> > > > > > > avro
> > > > > > > > > does compatibility checking.*
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/SchemaValidatorBuilder.java
> > > > > > > > > `canReadStrategy`,`canBeReadStrategy`,`mutualReadStrategy`
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateCanRead.java
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateCanBeRead.java
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateMutualRead.java
> > > > > > > > > *In `ValidateMutualRead.java`, the arguments of
> `canRead()` are
> > > > > > > > > writtenSchema and readSchema. We only need to change the
> order
> > > of
> > > > > > > > arguments
> > > > > > > > > we pass to `canRead()`.*
> > > > > > > > > ```java
> > > > > > > > > private void validateWithStrategy(Descriptors.Descriptor
> > > > > toValidate,
> > > > > > > > > Descriptors.Descriptor fromDescriptor) throws
> > > > > > > > ProtoBufCanReadCheckException
> > > > > > > > > {
> > > > > > > > > switch (strategy) {
> > > > > > > > > case CanReadExistingStrategy -> canRead(fromDescriptor,
> > > > > toValidate);
> > > > > > > > > case CanBeReadByExistingStrategy -> canRead(toValidate,
> > > > > > > fromDescriptor);
> > > > > > > > > case CanBeReadMutualStrategy -> {
> > > > > > > > > canRead(toValidate, fromDescriptor);
> > > > > > > > > canRead(fromDescriptor, toValidate);
> > > > > > > > > }
> > > > > > > > > }
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > private void canRead(Descriptors.Descriptor writtenSchema,
> > > > > > > > > Descriptors.Descriptor readSchema) throws
> > > > > > > ProtoBufCanReadCheckException {
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> ProtobufNativeSchemaBreakCheckUtils.checkSchemaCompatibility(writtenSchema,
> > > > > > > > > readSchema);
> > > > > > > > > }
> > > > > > > > > ```
> > > > > > > > >
> > > > > > > > >
> > > > > > > > I get that you want to take inspiration from the existing
> Avro
> > > Schema
> > > > > > > > compatibility check, to do your code design.
> > > > > > > > I also understand you *won't* use any existing avro code for
> > > that.
> > > > > > > > I also understand, you have to write the validation check on
> your
> > > > > own,
> > > > > > > > since there is no 3rd party to explain that.
> > > > > > > >
> > > > > > > > The only thing I can't understand are the actual rules as I
> wrote
> > > > > before,
> > > > > > > > since they are written confusingly.
> > > > > > > > So, I repeat what I asked before:
> > > > > > > >
> > > > > > > > I think you should structure the validation rules
> differently:
> > > > > > > >
> > > > > > > > * Backward checks
> > > > > > > > ** List down rules, where use newSchema (the schema used by
> > > producer
> > > > > or
> > > > > > > > consumer) and existingSchema (last schema used)
> > > > > > > > * Forward
> > > > > > > > ** List down rules, where use newSchema (the schema used by
> > > producer
> > > > > or
> > > > > > > > consumer) and existingSchema (last schema used)
> > > > > > > >
> > > > > > > > Once that's accomplished I will be able to understand the
> > > different
> > > > > > > > validation rules for each compatibility check.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > sinan
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月1日周三 21:19写道:
> > > > > > > > > >
> > > > > > > > > > On Mon, Feb 27, 2023 at 3:47 PM SiNan Liu <
> > > > > liusinan1...@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > I read it and they look identical. What's the
> difference
> > > > > between
> > > > > > > > > them?
> > > > > > > > > > >
> > > > > > > > > > > Current avro,json, and protobuf schemas are all
> implemented
> > > > > based
> > > > > > > on
> > > > > > > > > AVRO.
> > > > > > > > > > > > What do you mean, they are all implemented based on
> > > Avro? You
> > > > > > > mean
> > > > > > > > > the
> > > > > > > > > > > > protobuf schema is converted into an Avro Schema, and
> > > then
> > > > > you
> > > > > > > use
> > > > > > > > > Avro
> > > > > > > > > > > > compatibility validation?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> `org.apache.pulsar.broker.service.schema.ProtobufSchemaCompatibilityCheck`
> > > > > > > > > > >
> > > > > > > >
> > > > >
> `org.apache.pulsar.broker.service.schema.AvroSchemaCompatibilityCheck`
> > > > > > > > > > >
> > > > > > > >
> > > > >
> `org.apache.pulsar.broker.service.schema.JsonSchemaCompatibilityCheck`
> > > > > > > > > > > They all extends `AvroSchemaBasedCompatibilityCheck`,
> the
> > > > > > > > > > > `checkCompatible()` is the same implementation with
> `AVRO`.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Can you please explain how a Protobuf Schema descriptor
> can
> > > be
> > > > > > > > validated
> > > > > > > > > > for backward compatibility check using Avro based
> > > compatibility
> > > > > > > rules?
> > > > > > > > > > Doesn't it expect the schema to be Avro, but it is
> actually a
> > > > > > > Protobuf
> > > > > > > > > > descriptor?
> > > > > > > > > > Is there some translation happening?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I think you should structure the validation rules
> > > differently:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > The Compatibility check strategy is described on the
> > > website
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> https://pulsar.apache.org/docs/next/schema-understand/#schema-compatibility-check-strategy
> > > > > > > > > > > 1. BACKWARD(CanReadExistingStrategy): Consumers using
> > > schema
> > > > > V3 can
> > > > > > > > > process
> > > > > > > > > > > data written by producers using the last schema version
> > > V2. So
> > > > > V2
> > > > > > > is
> > > > > > > > > > > "writtenSchema" and V3 is "readSchema".
> > > > > > > > > > > 2. FORWARD(CanBeReadByExistingStrategy): Consumers
> using
> > > the
> > > > > last
> > > > > > > > > schema
> > > > > > > > > > > version V2 can process data written by producers using
> a
> > > new
> > > > > schema
> > > > > > > > V3,
> > > > > > > > > > > even though they may not be able to use the full
> > > capabilities
> > > > > of
> > > > > > > the
> > > > > > > > > new
> > > > > > > > > > > schema. So V3 is "writtenSchema" and V2 is
> "readSchema".
> > > > > > > > > > > 3. FULL(CanBeReadMutualStrategy): Schemas are both
> > > backward and
> > > > > > > > forward
> > > > > > > > > > > compatible.
> > > > > > > > > > > Schema can evolve. The old version schema and the new
> > > version
> > > > > > > schema
> > > > > > > > > should
> > > > > > > > > > > be well understood.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > I'm sorry - I don't understand.
> > > > > > > > > > I understand the different compatibility check
> strategies.
> > > > > > > > > > If you just spell them out here, then as you say, just
> > > translate
> > > > > the
> > > > > > > > > > Protobuf Descriptor into an Avro schema and run the Avro
> > > > > > > > > > compatibility validation, no?
> > > > > > > > > > I believe the answer is no, since you may want to verify
> > > > > different
> > > > > > > > things
> > > > > > > > > > when it comes to Protobuf, which are different then Avro.
> > > > > > > > > >
> > > > > > > > > > At the current state, I can't understand your design at
> all.
> > > > > Please
> > > > > > > > help
> > > > > > > > > > clarify that.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > So each strategy should have its own section.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > The arguments of `canRead()` are writtenSchema and
> > > readSchema.
> > > > > As
> > > > > > > > we've
> > > > > > > > > > > just described, we only need to change the order of
> > > arguments
> > > > > we
> > > > > > > pass
> > > > > > > > > to
> > > > > > > > > > > `canRead()`.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > sinan
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年2月27日周一
> 20:49写道:
> > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > And you can see the difference between ProtoBuf and
> > > > > > > > ProtoBufNative:
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > >
> > > > >
> https://pulsar.apache.org/docs/next/schema-get-started/#protobufnative
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > >
> https://pulsar.apache.org/docs/next/schema-get-started/#protobuf
> > > > > > > > > > > > >
> > > > > > > > > > > >  I read it and they look identical. What's the
> difference
> > > > > between
> > > > > > > > > them?
> > > > > > > > > > > >
> > > > > > > > > > > > Current avro,json, and protobuf schemas are all
> > > implemented
> > > > > based
> > > > > > > > on
> > > > > > > > > > > AVRO.
> > > > > > > > > > > >
> > > > > > > > > > > > What do you mean, they are all implemented based on
> > > Avro? You
> > > > > > > mean
> > > > > > > > > the
> > > > > > > > > > > > protobuf schema is converted into an Avro Schema, and
> > > then
> > > > > you
> > > > > > > use
> > > > > > > > > Avro
> > > > > > > > > > > > compatibility validation?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > *Here are the basic compatibility rules we've
> defined:*
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > I think you should structure the validation rules
> > > > > differently:
> > > > > > > > > > > >
> > > > > > > > > > > > * Backward checks
> > > > > > > > > > > > ** List down rules, where use newSchema (the schema
> used
> > > by
> > > > > > > > producer
> > > > > > > > > or
> > > > > > > > > > > > consumer) and existingSchema (last schema used)
> > > > > > > > > > > > * Forward
> > > > > > > > > > > > ** List down rules, where use newSchema (the schema
> used
> > > by
> > > > > > > > producer
> > > > > > > > > or
> > > > > > > > > > > > consumer) and existingSchema (last schema used)
> > > > > > > > > > > >
> > > > > > > > > > > > So each strategy should have its own section.
> > > > > > > > > > > >
> > > > > > > > > > > > I'm saying this since you used "writttenSchema" word
> but
> > > it
> > > > > > > > > represents
> > > > > > > > > > > > something completely different if it's backward or
> > > forward
> > > > > check.
> > > > > > > > > > > >
> > > > > > > > > > > > Once you'll have that structure like that, I
> personally
> > > will
> > > > > be
> > > > > > > > able
> > > > > > > > > to
> > > > > > > > > > > > read and understand it.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > The motivation and problem statement are now good -
> > > thanks
> > > > > for
> > > > > > > > > improving
> > > > > > > > > > > > it.
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Feb 27, 2023 at 8:20 AM SiNan Liu <
> > > > > > > liusinan1...@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi! I updated the PIP issue again. This time I've
> added
> > > > > some
> > > > > > > > > background
> > > > > > > > > > > > and
> > > > > > > > > > > > > some explanations.
> > > > > > > > > > > > >
> > > > > > > > > > > > > The compatibility check rules are already written
> in
> > > the
> > > > > > > > > > > Implementation.
> > > > > > > > > > > > > ProtoBufNative implements the same canRead method
> as
> > > Apache
> > > > > > > Avro.
> > > > > > > > > > > > > It does this by checking whether the schema for
> > > writing and
> > > > > > > > reading
> > > > > > > > > is
> > > > > > > > > > > > > compatible. I also indicate whether the
> writtenSchema
> > > and
> > > > > > > > > readSchema of
> > > > > > > > > > > > the
> > > > > > > > > > > > > Backward, Forward, and Full strategies are the old
> or
> > > the
> > > > > new
> > > > > > > > > version
> > > > > > > > > > > of
> > > > > > > > > > > > > the schema.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > sinan
> > > > > > > > > > > > >
> > > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年2月26日周日
> > > 23:24写道:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > I'm sorry, but this PIP lacks a lot of background
> > > > > knowledge,
> > > > > > > so
> > > > > > > > > you
> > > > > > > > > > > > need
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > add IMO for people to understand it. You don't
> need
> > > to
> > > > > > > explain
> > > > > > > > > the
> > > > > > > > > > > > entire
> > > > > > > > > > > > > > pulsar in this PIP, but at the very least a few
> > > > > paragraphs
> > > > > > > > > detailing
> > > > > > > > > > > > all
> > > > > > > > > > > > > > you need to know, to put you in context:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >    - Start by saying Pulsar as a built-in schema
> > > registry
> > > > > > > > inside
> > > > > > > > > > > Pulsar
> > > > > > > > > > > > > >    broker.
> > > > > > > > > > > > > >       - Every time the client updates the
> schema, it
> > > > > uploads
> > > > > > > it
> > > > > > > > > to
> > > > > > > > > > > the
> > > > > > > > > > > > > >       broker. When that happens, it has a feature
> > > which
> > > > > > > > validates
> > > > > > > > > if
> > > > > > > > > > > > the
> > > > > > > > > > > > > > new
> > > > > > > > > > > > > >       schema version is compatible with the
> previous
> > > > > > > versions.
> > > > > > > > > There
> > > > > > > > > > > > > > are 4 types
> > > > > > > > > > > > > >       of compatibility: Full, ... (complete and
> > > explain
> > > > > each
> > > > > > > > one
> > > > > > > > > > > > briefly)
> > > > > > > > > > > > > >    - Also explain Pulsar Schema registry supports
> > > various
> > > > > > > > schema
> > > > > > > > > > > > > >    protocols:  Avro, protobuf native, ...
> (complete
> > > the
> > > > > > > rest),
> > > > > > > > > each
> > > > > > > > > > > > > > protocol
> > > > > > > > > > > > > >    has a schema which dictates how to serialize
> and
> > > > > > > deserialize
> > > > > > > > > the
> > > > > > > > > > > > > message
> > > > > > > > > > > > > >    content into typed object.
> > > > > > > > > > > > > >    - Explain in short what is protobuf native
> > > (compare
> > > > > > > protobuf
> > > > > > > > > > > > > non-native)
> > > > > > > > > > > > > >    - Please don't paste code instead of
> explaining.
> > > > > > > > > > > > > >       - Explain that protobuf native current
> > > validation
> > > > > check
> > > > > > > > is
> > > > > > > > > only
> > > > > > > > > > > > > >       composed of checking the root message name
> is
> > > the
> > > > > same
> > > > > > > > > between
> > > > > > > > > > > > > > the current
> > > > > > > > > > > > > >       schema version and the new version.
> > > > > > > > > > > > > >          - Explain briefly what is a root message
> > > and its
> > > > > > > name.
> > > > > > > > > > > > > >       - Explain the problem (list scenarios)
> that we
> > > have
> > > > > > > > because
> > > > > > > > > > > > > protobuf
> > > > > > > > > > > > > >       native schema only supports FULL
> compatibility
> > > > > > > > validation.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Regarding high level design - as in what you
> plan to
> > > do.
> > > > > > > > > > > > > > I suggest you add "High Level Design" and in it
> > > detail
> > > > > how
> > > > > > > you
> > > > > > > > > plan
> > > > > > > > > > > to
> > > > > > > > > > > > > > validate, per protobuf version, per compatibility
> > > check
> > > > > > > > > (backward,
> > > > > > > > > > > > > forward,
> > > > > > > > > > > > > > full,...).
> > > > > > > > > > > > > > I tried reading the implementation - for me ,
> it's
> > > all
> > > > > over
> > > > > > > the
> > > > > > > > > > > place.
> > > > > > > > > > > > > Can
> > > > > > > > > > > > > > you please list in order what I wrote above, and
> > > list the
> > > > > > > > > validation
> > > > > > > > > > > > > rules
> > > > > > > > > > > > > > with a good explanation why you validate it like
> > > that?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Lastly, one you have all the validation rules
> clearly
> > > > > stated,
> > > > > > > > you
> > > > > > > > > can
> > > > > > > > > > > > use
> > > > > > > > > > > > > > it to document it properly so users can know what
> > > > > validation
> > > > > > > to
> > > > > > > > > > > expect.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Asaf
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Feb 22, 2023 at 5:10 PM SiNan Liu <
> > > > > > > > > liusinan1...@gmail.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Sorry, my mistake. I removed the code and
> > > described the
> > > > > > > > design
> > > > > > > > > to
> > > > > > > > > > > > > improve
> > > > > > > > > > > > > > > the PROTOBUF_NATIVE schema compatibility
> checks.
> > > You
> > > > > can
> > > > > > > > have a
> > > > > > > > > > > look.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com>
> 于2023年2月22日周三
> > > > > 21:16写道:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I read it but you're almost directly diving
> into
> > > the
> > > > > > > code -
> > > > > > > > > it
> > > > > > > > > > > will
> > > > > > > > > > > > > > take
> > > > > > > > > > > > > > > me
> > > > > > > > > > > > > > > > hours just to reverse engineer your design.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Can you please include a "High Level Design"
> > > section
> > > > > in
> > > > > > > > which
> > > > > > > > > you
> > > > > > > > > > > > > > explain
> > > > > > > > > > > > > > > > how you plan to tackle any issue?
> > > > > > > > > > > > > > > > If I can read that section and explain to
> someone
> > > > > else
> > > > > > > how
> > > > > > > > > this
> > > > > > > > > > > > will
> > > > > > > > > > > > > > > work,
> > > > > > > > > > > > > > > > it means the section is complete.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Let's leave the code to the PRs.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Sun, Feb 19, 2023 at 2:59 PM SiNan Liu <
> > > > > > > > > > > liusinan1...@gmail.com>
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I made a PIP to discuss:
> > > > > > > > > > > > > > https://github.com/apache/pulsar/issues/19565
> > > > > > > > > > > > > > > .
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > We can talk about the current design here.
> > > > > Especially
> > > > > > > for
> > > > > > > > > the
> > > > > > > > > > > > field
> > > > > > > > > > > > > > > type
> > > > > > > > > > > > > > > > > change check rules, please give your
> valuable
> > > > > advice.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > Sinan
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
>

Reply via email to