Hi siNan: >From my point of view, it is just a plug-in. I don't think it is necessary to add configuration for the plugin. This is meaningless, and it will increase the difficulty of use for users.
SiNan Liu <liusinan1...@gmail.com> 于2023年3月8日周三 15:54写道: > > Hi, bo. > > 1. I understand what you say, to develop a new > `ProtobufNativeAdvancedSchemaCompatibilityCheck`, rather than changing > existing `ProtobufNativeSchemaCompatibilityCheck`. But I found a few small > problems: > > (1)ProtobufNativeAdvancedSchemaCompatibilityCheck and > ProtobufNativeSchemaCompatibilityCheck schemaType is PROTOBUF_NATIVE. It > looks like both checkers are PROTOBUF not using AVRO-PROTOBUF's "native" > implementation, which leads to some problems or "unreasonable" and gives me > some extended thinking and questions. > `CompatibilityCheck ` its only a plugin. `ProtobufNativeSchemaCompatibilityCheck` may sooner or later leave the stage, when `ProtobufNativeAdvancedSchemaCompatibilityCheck` is stable, we can make it the default Checker. It is just a plug-in, users can change it at will and ensure that it is used correctly > (2)In broker.conf > > `schemaRegistryCompatibilityCheckers`. If > ProtobufNativeSchemaCompatibilityCheck and > ProtobufNativeAdvancedSchemaCompatibilityCheck all set. This is going to > overwrite each other. Because this is a map: > > https://github.com/apache/pulsar/blob/af1360fb167c1f9484fda5771df3ea9b21d1440b/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/SchemaRegistryService.java#L36-L44 > > ```java > > Map<SchemaType, SchemaCompatibilityCheck> checkers = new HashMap<>(); > > for (String className : checkerClasses) { > > SchemaCompatibilityCheck schemaCompatibilityCheck = > Reflections.createInstance(className, > > SchemaCompatibilityCheck.class, > Thread.currentThread().getContextClassLoader()); > > checkers.put(schemaCompatibilityCheck.getSchemaType(), > schemaCompatibilityCheck); > > ``` > > Is this a big problem or a small one? Is it possible or unnecessary? Maybe > we can write in the documentation that protobufNative checkers can only > choose one of the two? Why are there two Checkers for different > implementations of the same schemaType? Why not the checker to create > different validator, so we don not have to change > schemaRegistryCompatibilityCheckers. users can only use one, not two, which will bring complexity to users > > (3)And after the update to ProtobufNativeAdvancedSchemaCompatibilityCheck. > Existing topics previously only checked the name of the root message, not > the content of protobuf. > > What if the user wants both Checkers? > > Set to ProtobufNativeAdvancedSchemaCompatibilityCheck, affect the topic of > the existing schema? > > Older topics still use the old checker, and newer topics or certain older > topics use the new advancedchecker. > when `ProtobufNativeAdvancedSchemaCompatibilityCheck` stable, users will not choose `ProtobufNativeSchemaCompatibilityCheck`. because it not a complete checker. > (4)So should we have one schemaType for a checker? protobufNativeChecker > can have as many different implementation classes as possible. This > classname configuration in PIP, let's see if it can be set at the topic > level. In the current PIP design I just load this parameter into the > checker when the broker is started and the checkers map is set up. Can I do > this in the new normal pr if I want to support topic level? Or perfect it > here? > > Add a call PROTOBUF_NATIVE_ADVANCE schemaType corresponding > ProtobufNativeAdvancedSchemaCompatibilityCheck? (Seems to be more trouble). > > Sorry I can not use the computer and network in the company, I use my > mobile phone to reply to the email, the format may be a bit messy. Please > understand. > > Thanks, > > sinan > > > 丛搏 <bog...@apache.org> 于 2023年3月7日周二 下午11:39写道: > > > SiNan Liu <liusinan1...@gmail.com> 于2023年3月7日周二 13:22写道: > > > > > > Great to see your comment, bo! > > > > > > 1. The first way. The protobuf website has a description of the rules, > > but > > > no plans to implement them. > > > https://protobuf.dev/programming-guides/proto/#updating > > > > https://groups.google.com/g/protobuf > > maybe ask here > > > > > > > > 2. I think this PIP can be divided into two parts. > > > (1) Add a flag(`ValidatorClassName`), load it into > > > `ProtobufNativeSchemaCompatibilityCheck` when the broker starts. > > > ValidatorClassName is empty by default, and the implementation continues > > as > > > before, with no change for the user. > > > > `ProtobufNativeSchemaCompatibilityCheck` is a plugin in `broker.conf` > > ``` > > > > schemaRegistryCompatibilityCheckers=org.apache.pulsar.broker.service.schema.JsonSchemaCompatibilityCheck,org.apache.pulsar.broker.service.schema.AvroSchemaCompatibilityCheck,org.apache.pulsar.broker.service.schema.ProtobufNativeSchemaCompatibilityCheck > > ``` > > I do not recommend that we directly modify this plugin and continue to > > add configuration items, which will cause trouble for users. > > We have a lot of configs and it's getting very unwieldy. > > in my opinion, we don't change > > > > `org.apache.pulsar.broker.service.schema.ProtobufNativeSchemaCompatibilityCheck`, > > it is a simple implementation, it doesn't go wrong very often, most > > users will use it. we can add another ProtobufNativeCheck named > > `ProtobufNativeAdvancedSchemaCompatibilityCheck ` or other. in this > > way, we don't need to add this flag. There is no need to consider > > compatibility, it is just a plug-in and will not affect current logic. > > If the user needs it, just change the plugin to the new implementation > > > > > ```java > > > ProtobufNativeSchemaValidator DEFAULT = (fromDescriptors, > > toDescriptor) > > > -> { > > > for (Descriptors.Descriptor fromDescriptor : fromDescriptors) { > > > // The default implementation only checks if the root message > > > has changed. > > > if > > > (!fromDescriptor.getFullName().equals(toDescriptor.getFullName())) { > > > throw new ProtoBufCanReadCheckException("Protobuf root > > > message isn't allow change!"); > > > } > > > } > > > }; > > > ``` > > > `ValidatorClassName` value also can be set to the current implementation > > of > > > PIP add > > > > > `org.apache.pulsar.broker.service.schema.validator.ProtobufNativeSchemaBreakValidatorImpl`. > > > > > > (2) Recoding the `ProtobufNativeSchemaCompatibilityCheck`. Through the > > flag > > > (`ValidatorClassName`) to build different > > `ProtobufNativeSchemaValidator`. > > > Isn't it just a plug-in? The user can develop and choose a different > > > `ProtobufNativeSchemaValidator`. I think it didn't change the logic, it > > > just allowed him to expand it. > > > > > > > > > I think this PIP should be an enhancement and supplement to the function, > > > and there is no such thing as unnecessary and meaningless. > > > > > > > > > Thanks, > > > sinan > > > > > > > > > > > > > > > > > > 丛搏 <bog...@apache.org> 于2023年3月7日周二 11:53写道: > > > > > > > I think we have two ways to do that. > > > > > > > > First way: We need to advance the improvement of java in protobuf. Ask > > > > if they have plans to improve. > > > > > > > > Second way: the new PROTOBUF_NATIVE `SchemaCompatibilityCheck` should > > > > be implemented as a plugin, don't change any existing plugin logic > > > > (it's simple and already used). I don't recommend adding flags for > > > > rollback, it adds configuration and makes little sense. > > > > > > > > Thanks, > > > > Bo > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月6日周一 23:00写道: > > > > > > > > > > > > > > Can you convert the code block which is actually a quote in the > > > > > beginning of the PIP to something which doesn't require to scroll > > > > > horizontally so much? > > > > > Use > > > > > > > > > > > https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#quoting-text > > > > > > > > > > Let's improve the clarity of what you wrote: > > > > > > > > > > "the PROTOBUF uses avro struct to store." > > > > > --> > > > > > When Schema type PROTOBUF is used, Pulsar Client assumes the object > > given > > > > > to it as message data is an auto-generated POJO containing the > > > > annotations > > > > > encoding the schema. The client is using a converter, which converts > > a > > > > > Protobuf schema descriptor into an Avro schema and sends that as the > > > > Schema > > > > > of the producer/consumer. > > > > > > > > > > "On the broker side, protobuf and avro both use SchemaData converted > > to > > > > > org.apache.avro.Schema." > > > > > --> > > > > > Since the schema is an Avro schema, the implementation of > > compatibility > > > > > check on the broker side is to simply re-use the compatibility check > > of > > > > the > > > > > AVRO schema type. > > > > > > > > > > "ProtobufSchema is different from ProtobufNativeSchema in schema > > > > > compatibility check it uses avro-protobuf. > > > > > > > > > > > https://central.sonatype.com/artifact/org.apache.avro/avro-protobuf/1.11.1/overview > > > > > But the current implementation of ProtobufNative schema compatibility > > > > > check only > > > > > checked if the root message name is changed." > > > > > > > > > > --> > > > > > PROTOBUF_NATIVE schema type is different. > > > > > The client is actually using Protobuf Descriptor as the schema, as > > > > opposed > > > > > to Avro schema of PROTOBUF schema type. In the broker, the > > > > PROTOBUF_NATIVE > > > > > compatibility check actually hasn't implemented any rule, besides > > one: > > > > > checking if the root message name has changed. > > > > > > > > > > > > > > > > > > > > > 1. For now, there is no official or third-party solution for > > > > ProtoBuf > > > > > > compatibility. If in the future have better solutions of a third > > > > party or > > > > > > the official, we develop new ProtobufNativeSchemaValidator and > > use, > > > > so > > > > > > add a flag. > > > > > > > > > > > > Who do you need to make that configurable? Once you found a third > > > > party, > > > > > just switch to it? Who knows, maybe you never will. Introduce it > > when you > > > > > find it, not now. > > > > > > > > > > > > > > > We improve in ProtobufNativeSchemaCompatibilityCheck BACKWARD, > > FORWARD > > > > > > these strategies. As with the AVRO implementation, protobuf > > > > compatibility > > > > > > checking need implementing the canRead method. *This will check > > that > > > > > > the writtenschema can be read by readSchema.* > > > > > > > > > > > > > > > I completely disagree. > > > > > Avro implementation is confusing for our use case. Don't copy that. > > > > > > > > > > You have > > > > > > > > > > public void checkCompatible(SchemaData from, SchemaData to, > > > > > SchemaCompatibilityStrategy strategy) > > > > > throws IncompatibleSchemaException { > > > > > Descriptor fromDescriptor = > > > > > ProtobufNativeSchemaUtils.deserialize(from.getData()); > > > > > Descriptor toDescriptor = > > > > > ProtobufNativeSchemaUtils.deserialize(to.getData()); > > > > > switch (strategy) { > > > > > case BACKWARD_TRANSITIVE: > > > > > case BACKWARD: > > > > > case FORWARD_TRANSITIVE: > > > > > case FORWARD: > > > > > case FULL_TRANSITIVE: > > > > > case FULL: > > > > > checkRootMessageChange(fromDescriptor, toDescriptor, > > > > strategy); > > > > > return; > > > > > case ALWAYS_COMPATIBLE: > > > > > return; > > > > > default: > > > > > throw new IncompatibleSchemaException("Unknown > > > > > SchemaCompatibilityStrategy."); > > > > > } > > > > > } > > > > > > > > > > I would rename : > > > > > from --> currentSchema > > > > > to --> newSchema > > > > > > > > > > Use that switch case and have a method for each like: > > > > > validateBackwardsCompatibility(currentSchema, newSchema) > > > > > > > > > > I dislike canRead and usage of writtenSchema, since you have two > > > > completely > > > > > different use cases: from the producing side and the consumer side. > > > > > > > > > > schemaValidatorBuilder > > > > > > > > > > > > I dislike this proposal. IMO Avro implementation is way too > > > > complicated. > > > > > Why not have a simple function for validation for each switch case > > above? > > > > > Why do we need strategy and builder, and all this complexity? > > > > > > > > > > > > > > > *Here are the basic compatibility rules we've defined:* > > > > > > > > > > > > > > > IMO it's impossible to read the validation rules as you described > > them. > > > > > I wrote how they should be structured numerous times above. > > > > > I can't validate them. > > > > > > > > > > > > > > > IMO, the current design is very hard to read. > > > > > Please try to avoid jumping into code sections. > > > > > Write a high level design section, in which you describe in words > > what > > > > you > > > > > plan to do. > > > > > Write the validation rules in the structure that is easy to > > understand: > > > > > rules per each compatibility check, and use proper words (current > > schema, > > > > > new schema), since new schema can be once used for read and once > > used for > > > > > write. > > > > > > > > > > In its current form it takes too much time to understand the design, > > and > > > > it > > > > > shouldn't be the case. > > > > > > > > > > Thanks, > > > > > > > > > > Asaf > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Sun, Mar 5, 2023 at 3:58 PM SiNan Liu <liusinan1...@gmail.com> > > wrote: > > > > > > > > > > > Hi! I updated the explanation of some things in the PIP issue. And > > also > > > > > > added a new “flag” in the conf is used as the different > > > > > > ProtobufNativeSchemaValidator implementation, also set > > > > > > ProtobufNativeSchemaValidator default only check whether the name > > of > > > > the > > > > > > root message is the same. > > > > > > > > > > > > > > > > > > Thanks, > > > > > > sinan > > > > > > > > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月5日周日 20:21写道: > > > > > > > > > > > > > On Wed, Mar 1, 2023 at 4:33 PM SiNan Liu <liusinan1...@gmail.com > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > Can you please explain how a Protobuf Schema descriptor can > > be > > > > > > > validated > > > > > > > > > for backward compatibility check using Avro based > > compatibility > > > > > > rules? > > > > > > > > > Doesn't it expect the schema to be Avro, but it is actually a > > > > > > Protobuf > > > > > > > > > descriptor? > > > > > > > > > Is there some translation happening? > > > > > > > > > > > > > > > > > > > > > > > > 1. *You can take a quick look at the previous design, the > > PROTOBUF > > > > uses > > > > > > > > avro struct to store.* > > > > > > > > https://github.com/apache/pulsar/pull/1954 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L59-L61 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L110-L115 > > > > > > > > > > > > > > > > > > > > > Ok. So to summarize your code (easier to write it than send > > links): > > > > > > > * Pulsar Client, when used with Protobuf Schema, actually > > converts > > > > the > > > > > > > Protobuf descriptor into an Avro Schema (using code found inside > > Avro > > > > > > > library) and saves that Avro schema as the schema. It's not > > saving > > > > the > > > > > > > protobuf descriptor at all. Very confusing I have to add - never > > > > expected > > > > > > > that. > > > > > > > This explains why In the ProtobufSchemaCompatibilityCheck they > > just > > > > > > extend > > > > > > > the Avro without doing any translation. > > > > > > > > > > > > > > Thanks for that. > > > > > > > > > > > > > > Now thatI finally understand this, I can say that: you *must* > > explain > > > > > > that > > > > > > > in the motivation part in your PIP. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. *On the broker side, protobuf and avro both use `SchemaData` > > > > > > converted > > > > > > > > to `org.apache.avro.Schema`.* > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L1280-L1293 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/ProtobufSchemaCompatibilityCheck.java#L26-L31 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/AvroSchemaBasedCompatibilityCheck.java#L47-L70 > > > > > > > > > > > > > > > > > > > > > Actually those links don't really help. > > > > > > > The main link that helps is: > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/pulsar/blob/ec102fb024a6ea2b195826778300f20e330dff06/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L102-L122 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I'm sorry - I don't understand. > > > > > > > > > I understand the different compatibility check strategies. > > > > > > > > > If you just spell them out here, then as you say, just > > translate > > > > the > > > > > > > > > Protobuf Descriptor into an Avro schema and run the Avro > > > > > > > > > compatibility validation, no? > > > > > > > > > I believe the answer is no, since you may want to verify > > > > different > > > > > > > things > > > > > > > > > when it comes to Protobuf, which are different then Avro. > > > > > > > > > > > > > > > > > > > > > > > > 1. > > > > > > > > *ProtobufSchema is different from ProtobufNativeSchema in that > > it > > > > uses > > > > > > > > avro-protobuf.* > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://central.sonatype.com/artifact/org.apache.avro/avro-protobuf/1.11.1/overview > > > > > > > > *ProtobufNativeSchema needs a native compatibility check, but > > > > there is > > > > > > no > > > > > > > > official or third party implementation. So this PIP does not > > use > > > > > > > > avro-protobuf for protobuf compatibility checking.* > > > > > > > > > > > > > > > > 2. *By the way, this is implemented in much the same way that > > > > Apache > > > > > > avro > > > > > > > > does compatibility checking.* > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/SchemaValidatorBuilder.java > > > > > > > > `canReadStrategy`,`canBeReadStrategy`,`mutualReadStrategy` > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateCanRead.java > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateCanBeRead.java > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateMutualRead.java > > > > > > > > *In `ValidateMutualRead.java`, the arguments of `canRead()` are > > > > > > > > writtenSchema and readSchema. We only need to change the order > > of > > > > > > > arguments > > > > > > > > we pass to `canRead()`.* > > > > > > > > ```java > > > > > > > > private void validateWithStrategy(Descriptors.Descriptor > > > > toValidate, > > > > > > > > Descriptors.Descriptor fromDescriptor) throws > > > > > > > ProtoBufCanReadCheckException > > > > > > > > { > > > > > > > > switch (strategy) { > > > > > > > > case CanReadExistingStrategy -> canRead(fromDescriptor, > > > > toValidate); > > > > > > > > case CanBeReadByExistingStrategy -> canRead(toValidate, > > > > > > fromDescriptor); > > > > > > > > case CanBeReadMutualStrategy -> { > > > > > > > > canRead(toValidate, fromDescriptor); > > > > > > > > canRead(fromDescriptor, toValidate); > > > > > > > > } > > > > > > > > } > > > > > > > > } > > > > > > > > > > > > > > > > private void canRead(Descriptors.Descriptor writtenSchema, > > > > > > > > Descriptors.Descriptor readSchema) throws > > > > > > ProtoBufCanReadCheckException { > > > > > > > > > > > > > > > > > > > > > > > > > > > ProtobufNativeSchemaBreakCheckUtils.checkSchemaCompatibility(writtenSchema, > > > > > > > > readSchema); > > > > > > > > } > > > > > > > > ``` > > > > > > > > > > > > > > > > > > > > > > > I get that you want to take inspiration from the existing Avro > > Schema > > > > > > > compatibility check, to do your code design. > > > > > > > I also understand you *won't* use any existing avro code for > > that. > > > > > > > I also understand, you have to write the validation check on your > > > > own, > > > > > > > since there is no 3rd party to explain that. > > > > > > > > > > > > > > The only thing I can't understand are the actual rules as I wrote > > > > before, > > > > > > > since they are written confusingly. > > > > > > > So, I repeat what I asked before: > > > > > > > > > > > > > > I think you should structure the validation rules differently: > > > > > > > > > > > > > > * Backward checks > > > > > > > ** List down rules, where use newSchema (the schema used by > > producer > > > > or > > > > > > > consumer) and existingSchema (last schema used) > > > > > > > * Forward > > > > > > > ** List down rules, where use newSchema (the schema used by > > producer > > > > or > > > > > > > consumer) and existingSchema (last schema used) > > > > > > > > > > > > > > Once that's accomplished I will be able to understand the > > different > > > > > > > validation rules for each compatibility check. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > sinan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月1日周三 21:19写道: > > > > > > > > > > > > > > > > > > On Mon, Feb 27, 2023 at 3:47 PM SiNan Liu < > > > > liusinan1...@gmail.com> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I read it and they look identical. What's the difference > > > > between > > > > > > > > them? > > > > > > > > > > > > > > > > > > > > Current avro,json, and protobuf schemas are all implemented > > > > based > > > > > > on > > > > > > > > AVRO. > > > > > > > > > > > What do you mean, they are all implemented based on > > Avro? You > > > > > > mean > > > > > > > > the > > > > > > > > > > > protobuf schema is converted into an Avro Schema, and > > then > > > > you > > > > > > use > > > > > > > > Avro > > > > > > > > > > > compatibility validation? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > `org.apache.pulsar.broker.service.schema.ProtobufSchemaCompatibilityCheck` > > > > > > > > > > > > > > > > > > > > > `org.apache.pulsar.broker.service.schema.AvroSchemaCompatibilityCheck` > > > > > > > > > > > > > > > > > > > > > `org.apache.pulsar.broker.service.schema.JsonSchemaCompatibilityCheck` > > > > > > > > > > They all extends `AvroSchemaBasedCompatibilityCheck`, the > > > > > > > > > > `checkCompatible()` is the same implementation with `AVRO`. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Can you please explain how a Protobuf Schema descriptor can > > be > > > > > > > validated > > > > > > > > > for backward compatibility check using Avro based > > compatibility > > > > > > rules? > > > > > > > > > Doesn't it expect the schema to be Avro, but it is actually a > > > > > > Protobuf > > > > > > > > > descriptor? > > > > > > > > > Is there some translation happening? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I think you should structure the validation rules > > differently: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The Compatibility check strategy is described on the > > website > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://pulsar.apache.org/docs/next/schema-understand/#schema-compatibility-check-strategy > > > > > > > > > > 1. BACKWARD(CanReadExistingStrategy): Consumers using > > schema > > > > V3 can > > > > > > > > process > > > > > > > > > > data written by producers using the last schema version > > V2. So > > > > V2 > > > > > > is > > > > > > > > > > "writtenSchema" and V3 is "readSchema". > > > > > > > > > > 2. FORWARD(CanBeReadByExistingStrategy): Consumers using > > the > > > > last > > > > > > > > schema > > > > > > > > > > version V2 can process data written by producers using a > > new > > > > schema > > > > > > > V3, > > > > > > > > > > even though they may not be able to use the full > > capabilities > > > > of > > > > > > the > > > > > > > > new > > > > > > > > > > schema. So V3 is "writtenSchema" and V2 is "readSchema". > > > > > > > > > > 3. FULL(CanBeReadMutualStrategy): Schemas are both > > backward and > > > > > > > forward > > > > > > > > > > compatible. > > > > > > > > > > Schema can evolve. The old version schema and the new > > version > > > > > > schema > > > > > > > > should > > > > > > > > > > be well understood. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I'm sorry - I don't understand. > > > > > > > > > I understand the different compatibility check strategies. > > > > > > > > > If you just spell them out here, then as you say, just > > translate > > > > the > > > > > > > > > Protobuf Descriptor into an Avro schema and run the Avro > > > > > > > > > compatibility validation, no? > > > > > > > > > I believe the answer is no, since you may want to verify > > > > different > > > > > > > things > > > > > > > > > when it comes to Protobuf, which are different then Avro. > > > > > > > > > > > > > > > > > > At the current state, I can't understand your design at all. > > > > Please > > > > > > > help > > > > > > > > > clarify that. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > So each strategy should have its own section. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The arguments of `canRead()` are writtenSchema and > > readSchema. > > > > As > > > > > > > we've > > > > > > > > > > just described, we only need to change the order of > > arguments > > > > we > > > > > > pass > > > > > > > > to > > > > > > > > > > `canRead()`. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > sinan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年2月27日周一 20:49写道: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > And you can see the difference between ProtoBuf and > > > > > > > ProtoBufNative: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://pulsar.apache.org/docs/next/schema-get-started/#protobufnative > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://pulsar.apache.org/docs/next/schema-get-started/#protobuf > > > > > > > > > > > > > > > > > > > > > > > I read it and they look identical. What's the difference > > > > between > > > > > > > > them? > > > > > > > > > > > > > > > > > > > > > > Current avro,json, and protobuf schemas are all > > implemented > > > > based > > > > > > > on > > > > > > > > > > AVRO. > > > > > > > > > > > > > > > > > > > > > > What do you mean, they are all implemented based on > > Avro? You > > > > > > mean > > > > > > > > the > > > > > > > > > > > protobuf schema is converted into an Avro Schema, and > > then > > > > you > > > > > > use > > > > > > > > Avro > > > > > > > > > > > compatibility validation? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > *Here are the basic compatibility rules we've defined:* > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I think you should structure the validation rules > > > > differently: > > > > > > > > > > > > > > > > > > > > > > * Backward checks > > > > > > > > > > > ** List down rules, where use newSchema (the schema used > > by > > > > > > > producer > > > > > > > > or > > > > > > > > > > > consumer) and existingSchema (last schema used) > > > > > > > > > > > * Forward > > > > > > > > > > > ** List down rules, where use newSchema (the schema used > > by > > > > > > > producer > > > > > > > > or > > > > > > > > > > > consumer) and existingSchema (last schema used) > > > > > > > > > > > > > > > > > > > > > > So each strategy should have its own section. > > > > > > > > > > > > > > > > > > > > > > I'm saying this since you used "writttenSchema" word but > > it > > > > > > > > represents > > > > > > > > > > > something completely different if it's backward or > > forward > > > > check. > > > > > > > > > > > > > > > > > > > > > > Once you'll have that structure like that, I personally > > will > > > > be > > > > > > > able > > > > > > > > to > > > > > > > > > > > read and understand it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The motivation and problem statement are now good - > > thanks > > > > for > > > > > > > > improving > > > > > > > > > > > it. > > > > > > > > > > > > > > > > > > > > > > On Mon, Feb 27, 2023 at 8:20 AM SiNan Liu < > > > > > > liusinan1...@gmail.com> > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > Hi! I updated the PIP issue again. This time I've added > > > > some > > > > > > > > background > > > > > > > > > > > and > > > > > > > > > > > > some explanations. > > > > > > > > > > > > > > > > > > > > > > > > The compatibility check rules are already written in > > the > > > > > > > > > > Implementation. > > > > > > > > > > > > ProtoBufNative implements the same canRead method as > > Apache > > > > > > Avro. > > > > > > > > > > > > It does this by checking whether the schema for > > writing and > > > > > > > reading > > > > > > > > is > > > > > > > > > > > > compatible. I also indicate whether the writtenSchema > > and > > > > > > > > readSchema of > > > > > > > > > > > the > > > > > > > > > > > > Backward, Forward, and Full strategies are the old or > > the > > > > new > > > > > > > > version > > > > > > > > > > of > > > > > > > > > > > > the schema. > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > sinan > > > > > > > > > > > > > > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年2月26日周日 > > 23:24写道: > > > > > > > > > > > > > > > > > > > > > > > > > I'm sorry, but this PIP lacks a lot of background > > > > knowledge, > > > > > > so > > > > > > > > you > > > > > > > > > > > need > > > > > > > > > > > > to > > > > > > > > > > > > > add IMO for people to understand it. You don't need > > to > > > > > > explain > > > > > > > > the > > > > > > > > > > > entire > > > > > > > > > > > > > pulsar in this PIP, but at the very least a few > > > > paragraphs > > > > > > > > detailing > > > > > > > > > > > all > > > > > > > > > > > > > you need to know, to put you in context: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Start by saying Pulsar as a built-in schema > > registry > > > > > > > inside > > > > > > > > > > Pulsar > > > > > > > > > > > > > broker. > > > > > > > > > > > > > - Every time the client updates the schema, it > > > > uploads > > > > > > it > > > > > > > > to > > > > > > > > > > the > > > > > > > > > > > > > broker. When that happens, it has a feature > > which > > > > > > > validates > > > > > > > > if > > > > > > > > > > > the > > > > > > > > > > > > > new > > > > > > > > > > > > > schema version is compatible with the previous > > > > > > versions. > > > > > > > > There > > > > > > > > > > > > > are 4 types > > > > > > > > > > > > > of compatibility: Full, ... (complete and > > explain > > > > each > > > > > > > one > > > > > > > > > > > briefly) > > > > > > > > > > > > > - Also explain Pulsar Schema registry supports > > various > > > > > > > schema > > > > > > > > > > > > > protocols: Avro, protobuf native, ... (complete > > the > > > > > > rest), > > > > > > > > each > > > > > > > > > > > > > protocol > > > > > > > > > > > > > has a schema which dictates how to serialize and > > > > > > deserialize > > > > > > > > the > > > > > > > > > > > > message > > > > > > > > > > > > > content into typed object. > > > > > > > > > > > > > - Explain in short what is protobuf native > > (compare > > > > > > protobuf > > > > > > > > > > > > non-native) > > > > > > > > > > > > > - Please don't paste code instead of explaining. > > > > > > > > > > > > > - Explain that protobuf native current > > validation > > > > check > > > > > > > is > > > > > > > > only > > > > > > > > > > > > > composed of checking the root message name is > > the > > > > same > > > > > > > > between > > > > > > > > > > > > > the current > > > > > > > > > > > > > schema version and the new version. > > > > > > > > > > > > > - Explain briefly what is a root message > > and its > > > > > > name. > > > > > > > > > > > > > - Explain the problem (list scenarios) that we > > have > > > > > > > because > > > > > > > > > > > > protobuf > > > > > > > > > > > > > native schema only supports FULL compatibility > > > > > > > validation. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Regarding high level design - as in what you plan to > > do. > > > > > > > > > > > > > I suggest you add "High Level Design" and in it > > detail > > > > how > > > > > > you > > > > > > > > plan > > > > > > > > > > to > > > > > > > > > > > > > validate, per protobuf version, per compatibility > > check > > > > > > > > (backward, > > > > > > > > > > > > forward, > > > > > > > > > > > > > full,...). > > > > > > > > > > > > > I tried reading the implementation - for me , it's > > all > > > > over > > > > > > the > > > > > > > > > > place. > > > > > > > > > > > > Can > > > > > > > > > > > > > you please list in order what I wrote above, and > > list the > > > > > > > > validation > > > > > > > > > > > > rules > > > > > > > > > > > > > with a good explanation why you validate it like > > that? > > > > > > > > > > > > > > > > > > > > > > > > > > Lastly, one you have all the validation rules clearly > > > > stated, > > > > > > > you > > > > > > > > can > > > > > > > > > > > use > > > > > > > > > > > > > it to document it properly so users can know what > > > > validation > > > > > > to > > > > > > > > > > expect. > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > > Asaf > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Feb 22, 2023 at 5:10 PM SiNan Liu < > > > > > > > > liusinan1...@gmail.com> > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > Sorry, my mistake. I removed the code and > > described the > > > > > > > design > > > > > > > > to > > > > > > > > > > > > improve > > > > > > > > > > > > > > the PROTOBUF_NATIVE schema compatibility checks. > > You > > > > can > > > > > > > have a > > > > > > > > > > look. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年2月22日周三 > > > > 21:16写道: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I read it but you're almost directly diving into > > the > > > > > > code - > > > > > > > > it > > > > > > > > > > will > > > > > > > > > > > > > take > > > > > > > > > > > > > > me > > > > > > > > > > > > > > > hours just to reverse engineer your design. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Can you please include a "High Level Design" > > section > > > > in > > > > > > > which > > > > > > > > you > > > > > > > > > > > > > explain > > > > > > > > > > > > > > > how you plan to tackle any issue? > > > > > > > > > > > > > > > If I can read that section and explain to someone > > > > else > > > > > > how > > > > > > > > this > > > > > > > > > > > will > > > > > > > > > > > > > > work, > > > > > > > > > > > > > > > it means the section is complete. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Let's leave the code to the PRs. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Sun, Feb 19, 2023 at 2:59 PM SiNan Liu < > > > > > > > > > > liusinan1...@gmail.com> > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi all, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I made a PIP to discuss: > > > > > > > > > > > > > https://github.com/apache/pulsar/issues/19565 > > > > > > > > > > > > > > . > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We can talk about the current design here. > > > > Especially > > > > > > for > > > > > > > > the > > > > > > > > > > > field > > > > > > > > > > > > > > type > > > > > > > > > > > > > > > > change check rules, please give your valuable > > > > advice. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > Sinan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >