Hi, all. Are there any other additions and modifications to this PIP?
Thanks, sinan SiNan Liu <liusinan1...@gmail.com> 于2023年3月26日周日 14:25写道: > *Hi, Asaf! This week I in the research about `SchemaRegistryServiceImpl` > optimization, sorry to reply later.* > > 1. > >> Well, this is exactly the argument we had in this thread. This is why you >> make 2 implementation classes ( I called them v1, v2 if you remember). The >> user can choose between them. This is exactly while I liked the _v1 _v2 >> naming, so they will know they are not backward compatible. >> Also, you're talking about improving right? Checking the message name is >> outright wrong, so you're actually fixing a big mistake someone made back >> then. > > > This is just a very simple check the root message name of the two proto, > does it really matter? > This PIP is a more stringent check, so the previous implementation rules > for checking the root message are also needed. > This is a very small problem, and I don't think it's necessary to go into > such a discussion. > > > > 2. > >> Your link to Stackoverflow will fail *my* validation: Since `int foo = 1` >> --> `int foo = 2;` : same name, same type, different field number --> >> fail >> validation. >> So it doesn't prove anything, or help this argument. > > > If I have in readSchema >> int32 justAnID = 1 >> int32 customerId = 2 >> and in writeSchema I have >> int32 justAnID = 1 >> string customerId = 3 >> This is valid. >> You will fail this validation check, since customerId has different field >> numbers (2 and 3), but I decided to remove field number 2 and add a new >> field with the same name, different type, with field number 3. When I >> read, >> field number 2 will get default value of 0, and ignore field number 3. >> Valid. >> If you compare by field type, then `int32 customerId = 2` to `int32 >> customerId = 3`, would fail, since that looks super strange right? Why >> remove and add the same field exactly? > > > If there exists a field named customerId with number 2 in readSchema and > there also exists a field named customerId with number 3 in writeSchema, > then the two schemas are incompatible. > Because when reading data encoded with writeSchema, the reader will try to > get the customerId with number 2, but it's actually number 3, which will > cause the read to fail. > > > > (4) The writtenSchema can not change the field name of any field in >> > > > readSchema (the field number is the same, but the field name is >> > > > different). >> > > This is incorrect. >> > > Fields names are encoded into the wire. I don't see this in any best >> > > practice. > > > If the writer and reader use the same field number, Protobuf will > correctly serialize and deserialize the data, even if the field names are > different, because serialization and deserialization use the field number > to determine the field. > *So you're right. We can rename it.* > > *Name renaming of the same number is compatible. * > *However, the numbering of the fields is different, which will result in a > mismatch between the field number when the message is written and the field > number when the message is read, * > *resulting in the data not being parsed correctly when the message is > read.* > > > 3. > >> > > > - The writtenSchema cannot add required fields, but optional or >> > > > duplicate fields can be added (The field number must be new). >> > > > >> > > > That's not true. >> > > You can have a required field in writeSchema, and not have that >> field in >> > > readSchema (based on tag number). >> > >> > >> > The required field must exist. In your case, where does readSchema go >> to >> > read its required fields? It's not in writtenSchema. >> > The second and third sentences in the official website say, add and >> delete >> > do not operate required field! > > > Just like on the official website about why not add the required field. > You can check it out. > > https://protobuf.dev/programming-guides/dos-donts/#dont-add-a-required-field > > Required field to be so harmful that it's being removed. > > > 4. > >> > > So if my write schema is >> > > message SearchRequest { required string query = 1; optional int32 >> > > page_number = 2; optional int32 result_per_page = 3;} >> > > and my read schema is >> > > message SearchRequest { optional int32 page_number = 2; optional >> > > int32 result_per_page = 3;} >> > > You can see I removed a field from writeSchema that does not have a >> > default >> > > value. >> > > First, I can read it without any problems: Field 2 might exists, no >> > > problem. Field 3 might exists, no problem. I ignore field 1. >> > > So the validation you wrote is no correct at all, without any regard >> to >> > > optional value. >> > >> > >> > *It looks like you've added a required field here, which is not >> allowed (in >> > 3 i am said that).* >> > *It also uses the number 1 of the deleted field.* >> > >> > I removed the required field "required string query = 1" from >> writeSchema >> compared with readSchema. >> Of course you can say I added "required string query = 1" to writeSchema >> compared with readSchema. >> The main question: can I use readSchema to read message written by >> writeSchema? >> Yes I can. >> I wrote why above. >> I didn't use the number of 1 of deleted field. The read schema is using >> number 2 and 3. > > > This is a problem with 3 and does not allow the required field to be added. > > > 5. > >> > What if I do the other way around? >> > > My write schema is: >> > > message SearchRequest { optional int32 page_number = 2; optional >> > > int32 result_per_page = 3;} >> > > My read schema is: >> > > message SearchRequest { required string query = 1; optional int32 >> > > page_number = 2; optional int32 result_per_page = 3;} >> > > Here I will fail, since it is required I will provide value for >> field 1 >> > but >> > > alas it will never be there. >> > >> > >> > Yes, the required field has been removed, which is incompatible. >> > The check for changes to the required field starts at the beginning, at >> > which point an incompatible exception has been thrown and the following >> > compatibility check is not performed. >> > >> You can't say "The required field " - it's plain wrong. >> Look at your rule: >> The writtenSchema cannot remove required fields in the readSchema. >> You can say: the writeSchema has removed a required field compared with >> readSchema. >> This rule will fail the validation. >> *not* the default rule we're discussing. > > > Here we are talking about changing the default value. But in this case the > writtenSchema removes the required field so it is not compatible. In > implementation, the rules check are in order. Changes to the required field > are checked at the beginning. So it is not necessary to discuss changes to > the required field until the default value is changed. > > > 6. > >> > *It looks like sixth item in PIP should be removed.* >> > *And Rule 7 in PIP should be removed:* >> > >> > Delete rule 6. >> I would keep rule 7. See: >> >> https://protobuf.dev/programming-guides/dos-donts/#dont-change-the-default-value-of-a-field >> It sounds like a good best practice. We can omit it if we don't want to be >> overly strict. > > > I think we should really keep rule 7(In PIP issue, it is `(5)` now). > Follow the instructions on the website. > ``` > (7) There can not be a field which exists both in `readSchema` and > `writtenSchema`, with same field number, having different default values. > **This rule applies to proto2. But in proto3, the default value cannot be > set by yourself! If you use proto3, it will not check for this rule** > ``` > By the way, changes to the required field are not allowed in `3` and > should be followed as well. > > > 7. > >> > > Changing a default value is generally OK, as long as you remember >> that >> > > default values are never sent over the wire. >> > > Thus, if a program receives a message in which a particular field >> isn’t >> > > set, the program will see the default value as it was defined in that >> > > program’s version of the protocol. >> > > It will NOT see the default value that was defined in the sender’s >> code. >> > >> > This is in the official document 11: >> > https://protobuf.dev/programming-guides/proto2/#updating >> > >> > >> > >> So what do you wish to state with this quote? > > > Oh, rule 7 has been retained. Let's ignore this. > > > > *I also took into account that proto2 was a really problematic version. > And the pulsar website seems to say that only proto3 is used.* > https://pulsar.apache.org/docs/next/schema-understand/#struct-schema > *```* > *ProtobufNativeSchema is based on protobuf native descriptor.* > *This allows Pulsar to:* > *- use native protobuf-v3 to serialize or deserialize data.* > *- use AutoConsume to deserialize data.* > *```* > *In my opinion, there are some differences between proto2 and proto3 check > rules, and this PIP also supports proto2. So for this PIP, we may need to > change the description of "ProtoBufNative" on the official website as well.* > > > > Thanks, > sinan > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月20日周一 17:46写道: > >> On Sun, Mar 19, 2023 at 4:47 PM SiNan Liu <liusinan1...@gmail.com> wrote: >> >> > 1. >> > >> > > message SearchReq { string query = 1; int32 page_number = 2; int32 >> > > result_per_page = 3;} >> > > Then second version I use: >> > > message SearchRequest { string query = 1; int32 page_number = 2; >> > > int32 result_per_page = 3;} >> > >> > >> > The rule in PIP improve the previous implementation, so the previous >> > implementation needs to be added. >> > If the user switches from the previous implementation to the current >> > implementation of PIP, and the current implementation doesn't check for >> > root message name changes, isn't that changing the old behavior? This >> PIP >> > is to make compatibility checking more stringent. >> > >> > >> Well, this is exactly the argument we had in this thread. This is why you >> make 2 implementation classes ( I called them v1, v2 if you remember). The >> user can choose between them. This is exactly while I liked the _v1 _v2 >> naming, so they will know they are not backward compatible. >> >> Also, you're talking about improving right? Checking the message name is >> outright wrong, so you're actually fixing a big mistake someone made back >> then. >> >> >> > >> > 2. >> > >> > > This is a guideline for a human making the change, not for software >> > > validation right? >> > > When you write down the code for doing the comparison to know if the >> > field >> > > number has changed, you have to take into account both the field name >> AND >> > > field type. If both are equal but the field number is different then >> fail >> > > it. >> > >> > >> > I use the field name here to match, the name is the same, but the >> number is >> > not the same will be incompatible. >> > >> > >> https://github.com/apache/pulsar/pull/19566/files#diff-14b840259375cc8fcae564586fc1c2a188cb71822ab5995ed6807f7dab30c210R129-R132 >> > PIP description here is the rule, but how to implement it is a matter in >> > PR. I have considered the example you gave here, which is also >> incompatible >> > in this case. >> > >> > Your changes are also described here: >> > >> > >> https://stackoverflow.com/questions/65230623/safeness-of-changing-proto-field-number >> > >> > >> This sentence "the PIP description here is the rule, but how to >> implemented >> it is a matter in PR" is a completely invalid argument. >> For compatibility rules you specify the exact validation. >> We argue here about the rule. >> You can say in the PIP you will match by name and then in PR match by name >> and type. You have to be exact in the PIP. >> >> Regarding "I have considered the example you gave here, which is also >> incompatible" - please explain? I find it compatible - meaning you can >> read >> using the readSchema message written using writeSchema. >> >> Your link to Stackoverflow will fail *my* validation: Since `int foo = 1` >> --> `int foo = 2;` : same name, same type, different field number --> fail >> validation. >> So it doesn't prove anything, or help this argument. >> >> >> >> >> >> > >> > 3. >> > >> > > If you'll read sentence 2 from the guide, you'll see they write >> > > "This means that any messages serialized by code using your “old” >> message >> > > format can be parsed by your new generated code, as they won’t be >> missing >> > > any required elements." >> > > This is for the *other side*: when you add a required to readSchema >> > > compare >> > > to write schema. This of course is not allowed and won't work. >> > >> > >> > Note that in proto2: **Any new fields that you add should be optional or >> > repeated.** >> > https://protobuf.dev/programming-guides/proto2/#updating >> > >> > I also stated in PIP that proto3 will not check this rule for required >> > field changes. >> > This rule applies to proto2. However, proto3 removes required. If you >> use >> > proto3, it will not check for changes to the required field >> > >> > I'm talking about proto2. >> >> Listen. >> >> This is what you wrote in the PIP, ok? >> >> >> > - The writtenSchema cannot add required fields, but optional or >> > duplicate fields can be added (The field number must be new). >> > >> > I answered: >> >> That's not true. >> > You can have a required field in writeSchema, and not have that field in >> > readSchema (based on tag number). >> >> I also gave you an example showing what you wrote is not true (see >> previous >> mails), in which I add a required field to writeSchema compared with >> readSchema and I can still read it with readSchema. >> >> You mention a link to a guideline saying you can only add optional or >> repeated without reading the rest of the paragraph and without >> understanding the protocol buffer encoding and decoding. >> >> I'm not sure how to move forward here. >> This validation rule is invalid. >> >> >> >> >> > >> > 4. >> > >> > > My write schema is >> > > message SearchRequest { string query = 1; int32 page_number = 2; >> > > int32 result_per_page = 3;} >> > > my read schema is >> > > message SearchRequest { string query = 1; int32 page_number = 2; >> > > int32 resultPerPage = 3;} >> > > - Non-required fields can be removed, as long as the field number is >> not >> > > used again in your updated message type. You may want to rename the >> field >> > > instead, perhaps adding the prefix “OBSOLETE_”, or make the field >> number >> > > reserved <https://protobuf.dev/programming-guides/proto2/#reserved>, >> so >> > > that future users of your .proto can’t accidentally reuse the number. >> > > First, they are talking about removal of fields. They suggest you >> rename >> > > the field to OBSOLETE_myField instead of actually deleting it, to >> make it >> > > easier on compatability. They don't talk about rules that forbid you >> to >> > > rename a field. It's unrelated. >> > >> > >> > Maybe you are right that changing the field name is compatible. >> > >> > >> https://stackoverflow.com/questions/45431685/protocol-buffer-does-changing-field-name-break-the-message/45431953#comment84548234_45431953 >> > >> > But it is interesting I tested the use `avro-proto` implementation >> > `ProtobufSchemaCompatibilityCheck` compatibility check.( >> > >> `org.apache.pulsar.broker.service.schema.BaseAvroSchemaCompatibilityTest`) >> > >> > private static final String schemaJson1 = >> > >> > >> "{\"type\":\"record\",\"name\":\"DefaultTest\",\"namespace\":\"org.apache.pulsar.broker.service.schema" >> > + >> > ".AvroSchemaCompatibilityCheckTest\",\"fields\":[ >> > {\"name\":\"field1\",\"type\":\"string\"}]}"; >> > >> > private static final String schemaJson4 = >> > >> > >> "{\"type\":\"record\",\"name\":\"DefaultTest\",\"namespace\":\"org.apache.pulsar.broker.service.schema" >> > + >> > >> > >> ".AvroSchemaCompatibilityCheckTest\",\"fields\":[{\"name\":\"field1_v2\",\"type\":\"string\"," >> > + >> > "\"aliases\":[\"field1\"]}]}"; >> > >> > schemaCompatibilityCheck.isCompatible(schemaData1, schemaData4, >> > SchemaCompatibilityStrategy.BACKWARD); >> > This is compatible. >> > >> > But that's not compatible >> > schemaCompatibilityCheck.isCompatible(schemaData1, schemaData4, >> > SchemaCompatibilityStrategy.FORWARD); >> > >> > So, what do you think of AVRO implementation >> > `ProtobufSchemaCompatibilityCheck`, rules about renaming are wrong is >> it? >> > It looks something like this. >> > >> > >> You're asking why Avro doesn't allow when read schema is schema 4, and >> write schema is 1 and the difference between them is a single field >> rename? >> I don't know - need to debug to understand. >> I do know that Avro encoding is completely different. They don't have a >> field identifier like protobuf does. They completely rely on the schema of >> the writer and the reader. >> But it is besides the point, since they are supposed to allow field rename >> from my knowledge. >> You can debug and find out. >> >> >> >> > >> > 5. >> > >> > > I think my previous explanation in this email reply should be good >> enough >> > > to explain why the name is irrelevant. >> > > If not, please ping me on this. >> > > Regarding your example, can you quote it? I don't understand the read >> and >> > > write schema here. >> > > Also, regarding message names and why they are irrelevant. Same thing: >> > Read >> > > https://protobuf.dev/programming-guides/encoding/ >> > > In there you will see the names are not encoded thus can be changed >> > freely >> > > and still successfully decode binary data. >> > >> > >> > Reader.proto >> > ```protobuf >> > syntax = "proto2"; >> > package proto.reader; >> > import "ExternalReader.proto"; >> > option java_package = "org.apache.pulsar.client.schema.proto.reader"; >> > option java_outer_classname = "Reader"; >> > >> > enum WeekEnum { >> > ...... >> > } >> > >> > message ProtobufMessage { >> > required string protobufFoo = 1; >> > required double protobufBar = 2; >> > } >> > >> > message ProtobufSchema { >> > ...... >> > optional ProtobufMessage messageField = 12; >> > ...... >> > optional WeekEnum enumField = 15; >> > } >> > ``` >> > >> > WriterWithTypeNameChange.proto: >> > ```protobuf >> > syntax = "proto2"; >> > package proto.writerWithTypeNameChange; >> > import "ExternalReader.proto"; >> > option java_package = >> > "org.apache.pulsar.client.schema.proto.writerWithTypeNameChange"; >> > option java_outer_classname = "WriterWithTypeNameChange"; >> > >> > enum WeekEnum { >> > ...... >> > } >> > >> > message ProtobufMessage_V2 { >> > required string protobufFoo = 1; >> > required double protobufBar = 2; >> > } >> > >> > message ProtobufSchema { >> > ...... >> > optional ProtobufMessage_V2 messageField = 12; >> > ...... >> > optional WeekEnum enumField = 15; >> > ...... >> > } >> > ``` >> > >> > *Doesn't that change the field type? Should that be compatible?* >> > >> > >> It doesn't change the field type since you haven't changed the field types >> of the fields inside ProtobufMessage - they are still field 1 and field 2 >> double. >> It is compatible since the message type name is not encoded in the binary. >> >> >> > >> > 6. >> > >> > > So if my write schema is >> > > message SearchRequest { required string query = 1; optional int32 >> > > page_number = 2; optional int32 result_per_page = 3;} >> > > and my read schema is >> > > message SearchRequest { optional int32 page_number = 2; optional >> > > int32 result_per_page = 3;} >> > > You can see I removed a field from writeSchema that does not have a >> > default >> > > value. >> > > First, I can read it without any problems: Field 2 might exists, no >> > > problem. Field 3 might exists, no problem. I ignore field 1. >> > > So the validation you wrote is no correct at all, without any regard >> to >> > > optional value. >> > >> > >> > *It looks like you've added a required field here, which is not allowed >> (in >> > 3 i am said that).* >> > *It also uses the number 1 of the deleted field.* >> > >> > I removed the required field "required string query = 1" from >> writeSchema >> compared with readSchema. >> Of course you can say I added "required string query = 1" to writeSchema >> compared with readSchema. >> The main question: can I use readSchema to read message written by >> writeSchema? >> Yes I can. >> I wrote why above. >> I didn't use the number of 1 of deleted field. The read schema is using >> number 2 and 3. >> >> >> >> >> >> > What if I do the other way around? >> > > My write schema is: >> > > message SearchRequest { optional int32 page_number = 2; optional >> > > int32 result_per_page = 3;} >> > > My read schema is: >> > > message SearchRequest { required string query = 1; optional int32 >> > > page_number = 2; optional int32 result_per_page = 3;} >> > > Here I will fail, since it is required I will provide value for field >> 1 >> > but >> > > alas it will never be there. >> > >> > >> > Yes, the required field has been removed, which is incompatible. >> > The check for changes to the required field starts at the beginning, at >> > which point an incompatible exception has been thrown and the following >> > compatibility check is not performed. >> > >> >> You can't say "The required field " - it's plain wrong. >> >> Look at your rule: >> >> The writtenSchema cannot remove required fields in the readSchema. >> >> >> You can say: the writeSchema has removed a required field compared with >> readSchema. >> This rule will fail the validation. >> *not* the default rule we're discussing. >> >> >> >> >> > >> > Now let's focus on the optional field, ok? >> > > Write schema >> > > message SearchRequest { optional int32 page_number = 2; optional >> > > int32 result_per_page = 3;} >> > > Read schema: >> > > message SearchRequest { optional string query = 1; optional int32 >> > > page_number = 2; optional int32 result_per_page = 3;} >> > > When I read, I won't find field number 1 of course, so I will use the >> > > default value dictated by the type, as I quoted above, which is for >> > string >> > > is the empty byte string. >> > >> > >> > *It looks like sixth item in PIP should be removed.* >> > *And Rule 7 in PIP should be removed:* >> > >> > Delete rule 6. >> I would keep rule 7. See: >> >> https://protobuf.dev/programming-guides/dos-donts/#dont-change-the-default-value-of-a-field >> It sounds like a good best practice. We can omit it if we don't want to be >> overly strict. >> >> >> >> > > Changing a default value is generally OK, as long as you remember that >> > > default values are never sent over the wire. >> > > Thus, if a program receives a message in which a particular field >> isn’t >> > > set, the program will see the default value as it was defined in that >> > > program’s version of the protocol. >> > > It will NOT see the default value that was defined in the sender’s >> code. >> > >> > This is in the official document 11: >> > https://protobuf.dev/programming-guides/proto2/#updating >> > >> > >> > >> So what do you wish to state with this quote? >> >> >> > 7. >> > >> > > Read this and then please explain why rule number 3 should not be >> added: >> > > >> > > >> > >> https://protobuf.dev/programming-guides/dos-donts/#dont-go-from-repeated-to-scalar >> > >> > >> > It looks like this rule needs to be added. >> > >> > Thanks >> >> >> > >> > >> > Thanks, >> > sinan >> > >> > >> > >> > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月19日周日 19:11写道: >> > >> > > Also: >> > > >> > > (2) PROTOBUF_NATIVE was designed so that not use avro-protobuf for >> > protobuf >> > > > schema compatibility checking. >> > > >> > > The root message name is the class name we pass in when we create the >> > > > producer or consumer. ProtoBuf has many nested messages or >> > dependencies. >> > > > The current implementation only checks if the passed class name is >> the >> > > > same. It does not check if the fields in the file change in a way >> that >> > is >> > > > compatible with older versions of the schema. >> > > >> > > >> > > You missed the most important thing there: PROTOBUF_NATIVE uses >> Protobuf >> > > Descriptor when persisting the schema. It's not using Avro Schema >> > > definition. As I wrote you in previous email: >> > > >> > > PROTOBUF_NATIVE was created to fix that shortcoming, by actually >> > persisting >> > > > the Protobuf Descriptor and using Protobuf for encoding. >> > > > >> > > >> > > >> > > >> > > >> > > >> > > On Wed, Mar 15, 2023 at 5:46 PM SiNan Liu <liusinan1...@gmail.com> >> > wrote: >> > > >> > > > 1. >> > > > >> > > > > > Why? The root message name is not written over the wire to the >> best >> > > of >> > > > my >> > > > > > knowledge. I haven't found it written in the official doc. >> > > > >> > > > >> > > > The name of the root message check is the rules in the previous >> > > > `ProtobufNativeSchemaCompatibilityCheck`. Because if the root >> message >> > > has a >> > > > different name, there is no need to check its contents. "Same" >> schema, >> > > > their names must be the same. >> > > > >> > > > >> > > > 2. >> > > > >> > > > > >The writtenSchema can not change the field number of any field in >> > > > > readSchema (the > field name is the same, but the field number is >> > > > > different). >> > > > > >You have to take into account field type as well when comparing. >> > > > >> > > > >> > > > The first sentence on the website says that the number of fields >> cannot >> > > be >> > > > changed. >> > > > >> > > > > Don’t change the field numbers for any existing fields. >> > > > >> > > > >> > > > >> > > > 3. >> > > > >> > > > > > - The writtenSchema cannot add required fields, but optional or >> > > > > > duplicate fields can be added (The field number must be new). >> > > > > > >> > > > > > That's not true. >> > > > > You can have a required field in writeSchema, and not have that >> field >> > > in >> > > > > readSchema (based on tag number). >> > > > >> > > > >> > > > The required field must exist. In your case, where does readSchema >> go >> > to >> > > > read its required fields? It's not in writtenSchema. >> > > > The second and third sentences in the official website say, add and >> > > delete >> > > > do not operate required field! >> > > > >> > > > >> > > > 4. >> > > > >> > > > > (4) The writtenSchema can not change the field name of any field >> in >> > > > > > readSchema (the field number is the same, but the field name is >> > > > > > different). >> > > > > This is incorrect. >> > > > > Fields names are encoded into the wire. I don't see this in any >> best >> > > > > practice. >> > > > >> > > > >> > > > The third sentence on the website: >> > > > >> > > > > You may want to rename the field instead, perhaps adding the >> prefix >> > > > > “OBSOLETE_”, or make the field number reserved, so that future >> users >> > of >> > > > > your .proto can’t accidentally reuse the number. >> > > > >> > > > If you want to rename a field, or add a new field. To delete with >> the >> > new >> > > > number! >> > > > >> > > > >> > > > 5. >> > > > >> > > > > The writtenSchema does not change the field name and number, but >> it >> > > does >> > > > > change the field type. >> > > > > > Small correction: for the same field number you are not allowed >> to >> > > > change >> > > > > types. Name is irrelevant. >> > > > >> > > > >> > > > Why doesn't the name irrelevant? >> > > > Here is the change in type, which is the rule stated in >> Alternatives. >> > > There >> > > > is no check here, just a warning to the user. >> > > > Another change is that the name of enum is changed, or the name of >> > > MESSAGE >> > > > is changed, which is the same as the root message name check in 1, >> is >> > the >> > > > change still the same? This is not allowed to change! >> > > > >> > > > *Here is example:* >> > > > readSchema( >> > > > >> > > > >> > > >> > >> https://github.com/apache/pulsar/pull/19566/files#diff-a7006d73502e6064a80af02822f3a3072be498d8b677c4b838b0dafaea32dea4 >> > > > ) >> > > > writtenSchema( >> > > > >> > > > >> > > >> > >> https://github.com/apache/pulsar/pull/19566/files#diff-e3e7543624edaf1e0a4fd47947a2cad6e4b816b93843f71a367042ba6c3ec53f >> > > > ) >> > > > >> > > > >> > > > 6. >> > > > >> > > > > (6) The writtenSchema removes fields that do not have default >> values >> > in >> > > > > > readSchema. Then the schema is incompatible. >> > > > > Protobuf gives you its own default if you don't supply one. This >> is >> > > > > incorrect. >> > > > >> > > > >> > > > (1) This rule only applies if proto2 does not set the default >> value. If >> > > > proto3 does not check, the default value will always be there. >> > > > >> > > > (2) In PIP issue: >> > > > >> > > > > Proto3 canceled the required field, so there is no need to check >> the >> > > > > required field. We get the syntax(proto2 or proto3) of proto in >> the >> > > code, >> > > > > and skip the check of the required field if it is proto3. All >> other >> > > > > checking rules also apply to proto3. >> > > > >> > > > >> > > > *I made a mistake here. This default value check is not need in >> > proto3. I >> > > > will modify the rules later according to your suggestion.* >> > > > >> > > > > I would remove the proto2/proto3 sections, since they only differ >> in >> > 1 >> > > > > rule, and just mention that distinction inside that rule (less >> work >> > for >> > > > the >> > > > > reade). >> > > > >> > > > >> > > > (3) And add rules that look like they should be checked: >> > > > >> > > > > Rules that you don't have in the doc, but should IMO* >> > > > > ....... >> > > > >> > > > There can't be a field in writerSchema, that exists in readerSchema >> > (tag >> > > > > number based), which in writerSchema its type is scalar, but in >> > > > readSchema >> > > > > its type is scalar, it's repeated but with packed=true. >> > > > >> > > > >> > > > But I don't think rule number three needs to be added. >> > > > >> > > > >> > > > Thanks, >> > > > sinan >> > > > >> > > > >> > > > >> > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月14日周二 22:33写道: >> > > > >> > > > > Hi Sinan, >> > > > > >> > > > > The doc looks much better! >> > > > > >> > > > > I have a few additional comments: >> > > > > >> > > > > Pasting comment from previous emails: >> > > > > >> > > > > Can you convert the code block which is actually a quote in the >> > > > > beginning of the PIP to something which doesn't require to scroll >> > > > > horizontally so much? >> > > > > Use >> > > > > >> > > > > >> > > > >> > > >> > >> https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#quoting-text >> > > > > >> > > > > *Validation Rules* >> > > > > >> > > > > (1) If the root message names of writtenSchema and readSchema are >> > > > > > different, then incompatible. >> > > > > >> > > > > Why? The root message name is not written over the wire to the >> best >> > of >> > > my >> > > > > knowledge. I haven't found it written in the official doc. >> > > > > >> > > > > >> > > > > > - The writtenSchema cannot add required fields, but optional >> or >> > > > > > duplicate fields can be added (The field number must be new). >> > > > > > >> > > > > > That's not true. >> > > > > You can have a required field in writeSchema, and not have that >> field >> > > in >> > > > > readSchema (based on tag number). >> > > > > >> > > > > The writtenSchema can not change the field number of any field in >> > > > > readSchema (the >> > > > > > field name is the same, but the field number is different). >> > > > > >> > > > > You have to take into account field type as well when comparing. >> > > > > >> > > > > If I have in readSchema >> > > > > int32 justAnID = 1 >> > > > > int32 customerId = 2 >> > > > > >> > > > > and in writeSchema I have >> > > > > int32 justAnID = 1 >> > > > > string customerId = 3 >> > > > > >> > > > > This is valid. >> > > > > >> > > > > (4) The writtenSchema can not change the field name of any field >> in >> > > > > > readSchema (the field number is the same, but the field name is >> > > > > > different). >> > > > > >> > > > > This is incorrect. >> > > > > Fields names are encoded into the wire. I don't see this in any >> best >> > > > > practice. >> > > > > >> > > > > ) The writtenSchema does not change the field name and number, >> but it >> > > > does >> > > > > > change the field type. >> > > > > > >> > > > > > - If the field type is ENUM or MESSAGE, the schema is not >> > > compatible >> > > > > > when the type name is changed >> > > > > > - If the type of the field is another type. The schemas under >> > this >> > > > > > rule are not incompatible, but warn the user.(There is >> another >> > way >> > > > of >> > > > > > testing in PIP issue Alternatives) >> > > > > > >> > > > > > Small correction: for the same field number you are not allowed >> to >> > > > change >> > > > > types. Name is irrelevant. >> > > > > >> > > > > (6) The writtenSchema removes fields that do not have default >> values >> > in >> > > > > > readSchema. Then the schema is incompatible. >> > > > > >> > > > > Protobuf gives you its own default if you don't supply one. This >> is >> > > > > incorrect. >> > > > > >> > > > > >> > > > > *Rules that you don't have in the doc, but should IMO* >> > > > > * There can not be a field which exists both in readSchema and >> > > > writeSchema, >> > > > > with same tag number, having different default values >> > > > > * There can't be a field in writerSchema, that exists in >> readerSchema >> > > > (tag >> > > > > number based), which in writerSchema is repeated and its type is >> > > Scalar ( >> > > > > https://protobuf.dev/programming-guides/proto/#scalar) but in >> > > readSchema >> > > > > it >> > > > > is not repeated anymore. >> > > > > * There can't be a field in writerSchema, that exists in >> readerSchema >> > > > (tag >> > > > > number based), which in writerSchema its type is scalar, but in >> > > > readSchema >> > > > > its type is scalar, it's repeated but with packed=true. >> > > > > >> > > > > *Rules you have , but I would phrase a bit differently* >> > > > > >> > > > > I would remove the proto2/proto3 sections, since they only differ >> in >> > 1 >> > > > > rule, and just mention that distinction inside that rule (less >> work >> > for >> > > > the >> > > > > reade). >> > > > > >> > > > > * readSchema has a field which doesn't exist in writerSchema >> (based >> > on >> > > > tag >> > > > > number). >> > > > > * Proto v2: >> > > > > * That field must be `optional` or `repeated` (must not be >> > > > > `required`) >> > > > > * Proto v3: >> > > > > * No problem. >> > > > > * There can not be a field which exists both in readSchema and >> > > > writeSchema, >> > > > > with the same tag number, but having different types. >> > > > > >> > > > > *Motivation* >> > > > > >> > > > > Basically in the motivation section you want people to understand >> the >> > > > > following: >> > > > > >> > > > > Pulsar has built-in support for typed messages. It allows >> specifying >> > an >> > > > > encoding scheme and its matching schema. >> > > > > For example, it supports Avro. You specify a schema for a given >> > topic, >> > > > > using Avro Schema Definition (i.e. a JSON describing the schema). >> > > > Everytime >> > > > > you produce a message, you first declare the schema definition you >> > wish >> > > > to >> > > > > use for your messages. The message data should be an avro-encoded >> > > binary >> > > > > data (which the client in some SDKs helps encode a given >> > > > > data-structure/object). >> > > > > The same applies when you consume a message. You first specify the >> > > schema >> > > > > you use to read the messages, and the client in some SDKs helps by >> > > > decoding >> > > > > the message binary data into an object/data-structure. >> > > > > >> > > > > Each time you specify a schema to be used, either by a producer >> or a >> > > > > consumer, the schema is persisted in Pulsar and given an >> increasing >> > > > version >> > > > > number. If the schema was the same as the previous version, it is >> not >> > > > > saved. When the message is persisted, the version number is >> encoded >> > in >> > > > the >> > > > > message headers. >> > > > > >> > > > > Pulsar provides a very useful feature named Schema Evolution >> > > > > < >> > > > >> > > >> > >> https://pulsar.apache.org/docs/2.11.x/schema-understand/#schema-evolution >> > > > > >. >> > > > > It allows us to check if a new schema version is compatible with >> > > previous >> > > > > versions or versions. When you configure the schema for the topic >> you >> > > > > decide the strategy to use for doing the validation check. The >> > > strategies >> > > > > validate the following: >> > > > > >> > > > > - BACKWARD strategy >> > > > > - A consumer with newSchema can read a message written using >> > > > > existingSchema >> > > > > - BACKWARD_TRANSITIVE strategy >> > > > > - A consumer with newSchema can read messages written using >> all >> > > > > existingSchema >> > > > > - FORWARD >> > > > > - A consumer with existingSchema can read messages written >> > using >> > > > > newSchema >> > > > > - FORWARD_TRANSITIVE >> > > > > - A consumer defined with any of the existingSchema can read >> > > > messages >> > > > > written using newSchema >> > > > > - FULL >> > > > > - A consumer defined with newSchema can read messages >> written >> > > using >> > > > > existingSchema >> > > > > - A consumer defined with existingSchema can read messages >> > > written >> > > > > using newSchema >> > > > > - FULL_TRANSITIVE >> > > > > - A consumer defined with newSchema can read messages >> written >> > > using >> > > > > any of the existingSchema >> > > > > - A consumer defined with any of the existingSchema can read >> > > > messages >> > > > > written using newSchema >> > > > > >> > > > > >> > > > > Aside from Avro, Pulsar also has two additional supported >> encodings: >> > > > > PROTOBUF and PROTOBUF_NATIVE. >> > > > > >> > > > > PROTOBUF is a bit strange. It encodes the messages using Protobuf >> > > > encoding, >> > > > > but the schema that is persisted to Pulsar is *not* Protobuf >> > Descriptor >> > > > as >> > > > > you would have expected. The saved schema is a translation of the >> > > > Protobuf >> > > > > Descriptor to an Avro Schema, so in fact an Avro schema >> definition is >> > > > saved >> > > > > as the schema. >> > > > > >> > > > > PROTOBUF_NATIVE was created to fix that shortcoming, by actually >> > > > persisting >> > > > > the Protobuf Descriptor and using Protobuf for encoding. >> > > > > The problem is that the authors of PROTOBUF_NATIVE haven't >> completed >> > it >> > > > > fully, and the backward compatibility validation code almost does >> not >> > > > > exist: It only checks if the root message name is the same between >> > > > > versions. >> > > > > >> > > > > GOALS >> > > > > The goal of this PIP is to amend PROTOBUF_NATIVE by adding a fully >> > > > > functional validation for any of the defined Schema Compatibility >> > > > > Strategies. >> > > > > A secondary goal is to allow the user to choose between different >> > > > > implementations: The new fully functional validation or the >> existing >> > > > > barebones validation. >> > > > > >> > > > > -------- END >> > > > > >> > > > > I'm ok with having links in the Motivation , as *further reading*. >> > > > > I'm against stacking up work for the reader to go read 5-6 >> different >> > > > links >> > > > > just to understand the motivation and background knowledge >> required >> > to >> > > > > understand the feature. >> > > > > >> > > > > I'm against putting code in the Motivation. Especially if it is >> > > supposed >> > > > to >> > > > > replace description in plain English making it easy to understand >> the >> > > > > design. >> > > > > Leave the code to the motivation. >> > > > > Paste code only if you absolutely can't use plain old >> descriptions to >> > > > > explain. >> > > > > >> > > > > >> > > > > >> > > > > On Sat, Mar 11, 2023 at 11:46 AM SiNan Liu < >> liusinan1...@gmail.com> >> > > > wrote: >> > > > > >> > > > > > *I guess that's right, too! * >> > > > > > >> > > > > > But the name `ProtobufNativeAdvancedSchemaCompatibilityCheck` is >> > > > better, >> > > > > > because we don't know whether the future will have V2, V3. The >> > > official >> > > > > > solution can be called >> > > > `ProtobufNativeOfficialSchemaCompatibilityCheck`, >> > > > > or >> > > > > > is a good `ProtobufNativeXXXXXXXXSchemaCompatibilityCheck` >> > > third-party >> > > > > > solution. >> > > > > > >> > > > > > I've updated my design in PIP issue. >> > > > > > 1. A new ProtobufNativeSchemaAdvanceCompatibilityCheck, rather >> > than a >> > > > > > ProtobufNativeSchemaCompatibilityCheck different validator >> > > > > implementation. >> > > > > > 2. Remove the 'builder' >> > > > > > 3. Clarify the relationship between newSchema, existingSchema, >> and >> > > > > > writtenSchema in canRead. >> > > > > > >> > > > > > Help to see if the description is comprehensive and what changes >> > and >> > > > > > improvements need to be made. >> > > > > > >> > > > > > Thanks, >> > > > > > sinan >> > > > > > >> > > > > > >> > > > > > >> > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月9日周四 17:35写道: >> > > > > > >> > > > > > > I like Bo's suggestion - I haven't realized each schema type >> > > > > > > compatibility check is actually a plugin. >> > > > > > > It makes sense for any schema type checks to evolve, sometimes >> > in a >> > > > > > > non-backward compatible way hence having two plugins like >> > > > > > > protobufNativeSchemaCompatabilityCheckV1 and then >> > > > > > > protobufNativeSchemaCompatabilityCheckV2 and then >> > > > > > > protobufNativeSchemaCompatabilityCheckV3 makes sense to me. >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > On Thu, Mar 9, 2023 at 5:49 AM 丛搏 <bog...@apache.org> wrote: >> > > > > > > >> > > > > > > > Hi siNan: >> > > > > > > > >> > > > > > > > From my point of view, it is just a plug-in. I don't think >> it >> > is >> > > > > > > > necessary to add configuration for the plugin. >> > > > > > > > This is meaningless, and it will increase the difficulty of >> use >> > > for >> > > > > > > users. >> > > > > > > > >> > > > > > > > >> > > > > > > > SiNan Liu <liusinan1...@gmail.com> 于2023年3月8日周三 15:54写道: >> > > > > > > > > >> > > > > > > > > Hi, bo. >> > > > > > > > > >> > > > > > > > > 1. I understand what you say, to develop a new >> > > > > > > > > `ProtobufNativeAdvancedSchemaCompatibilityCheck`, rather >> than >> > > > > > changing >> > > > > > > > > existing `ProtobufNativeSchemaCompatibilityCheck`. But I >> > found >> > > a >> > > > > few >> > > > > > > > small >> > > > > > > > > problems: >> > > > > > > > > >> > > > > > > > > (1)ProtobufNativeAdvancedSchemaCompatibilityCheck and >> > > > > > > > > ProtobufNativeSchemaCompatibilityCheck schemaType is >> > > > > PROTOBUF_NATIVE. >> > > > > > > It >> > > > > > > > > looks like both checkers are PROTOBUF not using >> > AVRO-PROTOBUF's >> > > > > > > "native" >> > > > > > > > > implementation, which leads to some problems or >> > "unreasonable" >> > > > and >> > > > > > > gives >> > > > > > > > me >> > > > > > > > > some extended thinking and questions. >> > > > > > > > > >> > > > > > > > `CompatibilityCheck ` its only a plugin. >> > > > > > > > `ProtobufNativeSchemaCompatibilityCheck` may sooner or later >> > > leave >> > > > > the >> > > > > > > > stage, when >> `ProtobufNativeAdvancedSchemaCompatibilityCheck` is >> > > > > > > > stable, we can make it the default Checker. >> > > > > > > > >> > > > > > > > It is just a plug-in, users can change it at will and ensure >> > that >> > > > it >> > > > > > > > is used correctly >> > > > > > > > > (2)In broker.conf >> > > > > > > > > >> > > > > > > > > `schemaRegistryCompatibilityCheckers`. If >> > > > > > > > > ProtobufNativeSchemaCompatibilityCheck and >> > > > > > > > > ProtobufNativeAdvancedSchemaCompatibilityCheck all set. >> This >> > is >> > > > > going >> > > > > > > to >> > > > > > > > > overwrite each other. Because this is a map: >> > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://github.com/apache/pulsar/blob/af1360fb167c1f9484fda5771df3ea9b21d1440b/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/SchemaRegistryService.java#L36-L44 >> > > > > > > > > >> > > > > > > > > ```java >> > > > > > > > > >> > > > > > > > > Map<SchemaType, SchemaCompatibilityCheck> checkers = new >> > > > > HashMap<>(); >> > > > > > > > > >> > > > > > > > > for (String className : checkerClasses) { >> > > > > > > > > >> > > > > > > > > SchemaCompatibilityCheck schemaCompatibilityCheck = >> > > > > > > > > Reflections.createInstance(className, >> > > > > > > > > >> > > > > > > > > SchemaCompatibilityCheck.class, >> > > > > > > > > Thread.currentThread().getContextClassLoader()); >> > > > > > > > > >> > > > > > > > > checkers.put(schemaCompatibilityCheck.getSchemaType(), >> > > > > > > > > schemaCompatibilityCheck); >> > > > > > > > > >> > > > > > > > > ``` >> > > > > > > > > >> > > > > > > > > Is this a big problem or a small one? Is it possible or >> > > > > unnecessary? >> > > > > > > > Maybe >> > > > > > > > > we can write in the documentation that protobufNative >> > checkers >> > > > can >> > > > > > only >> > > > > > > > > choose one of the two? Why are there two Checkers for >> > different >> > > > > > > > > implementations of the same schemaType? Why not the >> checker >> > to >> > > > > create >> > > > > > > > > different validator, so we don not have to change >> > > > > > > > > schemaRegistryCompatibilityCheckers. >> > > > > > > > >> > > > > > > > users can only use one, not two, which will bring >> complexity to >> > > > users >> > > > > > > > >> > > > > > > > > >> > > > > > > > > (3)And after the update to >> > > > > > > > ProtobufNativeAdvancedSchemaCompatibilityCheck. >> > > > > > > > > Existing topics previously only checked the name of the >> root >> > > > > message, >> > > > > > > not >> > > > > > > > > the content of protobuf. >> > > > > > > > > >> > > > > > > > > What if the user wants both Checkers? >> > > > > > > > > >> > > > > > > > > Set to ProtobufNativeAdvancedSchemaCompatibilityCheck, >> affect >> > > the >> > > > > > topic >> > > > > > > > of >> > > > > > > > > the existing schema? >> > > > > > > > > >> > > > > > > > > Older topics still use the old checker, and newer topics >> or >> > > > certain >> > > > > > > older >> > > > > > > > > topics use the new advancedchecker. >> > > > > > > > > >> > > > > > > > when `ProtobufNativeAdvancedSchemaCompatibilityCheck` >> stable, >> > > > > > > > users will not choose >> `ProtobufNativeSchemaCompatibilityCheck`. >> > > > > > > > because it not a complete checker. >> > > > > > > > > (4)So should we have one schemaType for a checker? >> > > > > > > protobufNativeChecker >> > > > > > > > > can have as many different implementation classes as >> > possible. >> > > > This >> > > > > > > > > classname configuration in PIP, let's see if it can be >> set at >> > > the >> > > > > > topic >> > > > > > > > > level. In the current PIP design I just load this >> parameter >> > > into >> > > > > the >> > > > > > > > > checker when the broker is started and the checkers map is >> > set >> > > > up. >> > > > > > Can >> > > > > > > I >> > > > > > > > do >> > > > > > > > > this in the new normal pr if I want to support topic >> level? >> > Or >> > > > > > perfect >> > > > > > > it >> > > > > > > > > here? >> > > > > > > > > >> > > > > > > > > Add a call PROTOBUF_NATIVE_ADVANCE schemaType >> corresponding >> > > > > > > > > ProtobufNativeAdvancedSchemaCompatibilityCheck? (Seems to >> be >> > > more >> > > > > > > > trouble). >> > > > > > > > > >> > > > > > > > > Sorry I can not use the computer and network in the >> company, >> > I >> > > > use >> > > > > my >> > > > > > > > > mobile phone to reply to the email, the format may be a >> bit >> > > > messy. >> > > > > > > Please >> > > > > > > > > understand. >> > > > > > > > > >> > > > > > > > > Thanks, >> > > > > > > > > >> > > > > > > > > sinan >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > 丛搏 <bog...@apache.org> 于 2023年3月7日周二 下午11:39写道: >> > > > > > > > > >> > > > > > > > > > SiNan Liu <liusinan1...@gmail.com> 于2023年3月7日周二 >> 13:22写道: >> > > > > > > > > > > >> > > > > > > > > > > Great to see your comment, bo! >> > > > > > > > > > > >> > > > > > > > > > > 1. The first way. The protobuf website has a >> description >> > of >> > > > the >> > > > > > > > rules, >> > > > > > > > > > but >> > > > > > > > > > > no plans to implement them. >> > > > > > > > > > > >> https://protobuf.dev/programming-guides/proto/#updating >> > > > > > > > > > >> > > > > > > > > > https://groups.google.com/g/protobuf >> > > > > > > > > > maybe ask here >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > 2. I think this PIP can be divided into two parts. >> > > > > > > > > > > (1) Add a flag(`ValidatorClassName`), load it into >> > > > > > > > > > > `ProtobufNativeSchemaCompatibilityCheck` when the >> broker >> > > > > starts. >> > > > > > > > > > > ValidatorClassName is empty by default, and the >> > > > implementation >> > > > > > > > continues >> > > > > > > > > > as >> > > > > > > > > > > before, with no change for the user. >> > > > > > > > > > >> > > > > > > > > > `ProtobufNativeSchemaCompatibilityCheck` is a plugin in >> > > > > > `broker.conf` >> > > > > > > > > > ``` >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> schemaRegistryCompatibilityCheckers=org.apache.pulsar.broker.service.schema.JsonSchemaCompatibilityCheck,org.apache.pulsar.broker.service.schema.AvroSchemaCompatibilityCheck,org.apache.pulsar.broker.service.schema.ProtobufNativeSchemaCompatibilityCheck >> > > > > > > > > > ``` >> > > > > > > > > > I do not recommend that we directly modify this plugin >> and >> > > > > continue >> > > > > > > to >> > > > > > > > > > add configuration items, which will cause trouble for >> > users. >> > > > > > > > > > We have a lot of configs and it's getting very unwieldy. >> > > > > > > > > > in my opinion, we don't change >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> `org.apache.pulsar.broker.service.schema.ProtobufNativeSchemaCompatibilityCheck`, >> > > > > > > > > > it is a simple implementation, it doesn't go wrong very >> > > often, >> > > > > most >> > > > > > > > > > users will use it. we can add another >> ProtobufNativeCheck >> > > named >> > > > > > > > > > `ProtobufNativeAdvancedSchemaCompatibilityCheck ` or >> other. >> > > in >> > > > > this >> > > > > > > > > > way, we don't need to add this flag. There is no need to >> > > > consider >> > > > > > > > > > compatibility, it is just a plug-in and will not affect >> > > current >> > > > > > > logic. >> > > > > > > > > > If the user needs it, just change the plugin to the new >> > > > > > > implementation >> > > > > > > > > > >> > > > > > > > > > > ```java >> > > > > > > > > > > ProtobufNativeSchemaValidator DEFAULT = >> > > (fromDescriptors, >> > > > > > > > > > toDescriptor) >> > > > > > > > > > > -> { >> > > > > > > > > > > for (Descriptors.Descriptor fromDescriptor : >> > > > > > > > fromDescriptors) { >> > > > > > > > > > > // The default implementation only checks >> if >> > > the >> > > > > root >> > > > > > > > message >> > > > > > > > > > > has changed. >> > > > > > > > > > > if >> > > > > > > > > > > >> > > > > > >> (!fromDescriptor.getFullName().equals(toDescriptor.getFullName())) >> > > > > > > { >> > > > > > > > > > > throw new >> > > > > ProtoBufCanReadCheckException("Protobuf >> > > > > > > > root >> > > > > > > > > > > message isn't allow change!"); >> > > > > > > > > > > } >> > > > > > > > > > > } >> > > > > > > > > > > }; >> > > > > > > > > > > ``` >> > > > > > > > > > > `ValidatorClassName` value also can be set to the >> current >> > > > > > > > implementation >> > > > > > > > > > of >> > > > > > > > > > > PIP add >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> `org.apache.pulsar.broker.service.schema.validator.ProtobufNativeSchemaBreakValidatorImpl`. >> > > > > > > > > > > >> > > > > > > > > > > (2) Recoding the >> > `ProtobufNativeSchemaCompatibilityCheck`. >> > > > > > Through >> > > > > > > > the >> > > > > > > > > > flag >> > > > > > > > > > > (`ValidatorClassName`) to build different >> > > > > > > > > > `ProtobufNativeSchemaValidator`. >> > > > > > > > > > > Isn't it just a plug-in? The user can develop and >> choose >> > a >> > > > > > > different >> > > > > > > > > > > `ProtobufNativeSchemaValidator`. I think it didn't >> change >> > > the >> > > > > > > logic, >> > > > > > > > it >> > > > > > > > > > > just allowed him to expand it. >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > I think this PIP should be an enhancement and >> supplement >> > to >> > > > the >> > > > > > > > function, >> > > > > > > > > > > and there is no such thing as unnecessary and >> > meaningless. >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > Thanks, >> > > > > > > > > > > sinan >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > 丛搏 <bog...@apache.org> 于2023年3月7日周二 11:53写道: >> > > > > > > > > > > >> > > > > > > > > > > > I think we have two ways to do that. >> > > > > > > > > > > > >> > > > > > > > > > > > First way: We need to advance the improvement of >> java >> > in >> > > > > > > protobuf. >> > > > > > > > Ask >> > > > > > > > > > > > if they have plans to improve. >> > > > > > > > > > > > >> > > > > > > > > > > > Second way: the new PROTOBUF_NATIVE >> > > > > `SchemaCompatibilityCheck` >> > > > > > > > should >> > > > > > > > > > > > be implemented as a plugin, don't change any >> existing >> > > > plugin >> > > > > > > logic >> > > > > > > > > > > > (it's simple and already used). I don't recommend >> > adding >> > > > > flags >> > > > > > > for >> > > > > > > > > > > > rollback, it adds configuration and makes little >> sense. >> > > > > > > > > > > > >> > > > > > > > > > > > Thanks, >> > > > > > > > > > > > Bo >> > > > > > > > > > > > >> > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月6日周一 >> > > 23:00写道: >> > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > Can you convert the code block which is actually a >> > > quote >> > > > in >> > > > > > the >> > > > > > > > > > > > > beginning of the PIP to something which doesn't >> > require >> > > > to >> > > > > > > scroll >> > > > > > > > > > > > > horizontally so much? >> > > > > > > > > > > > > Use >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#quoting-text >> > > > > > > > > > > > > >> > > > > > > > > > > > > Let's improve the clarity of what you wrote: >> > > > > > > > > > > > > >> > > > > > > > > > > > > "the PROTOBUF uses avro struct to store." >> > > > > > > > > > > > > --> >> > > > > > > > > > > > > When Schema type PROTOBUF is used, Pulsar Client >> > > assumes >> > > > > the >> > > > > > > > object >> > > > > > > > > > given >> > > > > > > > > > > > > to it as message data is an auto-generated POJO >> > > > containing >> > > > > > the >> > > > > > > > > > > > annotations >> > > > > > > > > > > > > encoding the schema. The client is using a >> converter, >> > > > which >> > > > > > > > converts >> > > > > > > > > > a >> > > > > > > > > > > > > Protobuf schema descriptor into an Avro schema and >> > > sends >> > > > > that >> > > > > > > as >> > > > > > > > the >> > > > > > > > > > > > Schema >> > > > > > > > > > > > > of the producer/consumer. >> > > > > > > > > > > > > >> > > > > > > > > > > > > "On the broker side, protobuf and avro both use >> > > > SchemaData >> > > > > > > > converted >> > > > > > > > > > to >> > > > > > > > > > > > > org.apache.avro.Schema." >> > > > > > > > > > > > > --> >> > > > > > > > > > > > > Since the schema is an Avro schema, the >> > implementation >> > > of >> > > > > > > > > > compatibility >> > > > > > > > > > > > > check on the broker side is to simply re-use the >> > > > > > compatibility >> > > > > > > > check >> > > > > > > > > > of >> > > > > > > > > > > > the >> > > > > > > > > > > > > AVRO schema type. >> > > > > > > > > > > > > >> > > > > > > > > > > > > "ProtobufSchema is different from >> > ProtobufNativeSchema >> > > in >> > > > > > > schema >> > > > > > > > > > > > > compatibility check it uses avro-protobuf. >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://central.sonatype.com/artifact/org.apache.avro/avro-protobuf/1.11.1/overview >> > > > > > > > > > > > > But the current implementation of ProtobufNative >> > schema >> > > > > > > > compatibility >> > > > > > > > > > > > > check only >> > > > > > > > > > > > > checked if the root message name is changed." >> > > > > > > > > > > > > >> > > > > > > > > > > > > --> >> > > > > > > > > > > > > PROTOBUF_NATIVE schema type is different. >> > > > > > > > > > > > > The client is actually using Protobuf Descriptor >> as >> > the >> > > > > > schema, >> > > > > > > > as >> > > > > > > > > > > > opposed >> > > > > > > > > > > > > to Avro schema of PROTOBUF schema type. In the >> > broker, >> > > > the >> > > > > > > > > > > > PROTOBUF_NATIVE >> > > > > > > > > > > > > compatibility check actually hasn't implemented >> any >> > > rule, >> > > > > > > besides >> > > > > > > > > > one: >> > > > > > > > > > > > > checking if the root message name has changed. >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > > 1. For now, there is no official or >> third-party >> > > > > solution >> > > > > > > for >> > > > > > > > > > > > ProtoBuf >> > > > > > > > > > > > > > compatibility. If in the future have better >> > > > solutions >> > > > > > of a >> > > > > > > > third >> > > > > > > > > > > > party or >> > > > > > > > > > > > > > the official, we develop new >> > > > > > ProtobufNativeSchemaValidator >> > > > > > > > and >> > > > > > > > > > use, >> > > > > > > > > > > > so >> > > > > > > > > > > > > > add a flag. >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > Who do you need to make that configurable? Once >> you >> > > > > found a >> > > > > > > > third >> > > > > > > > > > > > party, >> > > > > > > > > > > > > just switch to it? Who knows, maybe you never >> will. >> > > > > Introduce >> > > > > > > it >> > > > > > > > > > when you >> > > > > > > > > > > > > find it, not now. >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > We improve in >> ProtobufNativeSchemaCompatibilityCheck >> > > > > > BACKWARD, >> > > > > > > > > > FORWARD >> > > > > > > > > > > > > > these strategies. As with the AVRO >> implementation, >> > > > > protobuf >> > > > > > > > > > > > compatibility >> > > > > > > > > > > > > > checking need implementing the canRead method. >> > *This >> > > > will >> > > > > > > check >> > > > > > > > > > that >> > > > > > > > > > > > > > the writtenschema can be read by readSchema.* >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > I completely disagree. >> > > > > > > > > > > > > Avro implementation is confusing for our use case. >> > > Don't >> > > > > copy >> > > > > > > > that. >> > > > > > > > > > > > > >> > > > > > > > > > > > > You have >> > > > > > > > > > > > > >> > > > > > > > > > > > > public void checkCompatible(SchemaData from, >> > SchemaData >> > > > to, >> > > > > > > > > > > > > SchemaCompatibilityStrategy strategy) >> > > > > > > > > > > > > throws IncompatibleSchemaException { >> > > > > > > > > > > > > Descriptor fromDescriptor = >> > > > > > > > > > > > > >> > ProtobufNativeSchemaUtils.deserialize(from.getData()); >> > > > > > > > > > > > > Descriptor toDescriptor = >> > > > > > > > > > > > > >> ProtobufNativeSchemaUtils.deserialize(to.getData()); >> > > > > > > > > > > > > switch (strategy) { >> > > > > > > > > > > > > case BACKWARD_TRANSITIVE: >> > > > > > > > > > > > > case BACKWARD: >> > > > > > > > > > > > > case FORWARD_TRANSITIVE: >> > > > > > > > > > > > > case FORWARD: >> > > > > > > > > > > > > case FULL_TRANSITIVE: >> > > > > > > > > > > > > case FULL: >> > > > > > > > > > > > > checkRootMessageChange(fromDescriptor, >> > > > > > > toDescriptor, >> > > > > > > > > > > > strategy); >> > > > > > > > > > > > > return; >> > > > > > > > > > > > > case ALWAYS_COMPATIBLE: >> > > > > > > > > > > > > return; >> > > > > > > > > > > > > default: >> > > > > > > > > > > > > throw new >> > > > IncompatibleSchemaException("Unknown >> > > > > > > > > > > > > SchemaCompatibilityStrategy."); >> > > > > > > > > > > > > } >> > > > > > > > > > > > > } >> > > > > > > > > > > > > >> > > > > > > > > > > > > I would rename : >> > > > > > > > > > > > > from --> currentSchema >> > > > > > > > > > > > > to --> newSchema >> > > > > > > > > > > > > >> > > > > > > > > > > > > Use that switch case and have a method for each >> like: >> > > > > > > > > > > > > validateBackwardsCompatibility(currentSchema, >> > > newSchema) >> > > > > > > > > > > > > >> > > > > > > > > > > > > I dislike canRead and usage of writtenSchema, >> since >> > you >> > > > > have >> > > > > > > two >> > > > > > > > > > > > completely >> > > > > > > > > > > > > different use cases: from the producing side and >> the >> > > > > consumer >> > > > > > > > side. >> > > > > > > > > > > > > >> > > > > > > > > > > > > schemaValidatorBuilder >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > I dislike this proposal. IMO Avro >> implementation is >> > > way >> > > > > too >> > > > > > > > > > > > complicated. >> > > > > > > > > > > > > Why not have a simple function for validation for >> > each >> > > > > switch >> > > > > > > > case >> > > > > > > > > > above? >> > > > > > > > > > > > > Why do we need strategy and builder, and all this >> > > > > complexity? >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > *Here are the basic compatibility rules we've >> > defined:* >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > IMO it's impossible to read the validation rules >> as >> > you >> > > > > > > described >> > > > > > > > > > them. >> > > > > > > > > > > > > I wrote how they should be structured numerous >> times >> > > > above. >> > > > > > > > > > > > > I can't validate them. >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > IMO, the current design is very hard to read. >> > > > > > > > > > > > > Please try to avoid jumping into code sections. >> > > > > > > > > > > > > Write a high level design section, in which you >> > > describe >> > > > in >> > > > > > > words >> > > > > > > > > > what >> > > > > > > > > > > > you >> > > > > > > > > > > > > plan to do. >> > > > > > > > > > > > > Write the validation rules in the structure that >> is >> > > easy >> > > > to >> > > > > > > > > > understand: >> > > > > > > > > > > > > rules per each compatibility check, and use proper >> > > words >> > > > > > > (current >> > > > > > > > > > schema, >> > > > > > > > > > > > > new schema), since new schema can be once used for >> > read >> > > > and >> > > > > > > once >> > > > > > > > > > used for >> > > > > > > > > > > > > write. >> > > > > > > > > > > > > >> > > > > > > > > > > > > In its current form it takes too much time to >> > > understand >> > > > > the >> > > > > > > > design, >> > > > > > > > > > and >> > > > > > > > > > > > it >> > > > > > > > > > > > > shouldn't be the case. >> > > > > > > > > > > > > >> > > > > > > > > > > > > Thanks, >> > > > > > > > > > > > > >> > > > > > > > > > > > > Asaf >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > On Sun, Mar 5, 2023 at 3:58 PM SiNan Liu < >> > > > > > > liusinan1...@gmail.com >> > > > > > > > > >> > > > > > > > > > wrote: >> > > > > > > > > > > > > >> > > > > > > > > > > > > > Hi! I updated the explanation of some things in >> the >> > > PIP >> > > > > > > issue. >> > > > > > > > And >> > > > > > > > > > also >> > > > > > > > > > > > > > added a new “flag” in the conf is used as the >> > > different >> > > > > > > > > > > > > > ProtobufNativeSchemaValidator implementation, >> also >> > > set >> > > > > > > > > > > > > > ProtobufNativeSchemaValidator default only check >> > > > whether >> > > > > > the >> > > > > > > > name >> > > > > > > > > > of >> > > > > > > > > > > > the >> > > > > > > > > > > > > > root message is the same. >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > Thanks, >> > > > > > > > > > > > > > sinan >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> >> 于2023年3月5日周日 >> > > > > 20:21写道: >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > > On Wed, Mar 1, 2023 at 4:33 PM SiNan Liu < >> > > > > > > > liusinan1...@gmail.com >> > > > > > > > > > > >> > > > > > > > > > > > wrote: >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > Can you please explain how a Protobuf >> Schema >> > > > > > descriptor >> > > > > > > > can >> > > > > > > > > > be >> > > > > > > > > > > > > > > validated >> > > > > > > > > > > > > > > > > for backward compatibility check using >> Avro >> > > based >> > > > > > > > > > compatibility >> > > > > > > > > > > > > > rules? >> > > > > > > > > > > > > > > > > Doesn't it expect the schema to be Avro, >> but >> > it >> > > > is >> > > > > > > > actually a >> > > > > > > > > > > > > > Protobuf >> > > > > > > > > > > > > > > > > descriptor? >> > > > > > > > > > > > > > > > > Is there some translation happening? >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > 1. *You can take a quick look at the >> previous >> > > > design, >> > > > > > the >> > > > > > > > > > PROTOBUF >> > > > > > > > > > > > uses >> > > > > > > > > > > > > > > > avro struct to store.* >> > > > > > > > > > > > > > > > https://github.com/apache/pulsar/pull/1954 >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L59-L61 >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L110-L115 >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Ok. So to summarize your code (easier to >> write it >> > > > than >> > > > > > send >> > > > > > > > > > links): >> > > > > > > > > > > > > > > * Pulsar Client, when used with Protobuf >> Schema, >> > > > > actually >> > > > > > > > > > converts >> > > > > > > > > > > > the >> > > > > > > > > > > > > > > Protobuf descriptor into an Avro Schema (using >> > code >> > > > > found >> > > > > > > > inside >> > > > > > > > > > Avro >> > > > > > > > > > > > > > > library) and saves that Avro schema as the >> > schema. >> > > > It's >> > > > > > not >> > > > > > > > > > saving >> > > > > > > > > > > > the >> > > > > > > > > > > > > > > protobuf descriptor at all. Very confusing I >> have >> > > to >> > > > > add >> > > > > > - >> > > > > > > > never >> > > > > > > > > > > > expected >> > > > > > > > > > > > > > > that. >> > > > > > > > > > > > > > > This explains why In the >> > > > > ProtobufSchemaCompatibilityCheck >> > > > > > > > they >> > > > > > > > > > just >> > > > > > > > > > > > > > extend >> > > > > > > > > > > > > > > the Avro without doing any translation. >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Thanks for that. >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Now thatI finally understand this, I can say >> > that: >> > > > you >> > > > > > > *must* >> > > > > > > > > > explain >> > > > > > > > > > > > > > that >> > > > > > > > > > > > > > > in the motivation part in your PIP. >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > 2. *On the broker side, protobuf and avro >> both >> > > use >> > > > > > > > `SchemaData` >> > > > > > > > > > > > > > converted >> > > > > > > > > > > > > > > > to `org.apache.avro.Schema`.* >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L1280-L1293 >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/ProtobufSchemaCompatibilityCheck.java#L26-L31 >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/AvroSchemaBasedCompatibilityCheck.java#L47-L70 >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Actually those links don't really help. >> > > > > > > > > > > > > > > The main link that helps is: >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://github.com/apache/pulsar/blob/ec102fb024a6ea2b195826778300f20e330dff06/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L102-L122 >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > I'm sorry - I don't understand. >> > > > > > > > > > > > > > > > > I understand the different compatibility >> > check >> > > > > > > > strategies. >> > > > > > > > > > > > > > > > > If you just spell them out here, then as >> you >> > > say, >> > > > > > just >> > > > > > > > > > translate >> > > > > > > > > > > > the >> > > > > > > > > > > > > > > > > Protobuf Descriptor into an Avro schema >> and >> > run >> > > > the >> > > > > > > Avro >> > > > > > > > > > > > > > > > > compatibility validation, no? >> > > > > > > > > > > > > > > > > I believe the answer is no, since you may >> > want >> > > to >> > > > > > > verify >> > > > > > > > > > > > different >> > > > > > > > > > > > > > > things >> > > > > > > > > > > > > > > > > when it comes to Protobuf, which are >> > different >> > > > then >> > > > > > > Avro. >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > 1. >> > > > > > > > > > > > > > > > *ProtobufSchema is different from >> > > > > ProtobufNativeSchema >> > > > > > in >> > > > > > > > that >> > > > > > > > > > it >> > > > > > > > > > > > uses >> > > > > > > > > > > > > > > > avro-protobuf.* >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://central.sonatype.com/artifact/org.apache.avro/avro-protobuf/1.11.1/overview >> > > > > > > > > > > > > > > > *ProtobufNativeSchema needs a native >> > > compatibility >> > > > > > check, >> > > > > > > > but >> > > > > > > > > > > > there is >> > > > > > > > > > > > > > no >> > > > > > > > > > > > > > > > official or third party implementation. So >> this >> > > PIP >> > > > > > does >> > > > > > > > not >> > > > > > > > > > use >> > > > > > > > > > > > > > > > avro-protobuf for protobuf compatibility >> > > checking.* >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > 2. *By the way, this is implemented in much >> the >> > > > same >> > > > > > way >> > > > > > > > that >> > > > > > > > > > > > Apache >> > > > > > > > > > > > > > avro >> > > > > > > > > > > > > > > > does compatibility checking.* >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/SchemaValidatorBuilder.java >> > > > > > > > > > > > > > > > >> > > > > > > `canReadStrategy`,`canBeReadStrategy`,`mutualReadStrategy` >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateCanRead.java >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateCanBeRead.java >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateMutualRead.java >> > > > > > > > > > > > > > > > *In `ValidateMutualRead.java`, the >> arguments of >> > > > > > > > `canRead()` are >> > > > > > > > > > > > > > > > writtenSchema and readSchema. We only need >> to >> > > > change >> > > > > > the >> > > > > > > > order >> > > > > > > > > > of >> > > > > > > > > > > > > > > arguments >> > > > > > > > > > > > > > > > we pass to `canRead()`.* >> > > > > > > > > > > > > > > > ```java >> > > > > > > > > > > > > > > > private void >> > > > > > validateWithStrategy(Descriptors.Descriptor >> > > > > > > > > > > > toValidate, >> > > > > > > > > > > > > > > > Descriptors.Descriptor fromDescriptor) >> throws >> > > > > > > > > > > > > > > ProtoBufCanReadCheckException >> > > > > > > > > > > > > > > > { >> > > > > > > > > > > > > > > > switch (strategy) { >> > > > > > > > > > > > > > > > case CanReadExistingStrategy -> >> > > > > canRead(fromDescriptor, >> > > > > > > > > > > > toValidate); >> > > > > > > > > > > > > > > > case CanBeReadByExistingStrategy -> >> > > > > canRead(toValidate, >> > > > > > > > > > > > > > fromDescriptor); >> > > > > > > > > > > > > > > > case CanBeReadMutualStrategy -> { >> > > > > > > > > > > > > > > > canRead(toValidate, fromDescriptor); >> > > > > > > > > > > > > > > > canRead(fromDescriptor, toValidate); >> > > > > > > > > > > > > > > > } >> > > > > > > > > > > > > > > > } >> > > > > > > > > > > > > > > > } >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > private void canRead(Descriptors.Descriptor >> > > > > > > writtenSchema, >> > > > > > > > > > > > > > > > Descriptors.Descriptor readSchema) throws >> > > > > > > > > > > > > > ProtoBufCanReadCheckException { >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> ProtobufNativeSchemaBreakCheckUtils.checkSchemaCompatibility(writtenSchema, >> > > > > > > > > > > > > > > > readSchema); >> > > > > > > > > > > > > > > > } >> > > > > > > > > > > > > > > > ``` >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > I get that you want to take inspiration from >> the >> > > > > existing >> > > > > > > > Avro >> > > > > > > > > > Schema >> > > > > > > > > > > > > > > compatibility check, to do your code design. >> > > > > > > > > > > > > > > I also understand you *won't* use any existing >> > avro >> > > > > code >> > > > > > > for >> > > > > > > > > > that. >> > > > > > > > > > > > > > > I also understand, you have to write the >> > validation >> > > > > check >> > > > > > > on >> > > > > > > > your >> > > > > > > > > > > > own, >> > > > > > > > > > > > > > > since there is no 3rd party to explain that. >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > The only thing I can't understand are the >> actual >> > > > rules >> > > > > > as I >> > > > > > > > wrote >> > > > > > > > > > > > before, >> > > > > > > > > > > > > > > since they are written confusingly. >> > > > > > > > > > > > > > > So, I repeat what I asked before: >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > I think you should structure the validation >> rules >> > > > > > > > differently: >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > * Backward checks >> > > > > > > > > > > > > > > ** List down rules, where use newSchema (the >> > schema >> > > > > used >> > > > > > by >> > > > > > > > > > producer >> > > > > > > > > > > > or >> > > > > > > > > > > > > > > consumer) and existingSchema (last schema >> used) >> > > > > > > > > > > > > > > * Forward >> > > > > > > > > > > > > > > ** List down rules, where use newSchema (the >> > schema >> > > > > used >> > > > > > by >> > > > > > > > > > producer >> > > > > > > > > > > > or >> > > > > > > > > > > > > > > consumer) and existingSchema (last schema >> used) >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Once that's accomplished I will be able to >> > > understand >> > > > > the >> > > > > > > > > > different >> > > > > > > > > > > > > > > validation rules for each compatibility check. >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > Thanks, >> > > > > > > > > > > > > > > > sinan >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> >> > 于2023年3月1日周三 >> > > > > > > 21:19写道: >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > On Mon, Feb 27, 2023 at 3:47 PM SiNan Liu >> < >> > > > > > > > > > > > liusinan1...@gmail.com> >> > > > > > > > > > > > > > > > wrote: >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > I read it and they look identical. >> What's >> > > the >> > > > > > > > difference >> > > > > > > > > > > > between >> > > > > > > > > > > > > > > > them? >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > Current avro,json, and protobuf schemas >> are >> > > all >> > > > > > > > implemented >> > > > > > > > > > > > based >> > > > > > > > > > > > > > on >> > > > > > > > > > > > > > > > AVRO. >> > > > > > > > > > > > > > > > > > > What do you mean, they are all >> > implemented >> > > > > based >> > > > > > on >> > > > > > > > > > Avro? You >> > > > > > > > > > > > > > mean >> > > > > > > > > > > > > > > > the >> > > > > > > > > > > > > > > > > > > protobuf schema is converted into an >> Avro >> > > > > Schema, >> > > > > > > and >> > > > > > > > > > then >> > > > > > > > > > > > you >> > > > > > > > > > > > > > use >> > > > > > > > > > > > > > > > Avro >> > > > > > > > > > > > > > > > > > > compatibility validation? >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> `org.apache.pulsar.broker.service.schema.ProtobufSchemaCompatibilityCheck` >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > >> > > > > >> > `org.apache.pulsar.broker.service.schema.AvroSchemaCompatibilityCheck` >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > >> > > > > >> > `org.apache.pulsar.broker.service.schema.JsonSchemaCompatibilityCheck` >> > > > > > > > > > > > > > > > > > They all extends >> > > > > > `AvroSchemaBasedCompatibilityCheck`, >> > > > > > > > the >> > > > > > > > > > > > > > > > > > `checkCompatible()` is the same >> > > implementation >> > > > > with >> > > > > > > > `AVRO`. >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > Can you please explain how a Protobuf >> Schema >> > > > > > descriptor >> > > > > > > > can >> > > > > > > > > > be >> > > > > > > > > > > > > > > validated >> > > > > > > > > > > > > > > > > for backward compatibility check using >> Avro >> > > based >> > > > > > > > > > compatibility >> > > > > > > > > > > > > > rules? >> > > > > > > > > > > > > > > > > Doesn't it expect the schema to be Avro, >> but >> > it >> > > > is >> > > > > > > > actually a >> > > > > > > > > > > > > > Protobuf >> > > > > > > > > > > > > > > > > descriptor? >> > > > > > > > > > > > > > > > > Is there some translation happening? >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > I think you should structure the >> validation >> > > > rules >> > > > > > > > > > differently: >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > The Compatibility check strategy is >> > described >> > > > on >> > > > > > the >> > > > > > > > > > website >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://pulsar.apache.org/docs/next/schema-understand/#schema-compatibility-check-strategy >> > > > > > > > > > > > > > > > > > 1. BACKWARD(CanReadExistingStrategy): >> > > Consumers >> > > > > > using >> > > > > > > > > > schema >> > > > > > > > > > > > V3 can >> > > > > > > > > > > > > > > > process >> > > > > > > > > > > > > > > > > > data written by producers using the last >> > > schema >> > > > > > > version >> > > > > > > > > > V2. So >> > > > > > > > > > > > V2 >> > > > > > > > > > > > > > is >> > > > > > > > > > > > > > > > > > "writtenSchema" and V3 is "readSchema". >> > > > > > > > > > > > > > > > > > 2. FORWARD(CanBeReadByExistingStrategy): >> > > > > Consumers >> > > > > > > > using >> > > > > > > > > > the >> > > > > > > > > > > > last >> > > > > > > > > > > > > > > > schema >> > > > > > > > > > > > > > > > > > version V2 can process data written by >> > > > producers >> > > > > > > using >> > > > > > > > a >> > > > > > > > > > new >> > > > > > > > > > > > schema >> > > > > > > > > > > > > > > V3, >> > > > > > > > > > > > > > > > > > even though they may not be able to use >> the >> > > > full >> > > > > > > > > > capabilities >> > > > > > > > > > > > of >> > > > > > > > > > > > > > the >> > > > > > > > > > > > > > > > new >> > > > > > > > > > > > > > > > > > schema. So V3 is "writtenSchema" and V2 >> is >> > > > > > > > "readSchema". >> > > > > > > > > > > > > > > > > > 3. FULL(CanBeReadMutualStrategy): >> Schemas >> > are >> > > > > both >> > > > > > > > > > backward and >> > > > > > > > > > > > > > > forward >> > > > > > > > > > > > > > > > > > compatible. >> > > > > > > > > > > > > > > > > > Schema can evolve. The old version >> schema >> > and >> > > > the >> > > > > > new >> > > > > > > > > > version >> > > > > > > > > > > > > > schema >> > > > > > > > > > > > > > > > should >> > > > > > > > > > > > > > > > > > be well understood. >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > I'm sorry - I don't understand. >> > > > > > > > > > > > > > > > > I understand the different compatibility >> > check >> > > > > > > > strategies. >> > > > > > > > > > > > > > > > > If you just spell them out here, then as >> you >> > > say, >> > > > > > just >> > > > > > > > > > translate >> > > > > > > > > > > > the >> > > > > > > > > > > > > > > > > Protobuf Descriptor into an Avro schema >> and >> > run >> > > > the >> > > > > > > Avro >> > > > > > > > > > > > > > > > > compatibility validation, no? >> > > > > > > > > > > > > > > > > I believe the answer is no, since you may >> > want >> > > to >> > > > > > > verify >> > > > > > > > > > > > different >> > > > > > > > > > > > > > > things >> > > > > > > > > > > > > > > > > when it comes to Protobuf, which are >> > different >> > > > then >> > > > > > > Avro. >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > At the current state, I can't understand >> your >> > > > > design >> > > > > > at >> > > > > > > > all. >> > > > > > > > > > > > Please >> > > > > > > > > > > > > > > help >> > > > > > > > > > > > > > > > > clarify that. >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > So each strategy should have its own >> > section. >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > The arguments of `canRead()` are >> > > writtenSchema >> > > > > and >> > > > > > > > > > readSchema. >> > > > > > > > > > > > As >> > > > > > > > > > > > > > > we've >> > > > > > > > > > > > > > > > > > just described, we only need to change >> the >> > > > order >> > > > > of >> > > > > > > > > > arguments >> > > > > > > > > > > > we >> > > > > > > > > > > > > > pass >> > > > > > > > > > > > > > > > to >> > > > > > > > > > > > > > > > > > `canRead()`. >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > Thanks, >> > > > > > > > > > > > > > > > > > sinan >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> >> > > > > 于2023年2月27日周一 >> > > > > > > > 20:49写道: >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > And you can see the difference >> between >> > > > > ProtoBuf >> > > > > > > and >> > > > > > > > > > > > > > > ProtoBufNative: >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > >> > > > > >> > https://pulsar.apache.org/docs/next/schema-get-started/#protobufnative >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > >> > https://pulsar.apache.org/docs/next/schema-get-started/#protobuf >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > I read it and they look identical. >> > What's >> > > > the >> > > > > > > > difference >> > > > > > > > > > > > between >> > > > > > > > > > > > > > > > them? >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > Current avro,json, and protobuf >> schemas >> > are >> > > > all >> > > > > > > > > > implemented >> > > > > > > > > > > > based >> > > > > > > > > > > > > > > on >> > > > > > > > > > > > > > > > > > AVRO. >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > What do you mean, they are all >> > implemented >> > > > > based >> > > > > > on >> > > > > > > > > > Avro? You >> > > > > > > > > > > > > > mean >> > > > > > > > > > > > > > > > the >> > > > > > > > > > > > > > > > > > > protobuf schema is converted into an >> Avro >> > > > > Schema, >> > > > > > > and >> > > > > > > > > > then >> > > > > > > > > > > > you >> > > > > > > > > > > > > > use >> > > > > > > > > > > > > > > > Avro >> > > > > > > > > > > > > > > > > > > compatibility validation? >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > *Here are the basic compatibility >> rules >> > > > we've >> > > > > > > > defined:* >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > I think you should structure the >> > validation >> > > > > rules >> > > > > > > > > > > > differently: >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > * Backward checks >> > > > > > > > > > > > > > > > > > > ** List down rules, where use >> newSchema >> > > (the >> > > > > > schema >> > > > > > > > used >> > > > > > > > > > by >> > > > > > > > > > > > > > > producer >> > > > > > > > > > > > > > > > or >> > > > > > > > > > > > > > > > > > > consumer) and existingSchema (last >> schema >> > > > used) >> > > > > > > > > > > > > > > > > > > * Forward >> > > > > > > > > > > > > > > > > > > ** List down rules, where use >> newSchema >> > > (the >> > > > > > schema >> > > > > > > > used >> > > > > > > > > > by >> > > > > > > > > > > > > > > producer >> > > > > > > > > > > > > > > > or >> > > > > > > > > > > > > > > > > > > consumer) and existingSchema (last >> schema >> > > > used) >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > So each strategy should have its own >> > > section. >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > I'm saying this since you used >> > > > "writttenSchema" >> > > > > > > word >> > > > > > > > but >> > > > > > > > > > it >> > > > > > > > > > > > > > > > represents >> > > > > > > > > > > > > > > > > > > something completely different if it's >> > > > backward >> > > > > > or >> > > > > > > > > > forward >> > > > > > > > > > > > check. >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > Once you'll have that structure like >> > that, >> > > I >> > > > > > > > personally >> > > > > > > > > > will >> > > > > > > > > > > > be >> > > > > > > > > > > > > > > able >> > > > > > > > > > > > > > > > to >> > > > > > > > > > > > > > > > > > > read and understand it. >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > The motivation and problem statement >> are >> > > now >> > > > > > good - >> > > > > > > > > > thanks >> > > > > > > > > > > > for >> > > > > > > > > > > > > > > > improving >> > > > > > > > > > > > > > > > > > > it. >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > On Mon, Feb 27, 2023 at 8:20 AM SiNan >> > Liu < >> > > > > > > > > > > > > > liusinan1...@gmail.com> >> > > > > > > > > > > > > > > > > > wrote: >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > Hi! I updated the PIP issue again. >> This >> > > > time >> > > > > > I've >> > > > > > > > added >> > > > > > > > > > > > some >> > > > > > > > > > > > > > > > background >> > > > > > > > > > > > > > > > > > > and >> > > > > > > > > > > > > > > > > > > > some explanations. >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > The compatibility check rules are >> > already >> > > > > > written >> > > > > > > > in >> > > > > > > > > > the >> > > > > > > > > > > > > > > > > > Implementation. >> > > > > > > > > > > > > > > > > > > > ProtoBufNative implements the same >> > > canRead >> > > > > > method >> > > > > > > > as >> > > > > > > > > > Apache >> > > > > > > > > > > > > > Avro. >> > > > > > > > > > > > > > > > > > > > It does this by checking whether the >> > > schema >> > > > > for >> > > > > > > > > > writing and >> > > > > > > > > > > > > > > reading >> > > > > > > > > > > > > > > > is >> > > > > > > > > > > > > > > > > > > > compatible. I also indicate whether >> the >> > > > > > > > writtenSchema >> > > > > > > > > > and >> > > > > > > > > > > > > > > > readSchema of >> > > > > > > > > > > > > > > > > > > the >> > > > > > > > > > > > > > > > > > > > Backward, Forward, and Full >> strategies >> > > are >> > > > > the >> > > > > > > old >> > > > > > > > or >> > > > > > > > > > the >> > > > > > > > > > > > new >> > > > > > > > > > > > > > > > version >> > > > > > > > > > > > > > > > > > of >> > > > > > > > > > > > > > > > > > > > the schema. >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > Thanks, >> > > > > > > > > > > > > > > > > > > > sinan >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> >> > > > > > > 于2023年2月26日周日 >> > > > > > > > > > 23:24写道: >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > I'm sorry, but this PIP lacks a >> lot >> > of >> > > > > > > background >> > > > > > > > > > > > knowledge, >> > > > > > > > > > > > > > so >> > > > > > > > > > > > > > > > you >> > > > > > > > > > > > > > > > > > > need >> > > > > > > > > > > > > > > > > > > > to >> > > > > > > > > > > > > > > > > > > > > add IMO for people to understand >> it. >> > > You >> > > > > > don't >> > > > > > > > need >> > > > > > > > > > to >> > > > > > > > > > > > > > explain >> > > > > > > > > > > > > > > > the >> > > > > > > > > > > > > > > > > > > entire >> > > > > > > > > > > > > > > > > > > > > pulsar in this PIP, but at the >> very >> > > > least a >> > > > > > few >> > > > > > > > > > > > paragraphs >> > > > > > > > > > > > > > > > detailing >> > > > > > > > > > > > > > > > > > > all >> > > > > > > > > > > > > > > > > > > > > you need to know, to put you in >> > > context: >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > - Start by saying Pulsar as a >> > > built-in >> > > > > > > schema >> > > > > > > > > > registry >> > > > > > > > > > > > > > > inside >> > > > > > > > > > > > > > > > > > Pulsar >> > > > > > > > > > > > > > > > > > > > > broker. >> > > > > > > > > > > > > > > > > > > > > - Every time the client >> updates >> > > the >> > > > > > > > schema, it >> > > > > > > > > > > > uploads >> > > > > > > > > > > > > > it >> > > > > > > > > > > > > > > > to >> > > > > > > > > > > > > > > > > > the >> > > > > > > > > > > > > > > > > > > > > broker. When that happens, >> it >> > > has a >> > > > > > > feature >> > > > > > > > > > which >> > > > > > > > > > > > > > > validates >> > > > > > > > > > > > > > > > if >> > > > > > > > > > > > > > > > > > > the >> > > > > > > > > > > > > > > > > > > > > new >> > > > > > > > > > > > > > > > > > > > > schema version is compatible >> > with >> > > > the >> > > > > > > > previous >> > > > > > > > > > > > > > versions. >> > > > > > > > > > > > > > > > There >> > > > > > > > > > > > > > > > > > > > > are 4 types >> > > > > > > > > > > > > > > > > > > > > of compatibility: Full, ... >> > > > (complete >> > > > > > and >> > > > > > > > > > explain >> > > > > > > > > > > > each >> > > > > > > > > > > > > > > one >> > > > > > > > > > > > > > > > > > > briefly) >> > > > > > > > > > > > > > > > > > > > > - Also explain Pulsar Schema >> > > registry >> > > > > > > supports >> > > > > > > > > > various >> > > > > > > > > > > > > > > schema >> > > > > > > > > > > > > > > > > > > > > protocols: Avro, protobuf >> native, >> > > ... >> > > > > > > > (complete >> > > > > > > > > > the >> > > > > > > > > > > > > > rest), >> > > > > > > > > > > > > > > > each >> > > > > > > > > > > > > > > > > > > > > protocol >> > > > > > > > > > > > > > > > > > > > > has a schema which dictates >> how to >> > > > > > serialize >> > > > > > > > and >> > > > > > > > > > > > > > deserialize >> > > > > > > > > > > > > > > > the >> > > > > > > > > > > > > > > > > > > > message >> > > > > > > > > > > > > > > > > > > > > content into typed object. >> > > > > > > > > > > > > > > > > > > > > - Explain in short what is >> > protobuf >> > > > > native >> > > > > > > > > > (compare >> > > > > > > > > > > > > > protobuf >> > > > > > > > > > > > > > > > > > > > non-native) >> > > > > > > > > > > > > > > > > > > > > - Please don't paste code >> instead >> > of >> > > > > > > > explaining. >> > > > > > > > > > > > > > > > > > > > > - Explain that protobuf >> native >> > > > > current >> > > > > > > > > > validation >> > > > > > > > > > > > check >> > > > > > > > > > > > > > > is >> > > > > > > > > > > > > > > > only >> > > > > > > > > > > > > > > > > > > > > composed of checking the >> root >> > > > message >> > > > > > > name >> > > > > > > > is >> > > > > > > > > > the >> > > > > > > > > > > > same >> > > > > > > > > > > > > > > > between >> > > > > > > > > > > > > > > > > > > > > the current >> > > > > > > > > > > > > > > > > > > > > schema version and the new >> > > version. >> > > > > > > > > > > > > > > > > > > > > - Explain briefly what >> is a >> > > root >> > > > > > > message >> > > > > > > > > > and its >> > > > > > > > > > > > > > name. >> > > > > > > > > > > > > > > > > > > > > - Explain the problem (list >> > > > > scenarios) >> > > > > > > > that we >> > > > > > > > > > have >> > > > > > > > > > > > > > > because >> > > > > > > > > > > > > > > > > > > > protobuf >> > > > > > > > > > > > > > > > > > > > > native schema only supports >> > FULL >> > > > > > > > compatibility >> > > > > > > > > > > > > > > validation. >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > Regarding high level design - as >> in >> > > what >> > > > > you >> > > > > > > > plan to >> > > > > > > > > > do. >> > > > > > > > > > > > > > > > > > > > > I suggest you add "High Level >> Design" >> > > and >> > > > > in >> > > > > > it >> > > > > > > > > > detail >> > > > > > > > > > > > how >> > > > > > > > > > > > > > you >> > > > > > > > > > > > > > > > plan >> > > > > > > > > > > > > > > > > > to >> > > > > > > > > > > > > > > > > > > > > validate, per protobuf version, >> per >> > > > > > > compatibility >> > > > > > > > > > check >> > > > > > > > > > > > > > > > (backward, >> > > > > > > > > > > > > > > > > > > > forward, >> > > > > > > > > > > > > > > > > > > > > full,...). >> > > > > > > > > > > > > > > > > > > > > I tried reading the >> implementation - >> > > for >> > > > > me , >> > > > > > > > it's >> > > > > > > > > > all >> > > > > > > > > > > > over >> > > > > > > > > > > > > > the >> > > > > > > > > > > > > > > > > > place. >> > > > > > > > > > > > > > > > > > > > Can >> > > > > > > > > > > > > > > > > > > > > you please list in order what I >> wrote >> > > > > above, >> > > > > > > and >> > > > > > > > > > list the >> > > > > > > > > > > > > > > > validation >> > > > > > > > > > > > > > > > > > > > rules >> > > > > > > > > > > > > > > > > > > > > with a good explanation why you >> > > validate >> > > > it >> > > > > > > like >> > > > > > > > > > that? >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > Lastly, one you have all the >> > validation >> > > > > rules >> > > > > > > > clearly >> > > > > > > > > > > > stated, >> > > > > > > > > > > > > > > you >> > > > > > > > > > > > > > > > can >> > > > > > > > > > > > > > > > > > > use >> > > > > > > > > > > > > > > > > > > > > it to document it properly so >> users >> > can >> > > > > know >> > > > > > > what >> > > > > > > > > > > > validation >> > > > > > > > > > > > > > to >> > > > > > > > > > > > > > > > > > expect. >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > Thanks, >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > Asaf >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > On Wed, Feb 22, 2023 at 5:10 PM >> SiNan >> > > > Liu < >> > > > > > > > > > > > > > > > liusinan1...@gmail.com> >> > > > > > > > > > > > > > > > > > > > wrote: >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > Sorry, my mistake. I removed the >> > code >> > > > and >> > > > > > > > > > described the >> > > > > > > > > > > > > > > design >> > > > > > > > > > > > > > > > to >> > > > > > > > > > > > > > > > > > > > improve >> > > > > > > > > > > > > > > > > > > > > > the PROTOBUF_NATIVE schema >> > > > compatibility >> > > > > > > > checks. >> > > > > > > > > > You >> > > > > > > > > > > > can >> > > > > > > > > > > > > > > have a >> > > > > > > > > > > > > > > > > > look. >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > Asaf Mesika < >> asaf.mes...@gmail.com >> > > >> > > > > > > > 于2023年2月22日周三 >> > > > > > > > > > > > 21:16写道: >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > I read it but you're almost >> > > directly >> > > > > > diving >> > > > > > > > into >> > > > > > > > > > the >> > > > > > > > > > > > > > code - >> > > > > > > > > > > > > > > > it >> > > > > > > > > > > > > > > > > > will >> > > > > > > > > > > > > > > > > > > > > take >> > > > > > > > > > > > > > > > > > > > > > me >> > > > > > > > > > > > > > > > > > > > > > > hours just to reverse engineer >> > your >> > > > > > design. >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > Can you please include a "High >> > > Level >> > > > > > > Design" >> > > > > > > > > > section >> > > > > > > > > > > > in >> > > > > > > > > > > > > > > which >> > > > > > > > > > > > > > > > you >> > > > > > > > > > > > > > > > > > > > > explain >> > > > > > > > > > > > > > > > > > > > > > > how you plan to tackle any >> issue? >> > > > > > > > > > > > > > > > > > > > > > > If I can read that section and >> > > > explain >> > > > > to >> > > > > > > > someone >> > > > > > > > > > > > else >> > > > > > > > > > > > > > how >> > > > > > > > > > > > > > > > this >> > > > > > > > > > > > > > > > > > > will >> > > > > > > > > > > > > > > > > > > > > > work, >> > > > > > > > > > > > > > > > > > > > > > > it means the section is >> complete. >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > Let's leave the code to the >> PRs. >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > On Sun, Feb 19, 2023 at 2:59 >> PM >> > > SiNan >> > > > > > Liu < >> > > > > > > > > > > > > > > > > > liusinan1...@gmail.com> >> > > > > > > > > > > > > > > > > > > > > > wrote: >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > Hi all, >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > I made a PIP to discuss: >> > > > > > > > > > > > > > > > > > > > > >> > > > > > https://github.com/apache/pulsar/issues/19565 >> > > > > > > > > > > > > > > > > > > > > > . >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > We can talk about the >> current >> > > > design >> > > > > > > here. >> > > > > > > > > > > > Especially >> > > > > > > > > > > > > > for >> > > > > > > > > > > > > > > > the >> > > > > > > > > > > > > > > > > > > field >> > > > > > > > > > > > > > > > > > > > > > type >> > > > > > > > > > > > > > > > > > > > > > > > change check rules, please >> give >> > > > your >> > > > > > > > valuable >> > > > > > > > > > > > advice. >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > Thanks, >> > > > > > > > > > > > > > > > > > > > > > > > Sinan >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> >