Sijie,
Thanks for your explanation.

Now it is clear to me that it won't be a problem for users.

I will fix all of the tests on my PR


Enrico

Il Mar 16 Mar 2021, 19:15 Sijie Guo <guosi...@gmail.com> ha scritto:

> I would be surprised that it would break compatibility just because it
> writes the schema definition in a different order.
>
> The test validating how schema definition was written should be improved. I
> don't see that breaks the compatibility.
>
> The test case you have is problematic. If you want to verify the schema is
> compatible, you need to use the writer schema and reader schema in order.
>
> See:
>
> https://avro.apache.org/docs/1.8.2/api/java/org/apache/avro/generic/GenericDatumReader.html#GenericDatumReader(org.apache.avro.Schema,%20org.apache.avro.Schema)
>
> - Sijie
>
>
>
> On Tue, Mar 16, 2021 at 9:21 AM Enrico Olivelli <eolive...@gmail.com>
> wrote:
>
> > Matteo,
> > actually it breaks
> >
> > @Test
> > public void helloCompatibility() throws Exception {
> >    // sampled data created using Pulsar with Avro 1.9
> >    String schemaAvro19 =
> >
> >
> "{\"type\":\"record\",\"name\":\"MyPojo\",\"namespace\":\"SenderTest\",\"fields\":[{\"name\":\"foo\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"bar\",\"type\":[\"null\",\"int\"],\"default\":null}]}";
> >     byte[] encoded = Base64.getDecoder().decode("AgRmMQKcAQ==");
> >
> >    // let's update to Pulsar 2.8.0 with Avro 1.10.1
> >     Schema<MyPojo> schema = Schema.AVRO(MyPojo.class);
> >
> >     String schemaAvro110 = schema.getSchemaInfo().getSchemaDefinition();
> >
> >     System.out.println("SCHEMA110:"+schemaAvro110);
> >     System.out.println("SCHEMA19:"+schemaAvro19);
> >     System.out.println("ENCODED: "+
> > Base64.getEncoder().encodeToString(encoded));
> >
> >    // java.lang.ArrayIndexOutOfBoundsException: 51
> >     MyPojo decoded = schema.decode(encoded);
> >
> >     MyPojo foo = new MyPojo();
> >     foo.setFoo("f1");
> >     foo.setBar(78);
> >     assertEquals(decoded, foo);
> >
> > }
> >
> > @ToString
> > @Data
> > @EqualsAndHashCode
> > public static final class MyPojo {
> >     String foo;
> >     Integer bar;
> > }
> >
> > Here you can see that the
> >
> >
> >
> SCHEMA110:{"type":"record","name":"MyPojo","namespace":"SenderTest","fields":[{"name":"bar","type":["null","int"],"default":null},{"name":"foo","type":["null","string"],"default":null}]}
> >
> >
> SCHEMA19:{"type":"record","name":"MyPojo","namespace":"SenderTest","fields":[{"name":"foo","type":["null","string"],"default":null},{"name":"bar","type":["null","int"],"default":null}]}
> > ENCODED: AgRmMQKcAQ==
> > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time
> > elapsed: 0.326 s <<< FAILURE! - in SenderTest
> > [ERROR] helloCompatibility  Time elapsed: 0.313 s  <<< ERROR!
> > java.lang.ArrayIndexOutOfBoundsException: 51
> > at SenderTest.helloCompatibility(SenderTest.java:88)
> >
> > But the problem does not happen if I am using a Pulsar Producer with
> > Avro 1.9 and the Consumer with 1.10, there is something that is
> > twaking the schema on the Consumer side and logs report an "adapted
> > schema" that works
> >
> >
> > Enrico
> >
> > Il giorno mar 16 mar 2021 alle ore 16:33 Matteo Merli
> > <mme...@apache.org> ha scritto:
> > >
> > > A good test to add would be to include  the object serialized by avro
> 1.9
> > > and ensure we can read it with avro 1.10.
> > >
> > > On Tue, Mar 16, 2021 at 8:13 AM Enrico Olivelli <eolive...@gmail.com>
> > wrote:
> > >
> > > > Sijie.
> > > > let me write an example.
> > > >
> > > > if you have this Pojo class
> > > > class Pojo {
> > > >     String foo;
> > > >     String bar;
> > > > }
> > > >
> > > > if you use Schema.AVRO(Pojo.class)
> > > >
> > > > with Avro 1.9 you will probably see a Schema with the fields in this
> > > > order: foo, bar
> > > > with Avro 1.10 you will see exactly the schema with bar, foo. (in
> > > > alphabetical order)
> > > >
> > > > This is a potential problem because the same Java code that used to
> > > > write/read the "foo,bar" schema after upgrading to the new version of
> > > > Pulsar (with Avro 1.10) it will start writing with a different schema
> > > > ("bar,foo")
> > > > and those two schemas are not compatible to each other. Because you
> > > > cannot change the order of fields.
> > > >
> > > > We have tests that verify that Schema.AVRO(Pojo.class) produces a
> > > > schema in the expected order of fields, and they are failing.
> > > > I had started to fix all of the tests (the fix is trivial), but as
> far
> > > > as I was fixing them I realized that this change may cause unexpected
> > > > problems to users.
> > > >
> > > > you can see the errors here on CI on the Avro upgrade patch
> > > >
> >
> https://github.com/apache/pulsar/pull/9898/checks?check_run_id=2106075182
> > > >
> > > > I have tried to reproduce a problem with real clients but I am not
> > > > able to produce an error.
> > > >
> > > > I am missing some part of the story, do you have an explanation ?
> > > >
> > > > Enrico
> > > >
> > > >
> > > > Il giorno mar 16 mar 2021 alle ore 02:13 Sijie Guo
> > > > <guosi...@gmail.com> ha scritto:
> > > > >
> > > > > I don't quite understand the compatibility issue here. Looking into
> > > > > AVRO-2579, it seems that only the order of fields returned will be
> > > > > different. We don't depend on the ordering for the compatibility
> > checks.
> > > > >
> > > > > Can you explain more about the compatibility issue you refer to?
> > > > >
> > > > > Thanks,
> > > > > Sijie
> > > > >
> > > > > On Mon, Mar 15, 2021 at 3:52 AM Enrico Olivelli <
> eolive...@gmail.com
> > >
> > > > wrote:
> > > > >
> > > > > > Hello,
> > > > > > I am working on the upgrade of Avro from 1.9 to 1.10.1.
> > > > > >
> > > > > > I noticed a bad behaviour change about Schema.AVRO(Pojo.class).
> > > > > > Basically the new Avro version creates the Schema by sorting the
> > > > > > fields in alphabetical order and this is an incompatible schema
> > > > > > change.
> > > > > >
> > > > > > More details here
> > > > > > https://github.com/apache/pulsar/pull/9898
> > > > > >
> > > > > > and here
> > > > > > https://issues.apache.org/jira/browse/AVRO-2579
> > > > > >
> > > > > > Am I missing something?
> > > > > >
> > > > > > Thoughts ?
> > > > > >
> > > > > > Enrico
> > > > > >
> > > >
> > > --
> > > --
> > > Matteo Merli
> > > <mme...@apache.org>
> >
>

Reply via email to