+1. I think keeping the `_sorted_fields` and `_required` defaults consistent between the Python and Java clients is the way to go.
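
To make the difference concrete, here is a rough sketch of how the two defaults discussed in the quoted message change the generated Avro schema. This is not the Pulsar client's code: `avro_record_schema` and its `required_by_default` parameter are hypothetical names made up for illustration, and it only models field ordering and the `["null", T]` union, nothing else the real clients may emit.

```python
import json


def avro_record_schema(name, fields, sorted_fields=False, required_by_default=False):
    """Build an Avro-style record schema as a dict (hypothetical helper).

    fields is a list of (field_name, avro_type, required) tuples, where
    required may be None to mean "use required_by_default".
    """
    # Sorting by field name mimics what `_sorted_fields = True` is described as doing.
    items = sorted(fields, key=lambda f: f[0]) if sorted_fields else list(fields)
    out = []
    for field_name, avro_type, required in items:
        if required is None:
            required = required_by_default
        # An optional field becomes a union with "null"; a required one keeps the bare type.
        out.append({"name": field_name, "type": avro_type if required else ["null", avro_type]})
    return {"type": "record", "name": name, "fields": out}


# The User example from the thread; String stays optional in both cases.
user_fields = [("name", "string", False), ("age", "int", None), ("score", "double", None)]

# Current Python defaults as described: declaration order, nothing required.
python_style = avro_record_schema("User", user_fields)

# Proposed defaults: sorted fields, non-string types required.
java_style = avro_record_schema("User", user_fields, sorted_fields=True, required_by_default=True)

print(json.dumps(python_style, indent=2))
print(json.dumps(java_style, indent=2))
```

With the first set of defaults `age` comes out as `["null", "int"]` in declaration order; with the second it is a plain `"int"` and the fields are ordered `age`, `name`, `score`, which is the shape the thread says the Java client produces.
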
> On Mar 29, 2023, at 7:09 AM, Yunze Xu <y...@streamnative.io.INVALID> wrote:
>
> I found the Python client has two options to control the behavior:
> 1. Set `_sorted_fields`. It's false by default in the Python client,
> but it's true in the Java client. i.e. the Java client sorts all
> fields by default.
> 2. Set `_required`. It's false by default for all types in the Python
> client, but it's only false for the string type in the Java client.
>
> i.e. given the following Java class:
>
> ```java
> class User {
>     String name;
>     int age;
>     double score;
> }
> ```
>
> We have to give the following definition in Python:
>
> ```python
> class User(Record):
>     _sorted_fields = True
>     name = String()
>     age = Integer(required=True)
>     score = Double(required=True)
> ```
>
> I see https://github.com/apache/pulsar/pull/12232 adds the
> `_sorted_fields` field and disables the field sort by default. It
> breaks compatibility with the Java client.
>
> IMO, we should make `_sorted_fields` true by default and `_required`
> true for all types other than `String` by default.
>
> Thanks,
> Yunze
>
> On Wed, Mar 29, 2023 at 4:00 PM Yunze Xu <y...@streamnative.io> wrote:
>>
>> Hi all,
>>
>> Recently I found the default generated schema definition in the Python
>> client is different from the Java client, which leads to some
>> unexpected behavior.
>>
>> For example, given the following class definition in Python:
>>
>> ```python
>> class Data(Record):
>>     i = Integer()
>> ```
>>
>> The type of `i` field is a union: "type": ["null", "int"]
>>
>> While given the following class definition in Java:
>>
>> ```java
>> class Data {
>>     private final int i;
>>     /* ... */
>> }
>> ```
>>
>> The type of `i` field is an integer: "type": "int"
>>
>> It brings an issue that if a Python consumer subscribes to a topic
>> with schema defined above, then a Java producer will fail to create
>> because of the schema incompatibility.
>>
>> Currently, the workaround is to change the schema compatibility
>> strategy to FORWARD.
>>
>> Should we change the way to generate schema definition in the Python
>> client to be compatible with the Java client? It could bring breaking
>> changes to old Python clients, but it could guarantee compatibility
>> with the Java client.
>>
>> If not, we still have to introduce an extra configuration to make
>> Python schema compatible with Java schema. But it requires code
>> changes. e.g. here is a possible solution:
>>
>> ```python
>> class Data(Record):
>>     # NOTE: Users might have to add this extra field to control how to
>>     # generate the schema
>>     __java_compatible = True
>>     i = Integer()
>> ```
>>
>> Thanks,
>> Yunze