I found the Python client has two options to control the behavior:

1. Set `_sorted_fields`. It's false by default in the Python client, but it's true in the Java client, i.e. the Java client sorts all fields by default.
2. Set `_required`. It's false by default for all types in the Python client, but it's only false for the string type in the Java client.
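To illustrate what these two options change, here is a minimal sketch in plain Python. It does not use the Pulsar client at all: `avro_fields` is a hypothetical helper that only mimics how the `fields` list of the generated Avro schema differs, assuming a non-required field becomes a union with `"null"` and a required field keeps its plain type. The real generated schema carries more attributes than shown here.

```python
import json

# Hypothetical helper, NOT part of the Pulsar client. It only mimics how the
# two options affect the "fields" list of an Avro record schema.
def avro_fields(fields, sorted_fields=False):
    """fields: list of (name, avro_type, required) tuples in declaration order."""
    if sorted_fields:
        # Sort fields by name instead of keeping the declaration order.
        fields = sorted(fields, key=lambda f: f[0])
    result = []
    for name, avro_type, required in fields:
        # A non-required field becomes a union with "null";
        # a required field keeps the plain type.
        result.append({
            "name": name,
            "type": avro_type if required else ["null", avro_type],
        })
    return result

# (name, type, required) as the Java client would treat them, following the
# description above: only the string field is not required.
user_fields = [
    ("name", "string", False),
    ("age", "int", True),
    ("score", "double", True),
]

# Python defaults: declaration order, nothing required.
python_default = avro_fields([(n, t, False) for n, t, _ in user_fields])
# Java-like: sorted fields, non-string types required.
java_like = avro_fields(user_fields, sorted_fields=True)

print(json.dumps(python_default, indent=2))
print(json.dumps(java_like, indent=2))
```

With the Python defaults every field is a nullable union in declaration order, while the Java-like variant lists `age`, `name`, `score` and only `name` stays nullable.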
For example, given the following Java class:

```java
class User {
    String name;
    int age;
    double score;
}
```

we have to give the following definition in Python:

```python
from pulsar.schema import Record, String, Integer, Double

class User(Record):
    # Sort fields like the Java client does
    _sorted_fields = True
    name = String()
    age = Integer(required=True)
    score = Double(required=True)
```

I see https://github.com/apache/pulsar/pull/12232 adds the `_sorted_fields` field and disables field sorting by default, which breaks compatibility with the Java client. IMO, we should make `_sorted_fields` true by default and `_required` true by default for all types other than `String`.

Thanks,
Yunze

On Wed, Mar 29, 2023 at 4:00 PM Yunze Xu <y...@streamnative.io> wrote:
>
> Hi all,
>
> Recently I found the default generated schema definition in the Python
> client is different from the Java client, which leads to some
> unexpected behavior.
>
> For example, given the following class definition in Python:
>
> ```python
> class Data(Record):
>     i = Integer()
> ```
>
> The type of `i` field is a union: "type": ["null", "int"]
>
> While given the following class definition in Java:
>
> ```java
> class Data {
>     private final int i;
>     /* ... */
> }
> ```
>
> The type of `i` field is an integer: "type": "int"
>
> It brings an issue that if a Python consumer subscribes to a topic
> with schema defined above, then a Java producer will fail to create
> because of the schema incompatibility.
>
> Currently, the workaround is to change the schema compatibility
> strategy to FORWARD.
>
> Should we change the way to generate schema definition in the Python
> client to be compatible with the Java client? It could bring breaking
> changes to old Python clients, but it could guarantee compatibility
> with the Java client.
>
> If not, we still have to introduce an extra configuration to make
> Python schema compatible with Java schema. But it requires code
> changes. e.g.
> here is a possible solution:
>
> ```python
> class Data(Record):
>     # NOTE: Users might have to add this extra field to control how to
>     # generate the schema
>     __java_compatible = True
>     i = Integer()
> ```
>
> Thanks,
> Yunze