+1 - I think keeping the `_sorted_fields` and `_required` defaults consistent
between the clients is the way to go.
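
For anyone skimming the thread, here is a minimal sketch of the difference in plain Python. `avro_record` is a hypothetical helper written only for this illustration, not part of the pulsar-client API. It assumes the defaults behave as described in the quoted messages below, and real generated schemas may carry more attributes than shown here.

```python
import json

# Hypothetical helper (NOT the pulsar-client API): it only mimics the two
# default behaviours discussed in this thread so the schemas can be compared.
def avro_record(name, fields, sorted_fields, required_by_default):
    """Build a simplified Avro record schema as a dict.

    fields: list of (field_name, avro_type) tuples in declaration order.
    sorted_fields: if True, fields are emitted sorted by name.
    required_by_default: if False, every field becomes a ["null", type] union;
        if True, only "string" fields stay nullable.
    """
    items = sorted(fields) if sorted_fields else list(fields)
    out = []
    for field_name, avro_type in items:
        nullable = (not required_by_default) or avro_type == "string"
        out.append({
            "name": field_name,
            "type": ["null", avro_type] if nullable else avro_type,
        })
    return {"type": "record", "name": name, "fields": out}

user_fields = [("name", "string"), ("age", "int"), ("score", "double")]

# Current Python defaults as described: declaration order, everything nullable.
python_default = avro_record("User", user_fields,
                             sorted_fields=False, required_by_default=False)
# Java defaults as described: sorted fields, only strings nullable.
java_like = avro_record("User", user_fields,
                        sorted_fields=True, required_by_default=True)

print(json.dumps(python_default, indent=2))
print(json.dumps(java_like, indent=2))
```

The two dicts differ in both field order and nullability, which is the mismatch the proposed default change would remove.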

> On Mar 29, 2023, at 7:09 AM, Yunze Xu <y...@streamnative.io.INVALID> wrote:
> 
> I found the Python client has two options to control the behavior:
> 1. Set `_sorted_fields`. It's false by default in the Python client,
> but it's true in the Java client. i.e. the Java client sorts all
> fields by default.
> 2. Set `_required`. It's false by default for all types in the Python
> client, but it's only false for the string type in the Java client.
> 
> For example, given the following Java class:
> 
> ```java
> class User {
>    String name;
>    int age;
>    double score;
> }
> ```
> 
> We have to give the following definition in Python:
> 
> ```python
> class User(Record):
>    _sorted_fields = True
>    name = String()
>    age = Integer(required=True)
>    score = Double(required=True)
> ```
> 
> I see https://github.com/apache/pulsar/pull/12232 adds the
> `_sorted_fields` field and disables the field sort by default. It
> breaks compatibility with the Java client.
> 
> IMO, we should make `_sorted_fields` true by default and `_required`
> true for all types other than `String` by default.
> 
> Thanks,
> Yunze
> 
> On Wed, Mar 29, 2023 at 4:00 PM Yunze Xu <y...@streamnative.io> wrote:
>> 
>> Hi all,
>> 
>> Recently I found the default generated schema definition in the Python
>> client is different from the Java client, which leads to some
>> unexpected behavior.
>> 
>> For example, given the following class definition in Python:
>> 
>> ```python
>> class Data(Record):
>>    i = Integer()
>> ```
>> 
>> The type of the `i` field is a union: `"type": ["null", "int"]`
>> 
>> While given the following class definition in Java:
>> 
>> ```java
>> class Data {
>>    private final int i;
>>    /* ... */
>> }
>> ```
>> 
>> The type of the `i` field is an integer: `"type": "int"`
>> 
>> This causes a problem: if a Python consumer subscribes to a topic
>> with the schema defined above, a Java producer will then fail to be
>> created because of the schema incompatibility.
>> 
>> Currently, the workaround is to change the schema compatibility
>> strategy to FORWARD.
>> 
>> Should we change how the Python client generates the schema
>> definition so that it is compatible with the Java client? It could be
>> a breaking change for old Python clients, but it would guarantee
>> compatibility with the Java client.
>> 
>> If not, we still have to introduce an extra configuration option to
>> make the Python schema compatible with the Java schema, which would
>> require code changes from users. Here is one possible solution:
>> 
>> ```python
>> class Data(Record):
>>    # NOTE: Users might have to add this extra field to control how
>>    # the schema is generated
>>    __java_compatible = True
>>    i = Integer()
>> ```
>> 
>> Thanks,
>> Yunze
