I found that the Python client has two options that control this
behavior:
1. `_sorted_fields`: it's false by default in the Python client, but
true in the Java client, i.e. the Java client sorts all fields by
default.
2. `_required`: it's false by default for all types in the Python
client, but in the Java client it's false only for the string type.

For example, given the following Java class:

```java
class User {
    String name;
    int age;
    double score;
}
```

We have to write the following definition in Python to match it:

```python
class User(Record):
    _sorted_fields = True
    name = String()
    age = Integer(required=True)
    score = Double(required=True)
```
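
To make the difference concrete, here is a rough stdlib-only sketch of
the Avro record that each set of defaults would produce. The
`avro_record` helper and its arguments are hypothetical: they only
mimic the two options described above and are not part of the Python
client, and the real generated schema may carry extra attributes.

```python
import json


def avro_record(name, fields, sorted_fields=False):
    """Build an Avro-style record dict from (name, avro_type, required) tuples.

    Hypothetical helper: it only mimics the two options discussed above.
    """
    if sorted_fields:
        # Sort fields alphabetically by name.
        fields = sorted(fields, key=lambda f: f[0])
    return {
        "type": "record",
        "name": name,
        "fields": [
            # Optional fields become a union with "null";
            # required fields keep the bare type.
            {"name": n, "type": t if required else ["null", t]}
            for n, t, required in fields
        ],
    }


# Python defaults as described above: no sorting, nothing required.
python_default = avro_record("User", [
    ("name", "string", False),
    ("age", "int", False),
    ("score", "double", False),
])

# Java-like behavior as described above: sorted fields, and only the
# string field is nullable.
java_like = avro_record("User", [
    ("name", "string", False),
    ("age", "int", True),
    ("score", "double", True),
], sorted_fields=True)

print(json.dumps(python_default, indent=2))
print(json.dumps(java_like, indent=2))
```

The two printed records differ both in field order and in whether
`age` and `score` are unions with "null", which is why the explicit
`_sorted_fields = True` and `required=True` are needed.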

I see that https://github.com/apache/pulsar/pull/12232 added the
`_sorted_fields` field and disabled field sorting by default, which
breaks compatibility with the Java client.

IMO, we should make `_sorted_fields` true by default and `_required`
true for all types other than `String` by default.

Thanks,
Yunze

On Wed, Mar 29, 2023 at 4:00 PM Yunze Xu <y...@streamnative.io> wrote:
>
> Hi all,
>
> Recently I found the default generated schema definition in the Python
> client is different from the Java client, which leads to some
> unexpected behavior.
>
> For example, given the following class definition in Python:
>
> ```python
> class Data(Record):
>     i = Integer()
> ```
>
> The type of `i` field is a union: "type": ["null", "int"]
>
> While given the following class definition in Java:
>
> ```java
> class Data {
>     private final int i;
>     /* ... */
> }
> ```
>
> The type of `i` field is an integer: "type": "int"
>
> This causes a problem: if a Python consumer subscribes to a topic
> with the schema defined above, a Java producer will fail to be
> created because of the schema incompatibility.
>
> Currently, the workaround is to change the schema compatibility
> strategy to FORWARD.
>
> Should we change how the Python client generates the schema
> definition so that it is compatible with the Java client? It could
> be a breaking change for old Python clients, but it would guarantee
> compatibility with the Java client.
>
> If not, we still have to introduce an extra configuration to make
> Python schema compatible with Java schema. But it requires code
> changes. e.g. here is a possible solution:
>
> ```python
> class Data(Record):
>     # NOTE: Users might have to add this extra field to control how to
>     # generate the schema
>     __java_compatible = True
>     i = Integer()
> ```
>
> Thanks,
> Yunze
