Hi everyone,

Currently, the datagen connector generates data that doesn't match the
schema definition when dealing with fixed-length and variable-length
fields. It defaults to a uniform length of 100 and requires the user to
configure the length manually. This violates the schema constraints and
hampers ease of use.


Jane Chan and I discussed this offline, and I summarize our discussion
below.


To enhance the datagen connector to automatically generate data that
conforms to the schema definition without additional manual
configuration, we propose handling the following data types
appropriately [1]:
      1. For fixed-length data types (char, binary), the length should
         be defined by the schema definition and not be user-defined.
      2. For variable-length data types (varchar, varbinary), the length
         should be defined by the schema definition, but allow for
         user-defined lengths that are smaller than the schema
         definition.
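
To make the two rules concrete, here is a minimal sketch in Python of
how the effective length could be resolved. It is only an illustration
and not the connector's code: the function name resolve_length and its
parameters are hypothetical. It also makes two assumptions the proposal
does not settle: a user-defined length on a fixed-length type is
rejected rather than silently ignored, and a user-defined length equal
to the schema length is accepted.

```python
from typing import Optional

# Type families named in the proposal.
FIXED_LENGTH_TYPES = {"CHAR", "BINARY"}
VARIABLE_LENGTH_TYPES = {"VARCHAR", "VARBINARY"}


def resolve_length(
    type_name: str,
    schema_length: int,
    user_length: Optional[int] = None,
) -> int:
    """Return the length to generate for a field (hypothetical helper)."""
    kind = type_name.upper()

    if kind in FIXED_LENGTH_TYPES:
        # Rule 1: the schema alone decides the length.
        # Assumption: a user-defined length is an error, not ignored.
        if user_length is not None:
            raise ValueError(
                f"{kind} is fixed-length; its length cannot be user-defined"
            )
        return schema_length

    if kind in VARIABLE_LENGTH_TYPES:
        # Rule 2: default to the schema length when nothing is configured.
        if user_length is None:
            return schema_length
        # A user-defined length must not exceed the schema length.
        # Assumption: a value equal to the schema length is allowed.
        if user_length > schema_length:
            raise ValueError(
                f"user-defined length {user_length} exceeds "
                f"schema length {schema_length}"
            )
        return user_length

    raise ValueError(f"unsupported type for this sketch: {type_name}")
```

With this sketch, a CHAR(10) field always resolves to 10, a VARCHAR(20)
field resolves to 20 by default, and a user-defined length of 5 on that
VARCHAR(20) field resolves to 5.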



Looking forward to your feedback :)


[1] https://issues.apache.org/jira/browse/FLINK-32993


Best,
Yubin
