Thanks to Jane for providing examples that make the discussion clearer. Thanks to Lincoln and Xuyang for your feedback; I wholeheartedly agree that it is better to throw an error than to silently ignore the setting. Extending datagen to generate variable-length values is an excellent idea, and I will create a separate Jira ticket to follow up.

Taking the examples provided:

1. For fixed-length data types (CHAR, BINARY), both of the following DDLs, which set a custom length, should throw an exception like 'User-defined length of the fixed-length field f0 is not supported.'

   CREATE TABLE foo ( f0 CHAR(5) ) WITH ('connector' = 'datagen', 'fields.f0.length' = '10');
   CREATE TABLE bar ( f0 CHAR(5) ) WITH ('connector' = 'datagen', 'fields.f0.length' = '1');

2. For variable-length data types (VARCHAR, VARBINARY), the following DDL is legal and can be executed. If an illegal user-defined length is configured, an exception like 'User-defined length of the VARCHAR field %s should be shorter than the schema definition.' will be thrown.

   CREATE TABLE meow ( f0 VARCHAR(20) ) WITH ('connector' = 'datagen', 'fields.f0.length' = '10');

3. For the special variable-length data types STRING and BYTES, the maximum length is very large (2^31 - 1). If that length were used whenever users do not specify a smaller one, fields occupying a huge amount of memory (estimated at more than 2 GB) would be generated by default, which can easily lead to "java.lang.OutOfMemoryError: Java heap space". I therefore recommend keeping the default length of these two types at 100, as before, while allowing it to be configured to a value less than 2^31 - 1.

   CREATE TABLE purr ( f0 STRING ) WITH ('connector' = 'datagen', 'fields.f0.length' = '10');

The updates have been synchronized to the pull request [1]. WDYT?

[1] https://github.com/apache/flink/pull/23678

Best,
Yubin
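
P.S. To make the three rules above concrete, here is a rough sketch of the length check in Java. It is not the code in the pull request: the class DatagenLengthCheck, the Kind enum and the method resolveLength are hypothetical names used only for illustration. Two behaviours are my assumptions rather than anything stated above: a VARCHAR/VARBINARY field without a user-defined length falls back to the declared length, and a user-defined length equal to the declared length is accepted.

```java
// Hypothetical sketch of the proposed rules; not the actual Flink implementation.
class DatagenLengthCheck {

    // Default proposed above for STRING and BYTES when no length is configured.
    static final int DEFAULT_STRING_LENGTH = 100;

    // FIXED: CHAR, BINARY. BOUNDED_VARIABLE: VARCHAR(n), VARBINARY(n).
    // UNBOUNDED_VARIABLE: STRING, BYTES.
    enum Kind { FIXED, BOUNDED_VARIABLE, UNBOUNDED_VARIABLE }

    /**
     * Returns the length datagen would use for a field.
     *
     * @param field          field name, used only in error messages
     * @param kind           category of the field's data type
     * @param declaredLength length from the schema, e.g. 5 for CHAR(5)
     * @param userLength     value of 'fields.<name>.length', or null if not set
     */
    static int resolveLength(String field, Kind kind, int declaredLength, Integer userLength) {
        switch (kind) {
            case FIXED:
                // Rule 1: any user-defined length on a fixed-length type is rejected.
                if (userLength != null) {
                    throw new IllegalArgumentException(String.format(
                            "User-defined length of the fixed-length field %s is not supported.",
                            field));
                }
                return declaredLength;
            case BOUNDED_VARIABLE:
                // Assumption: with no user-defined length, use the declared length.
                if (userLength == null) {
                    return declaredLength;
                }
                // Rule 2: reject a length longer than the schema allows.
                // Assumption: a length equal to the declared one is accepted.
                if (userLength > declaredLength) {
                    throw new IllegalArgumentException(String.format(
                            "User-defined length of the VARCHAR field %s should be shorter than the schema definition.",
                            field));
                }
                return userLength;
            default:
                // Rule 3: STRING and BYTES default to 100 unless configured.
                return userLength == null ? DEFAULT_STRING_LENGTH : userLength;
        }
    }

    public static void main(String[] args) {
        // meow: VARCHAR(20) with 'fields.f0.length' = '10'
        System.out.println(resolveLength("f0", Kind.BOUNDED_VARIABLE, 20, 10));
        // purr: STRING with 'fields.f0.length' = '10'
        System.out.println(resolveLength("f0", Kind.UNBOUNDED_VARIABLE, Integer.MAX_VALUE, 10));
        // foo: CHAR(5) with 'fields.f0.length' = '10' is rejected
        try {
            resolveLength("f0", Kind.FIXED, 5, 10);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this shape, the foo and bar tables fail fast at validation time, while meow and purr resolve to a length of 10.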