autophagy opened a new pull request, #26414: URL: https://github.com/apache/flink/pull/26414
## What is the purpose of the change

When creating a table using `TableEnvironment.from_elements`, the Table API skips type validation on any `Row` elements that were created using positional arguments rather than keyword arguments.

For example, take a table with a single column whose type is an array of rows. Each row has two fields, `a VARCHAR` and `b BOOLEAN`. If we create a table with elements where one of these rows has fields with incorrect datatypes:

```python
schema = DataTypes.ROW(
    [
        DataTypes.FIELD(
            "col",
            DataTypes.ARRAY(
                DataTypes.ROW(
                    [
                        DataTypes.FIELD("a", DataTypes.STRING()),
                        DataTypes.FIELD("b", DataTypes.BOOLEAN()),
                    ]
                )
            ),
        ),
    ]
)

elements = [(
    [("pyflink", True), ("pyflink", False), (True, "pyflink")],
)]

table = self.t_env.from_elements(elements, schema)
table_result = list(table.execute().collect())
```

This results in a type validation error:

```
TypeError: field a in element in array field col: VARCHAR can not accept object True in type <class 'bool'>
```

If we use `Row` instead of tuples, but with keyword arguments:

```python
elements = [(
    [Row(a="pyflink", b=True), Row(a="pyflink", b=False), Row(a=True, b="pyflink")],
)]
```

we get the same type validation error. However, when we use `Row` with positional arguments:

```python
elements = [(
    [Row("pyflink", True), Row("pyflink", False), Row(True, "pyflink")],
)]
```

the type validation is skipped, leading to an unpickling error when collecting:

```
> data = pickle.loads(data)
E EOFError: Ran out of input
```

The type validator skips this case on the grounds that [the order in the row could be different to the order of the datatype fields](https://github.com/apache/flink/blob/master/flink-python/pyflink/table/types.py#L2156), but I don't think this is true. Rows made from tuples and from lists are both verified positionally against the datatype fields, and the `Row` class preserves the order of its internal values.
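To illustrate the argument, here is a minimal, self-contained sketch of positional verification. It does not use PyFlink and is not the code in `types.py`: `PositionalRow`, `Schema` and `verify_positionally` are hypothetical names made up for this example, and plain Python types stand in for Flink datatypes. It only shows that a row which keeps its positional values in order can be checked the same way as a tuple.

```python
from typing import Any, List, Sequence, Tuple, Union

# Hypothetical stand-in for a row datatype: (field name, expected Python type).
Schema = List[Tuple[str, type]]


class PositionalRow:
    """Toy row that stores positional values in the order they were given."""

    def __init__(self, *values: Any) -> None:
        self.values = tuple(values)


def verify_positionally(row: Union[PositionalRow, Sequence[Any]], schema: Schema) -> None:
    """Check each value against the schema field at the same position."""
    values = row.values if isinstance(row, PositionalRow) else tuple(row)
    if len(values) != len(schema):
        raise ValueError(f"expected {len(schema)} fields, got {len(values)}")
    for value, (name, expected) in zip(values, schema):
        # Compare exact types, because bool is a subclass of int in Python.
        if type(value) is not expected:
            raise TypeError(
                f"field {name}: {expected.__name__} can not accept "
                f"object {value!r} in type {type(value)}"
            )


schema: Schema = [("a", str), ("b", bool)]

# A tuple and a positional row with correct types both pass.
verify_positionally(("pyflink", True), schema)
verify_positionally(PositionalRow("pyflink", False), schema)

# A positional row with swapped types is rejected instead of being skipped.
try:
    verify_positionally(PositionalRow(True, "pyflink"), schema)
    error_message = ""
except TypeError as exc:
    error_message = str(exc)
```

In this sketch the swapped row fails on field `a`, mirroring the error that tuples and keyword-argument rows already produce in the examples above.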
Similarly, `Row` class equality in cases where both of the rows are created with positional arguments

## Brief change log

- *Change the type validation logic used by `TableEnvironment.from_elements` so that `Row`s constructed with positional arguments are not skipped.*

## Verifying this change

This change added tests and can be verified as follows:

- *Added a test to ensure consistent type validation behaviour with rows constructed from tuples, lists, `Row`s with keyword arguments and `Row`s with positional arguments*

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
- The S3 file system connector: (no)

## Documentation

- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (not applicable)