Mika Naylor created FLINK-37616:
-----------------------------------
Summary: PyFlink incorrectly unpickles Row fields
Key: FLINK-37616
URL: https://issues.apache.org/jira/browse/FLINK-37616
Project: Flink
Issue Type: Bug
Components: API / Python
Reporter: Mika Naylor
Assignee: Mika Naylor
If you call {{TableEnvironment.from_elements}} where one of the fields in the
row contains nested {{Row}} values, for example where one of the values you pass in is:
{code:python}
[
Row("pyflink1A", "pyflink2A", "pyflink3A"),
Row("pyflink1B", "pyflink2B", "pyflink3B"),
Row("pyflink1C", "pyflink2C", "pyflink3C"),
],{code}
where the schema for the field is:
{code:python}
DataTypes.ARRAY(
DataTypes.ROW(
[
DataTypes.FIELD("a", DataTypes.STRING()),
DataTypes.FIELD("b", DataTypes.STRING()),
DataTypes.FIELD("c", DataTypes.STRING()),
]
)
),{code}
When you call {{execute().collect()}} on the table, the array is returned as:
{code:java}
[
<Row(['pyflink1a', 'pyflink2a', 'pyflink3a'])>,
<Row(['pyflink1b', 'pyflink2b', 'pyflink3b'])>,
<Row(['pyflink1c', 'pyflink2c', 'pyflink3c'])>
]{code}
Instead of each {{Row}} having 3 values, each collected row has only 1 value:
a list containing the row's actual values. The input and output rows are
therefore no longer equal, because their internal {{_values}} collections
differ: one is a list of strings and the other is a list containing a list of
strings. {{len()}} of the source {{Row}} correctly returns 3, but {{len()}} of
the collected row incorrectly returns 1.
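The difference in shape can be illustrated without a Flink installation. The snippet below is only a sketch: the {{Row}} class in it is a hypothetical, simplified stand-in (not PyFlink's actual class), and building the collected row from a single list argument is an assumption about how the malformed row might arise, not a confirmed diagnosis of the unpickling code.

```python
# Hypothetical, simplified stand-in for pyflink.common.Row.
# It only models the _values list, len(), equality and repr.
class Row:
    def __init__(self, *values):
        self._values = list(values)

    def __len__(self):
        return len(self._values)

    def __eq__(self, other):
        return isinstance(other, Row) and self._values == other._values

    def __repr__(self):
        return "<Row(%s)>" % ", ".join(repr(v) for v in self._values)


values = ["pyflink1A", "pyflink2A", "pyflink3A"]

# The row as the user constructs it: three separate fields.
source = Row(*values)

# Assumed failure shape: the whole field list ends up as ONE field.
collected = Row(values)

print(len(source), len(collected))  # 3 1
print(source == collected)          # False
print(collected)                    # <Row(['pyflink1A', 'pyflink2A', 'pyflink3A'])>
```

In this sketch, unpacking the list ({{Row(*values)}}) restores a row that compares equal to the source, which matches the expected behaviour described above.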