Hi,
According to the documentation for PyFlink Table row based operations [1],
typical usage is as follows:
@udtf(result_types=[DataTypes.INT(), DataTypes.STRING()])
def split(x: Row) -> Row:
for s in x[1].split(","):
yield x[0], s
table.flat_map(split)
Is there any way that row fields inside the UDTF can be accessed by
their attribute names instead of array index? In my use case, I'm doing the
following:
raw_data = t_env.from_path('MySource')
raw_data \
.join_lateral(my_table_tranform_fn(raw_data).alias('a', 'b', 'c') \
.flat_map(my_flat_map_fn).alias('x', 'y', 'z') \
.execute_insert("MySink")
In the table function `my_flat_map_fn` I'm unable to access the fields of
the row by their attribute names i.e., assuming the input argument to the
table function is x, I cannot access fields as x.a, x.b or x.c, instead I
have use use x[0], x[1] and x[2]. The error I get is the _fields is not
populated.
In my use case, the number of columns is very high and working with indexes
is so much error prone and unmaintainable.
Any suggestions?
Thanks,
Sumeet