Mika Naylor created FLINK-37656:
-----------------------------------

             Summary: Introduce FromValues in Python Table API
                 Key: FLINK-37656
                 URL: https://issues.apache.org/jira/browse/FLINK-37656
             Project: Flink
          Issue Type: New Feature
          Components: API / Python
            Reporter: Mika Naylor
            Assignee: Mika Naylor


The Java Table API has, for some time now, supported 
TableEnvironment.{{fromValues}}, which creates a table from a set of values, 
similar to the {{VALUES}} clause in SQL. The Python Table API has a similar 
function called {{from_elements}}, which has a similar API surface but 
different behaviour: it doesn't translate to a {{VALUES}} clause.

The {{from_elements}} method, when given a set of Python objects, doesn't 
transform them into a {{VALUES}} clause but instead serializes the values to 
an Avro file, which is then deserialized into an Avro source table. When the 
function is given a set of {{Expression}}s as input, however, it does behave 
the same as {{fromValues}}.

It would be useful to have an explicit {{from_values}} method, rather than an 
implicit one within {{from_elements}} that depends on the input types. It would 
behave like the Java method: it can take a set of {{Expression}}s, or a set of 
supported Python objects which are translated into {{Expression}}s. This would 
be especially useful in Flink contexts/environments where the 
serialize-to-file -> deserialize-from-file step cannot be done.

{{from_elements}} would then still be useful for large datasets which you 
wouldn't want to embed in the query, and {{from_values}} for cases where this 
embedding doesn't present an issue, avoiding the serialization/deserialization 
step.
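
To illustrate what "embedding the values in the query" means, here is a 
minimal pure-Python sketch that renders a small set of Python objects as a 
SQL {{VALUES}} clause. It is not PyFlink code and not the proposed 
implementation: the proposal translates Python objects into {{Expression}}s, 
whereas this sketch uses a SQL string as a stand-in. The helper names 
{{to_sql_literal}} and {{values_clause}} are hypothetical, and only a few 
Python types are handled.

```python
def to_sql_literal(value):
    """Render one Python value as a SQL literal (hypothetical helper)."""
    if value is None:
        return "NULL"
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        # Escape single quotes by doubling them, as in standard SQL.
        return "'" + value.replace("'", "''") + "'"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def values_clause(rows):
    """Inline rows of Python values as one VALUES clause (hypothetical helper)."""
    if not rows:
        raise ValueError("at least one row is required")
    rendered = [
        "(" + ", ".join(to_sql_literal(v) for v in row) + ")" for row in rows
    ]
    return "VALUES " + ", ".join(rendered)


query = values_clause([(1, "Alice"), (2, "O'Brien"), (3, None)])
print(query)
```

Because every value ends up inside the query text itself, no intermediate 
file is needed, but the query grows with the data. That is the trade-off 
between inlining small inputs and going through a file for large ones.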

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
