Dian Fu created FLINK-17146:
-------------------------------

             Summary: Support conversion between PyFlink Table and Pandas DataFrame
                 Key: FLINK-17146
                 URL: https://issues.apache.org/jira/browse/FLINK-17146
             Project: Flink
          Issue Type: New Feature
          Components: API / Python
            Reporter: Dian Fu
            Assignee: Dian Fu


The Pandas DataFrame is the de facto standard for working with tabular data in 
the Python community, and the PyFlink Table is Flink's representation of 
tabular data in Python. It would be useful to support conversion between the 
PyFlink Table and the Pandas DataFrame in the PyFlink Table API. This has the 
following benefits:
 * It lets users switch between PyFlink and Pandas seamlessly when processing 
data in Python: they can process data with one execution engine and then move 
to the other. For example, users who already have a Pandas DataFrame at hand 
and want to perform an expensive transformation on it can convert it to a 
PyFlink Table and leverage the Flink engine. Conversely, users can convert a 
PyFlink Table to a Pandas DataFrame and transform it with the rich 
functionality of the Pandas ecosystem.
 * No intermediate connectors are needed when converting between them.

More details can be found in 
[FLIP-120|https://cwiki.apache.org/confluence/display/FLINK/FLIP-120%3A+Support+conversion+between+PyFlink+Table+and+Pandas+DataFrame].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
