Hi Shaoxuan,

Yes, I've seen your message, and I am not saying the proposal already
contradicts it. I agree that as long as we just define the
DAG/pipeline/logical plan, it is a reasonable thing to do. No doubts
about that. I have a feeling, though, that the document at some points
mentions things that might be in Beam's area of responsibility, e.g.
convenience methods like fromElements and Table#head (in comments).
Those methods require bidirectional communication between Java <> Python,
and not only one-way communication Python -> (logical representation) ->
Java. Also, UDF support, as far as I understand, is something for which
we might be able to leverage Beam (but at the same time I might be
completely wrong).

The only thing I wanted to point out is that I would welcome at least
some comparison of the proposed approach with Beam's multi-language
support. A discussion of when we can leverage Beam, when we should come
up with our own solution, and why, would also be beneficial, I think.
Right now the design document does not mention Beam at all.

Sorry if I sounded too harsh, my intention isn't/wasn't to discard this
effort.

Best,

Dawid

On 04/04/2019 09:41, Shaoxuan Wang wrote:
> Dawid,
> This proposal does not contradict what we have discussed.
> Please check my reply of 2019/02/21 in
> https://lists.apache.org/thread.html/f6f8116b4b38b0b2d70ed45b990d6bb1bcb33611fde6fdf32ec0e840@%3Cdev.flink.apache.org%3E
> "Beam Python API and Flink Python TableAPI describe the DAG/pipeline in
> different manners. We got a chance to talk with Tyler Akidau (from
> Beam) offline, and explained why the Flink Table API needs a specific design
> for Python, rather than purely leveraging the Beam portability layer.
>
> In our proposal, most of the Python code is just a DAG/pipeline builder for
> the Table API. The majority of operators run purely in Java, while only
> Python UDFs are executed in a Python environment at runtime. This design
> does not affect the development and adoption of the Beam language
> portability layer with the Flink runner. The Flink and Beam communities will
> still collaborate to jointly develop and optimize the JVM / non-JVM
> (Python, Go) bridge (data and control connections between different
> processes) to ensure robustness and performance."
>
> When we talk about multi-language support, it involves two components: API
> and language, and they are orthogonal. The Table API is a descriptive API and
> will be a superset of SQL. I do not see that Beam has such a layer or any
> plan to cover the Table API semantics. We already support two languages for
> the Table API (Java/Scala). I do not see a reason why we should not add
> support for another language (Python) to the Table API.
>
> Regards,
> Shaoxuan
>
>
>
> On Thu, Apr 4, 2019 at 3:13 PM Dawid Wysakowicz <dwysakow...@apache.org>
> wrote:
>
>> Hi all,
>>
>> Thank you very much Jincheng for the very thorough proposal. I was
>> following the discussion only very briefly, but I have the impression that
>> the consensus in the previous discussion[1] was that we do not want to have
>> an independent, Flink-specific multi-language support, but want to
>> collaborate on that matter with the Beam community. I think this is also
>> the concern Thomas raised[2].
>>
>> Let's make sure we do not contradict what was said in [1]. Could you
>> elaborate more on how it fits into the Beam-Flink multi-language support?
>>
>> Best,
>>
>> Dawid
>>
>> [1]
>>
>> https://lists.apache.org/thread.html/f6f8116b4b38b0b2d70ed45b990d6bb1bcb33611fde6fdf32ec0e840@%3Cdev.flink.apache.org%3E
>>
>> [2]
>>
>> https://lists.apache.org/thread.html/da6cd815fa601d81be9f706aaa4d2c595db0b52c40a9040238b830c7@%3Cdev.flink.apache.org%3E
>>
>>
>> On 04/04/2019 08:31, jincheng sun wrote:
>>> Hi Shuyi,
>>>
>>> Glad to see your feedback and more requirements about multi-language
>>> support! I think the Flink community is very much looking forward to more
>>> language support, and of course Golang should be on the future support
>>> list. The topic of supporting Python on Flink has been researched and
>>> discussed in the community for a long time, so I want to support Python in
>>> the Table API as the first stage; support for other languages should be
>>> planned after that. I have not yet thought in detail about how/when to
>>> support Golang. You are very welcome to share more ideas on how to support
>>> Golang if you have more thoughts. :)
>>>
>>> Regarding UDFs, we do have some ideas and design attempts. The results of
>>> our attempts so far show that the performance of Python UDFs is not
>>> encouraging. There are also some problems with Python environment
>>> management that should be considered. After we have done more
>>> investigation and experiments, I will share the results with you in time.
>>> Perhaps after the first stage (Python Table API support), we will then
>>> discuss UDF support in detail.
>>>
>>> I think support for the DataStream API should be considered after
>>> supporting UDFs, because DataStream programs are mostly built from various
>>> functions.
>>>
>>> We plan to complete the first phase before the release of Flink 1.9, and
>>> start the UDF support after 1.9. Of course, I am very glad to hear that
>>> you want to contribute to the Flink multi-language support. I believe
>>> nothing is impossible: if more people are interested in the Python Table
>>> API with UDF support and more people want to contribute to the community,
>>> UDF support may be there when Flink 1.9 is released. :)
>>>
>>> Best,
>>> Jincheng
>>>
>>> On Thu, Apr 4, 2019 at 3:35 AM Shuyi Chen <suez1...@gmail.com> wrote:
>>>
>>>> Thanks a lot for driving the FLIP, jincheng. The approach looks
>>>> good. Adding multi-lang support sounds like a promising direction to
>>>> expand the footprint of Flink. Do we have a plan for adding Golang
>>>> support? As many backend engineers nowadays are familiar with Go, but
>>>> probably not as much with Java, adding Golang support would significantly
>>>> reduce the friction for them to use Flink. Also, do we have a design for
>>>> multi-lang UDF support, and what's the timeline for adding DataStream API
>>>> support? We would like to help and contribute as well, as we have a
>>>> similar need internally at our company.
>>>> Thanks a lot.
>>>>
>>>> Shuyi
>>>>
>>>> On Tue, Apr 2, 2019 at 1:03 AM jincheng sun <sunjincheng...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi All,
>>>>> As Xianda brought up in the previous email, there are a large number of
>>>>> data analysis users who want Flink to support Python. At the Flink API
>>>>> level, we have DataStreamAPI/DataSetAPI/TableAPI&SQL, and the Table API
>>>>> will become the first-class citizen. The Table API is declarative and
>>>>> can be automatically optimized, which is mentioned in the Flink mid-term
>>>>> roadmap by Stephan. So we are first considering supporting Python at the
>>>>> Table level to cater to the current large number of analytics users. To
>>>>> further promote Python support at the Flink Table level, Dian, Wei and I
>>>>> discussed offline a bit and came up with an initial feature outline as
>>>>> follows:
>>>>>
>>>>> - Python TableAPI Interface
>>>>>   Introduce a set of Python Table API interfaces, including interface
>>>>> definitions such as Table, TableEnvironment, TableConfig, etc.
>>>>>
>>>>> - Implementation Architecture
>>>>>   We will offer two alternative architecture options, one for pure
>>>>> Python language support and one for extended multi-language design.
>>>>>
>>>>> - Job Submission
>>>>>   Provide a way to submit (local/remote) Python Table API jobs.
>>>>>
>>>>> - Python Shell
>>>>>   The Python Shell provides an interactive way for users to write and
>>>>> execute Flink Python Table API jobs.
>>>>>
>>>>>
>>>>> The design document for FLIP-38 can be found here:
>>>>>
>>>>> https://docs.google.com/document/d/1ybYt-0xWRMa1Yf5VsuqGRtOfJBz4p74ZmDxZYg3j_h8/edit?usp=sharing
>>>>>
>>>>> I am looking forward to your comments and feedback.
>>>>>
>>>>> Best,
>>>>> Jincheng
>>>>>
>>
