Hi Max,

Are there any other concerns from your side? I would appreciate it if you could give some feedback and vote on this.
Best,
Wei

> On Oct 25, 2019, at 09:33, jincheng sun <sunjincheng...@gmail.com> wrote:
>
> Hi Thomas,
>
> Thanks for your explanation. I understand your original intention. I will
> seriously consider this issue. After I have an initial solution, I will
> bring up a further discussion on the Beam ML.
>
> Thanks for your vote. :)
>
> Best,
> Jincheng
>
>
> On Fri, Oct 25, 2019 at 7:32 AM Thomas Weise <t...@apache.org> wrote:
>
>> Hi Jincheng,
>>
>> Yes, this topic can be discussed further on the Beam ML. The only reason
>> I brought it up here is that it would be desirable, from the Beam Flink
>> runner's perspective, for the artifact staging mechanism you are working
>> on to be reusable.
>>
>> Stage 1 in Beam is also up to the runner; artifact staging is a service
>> discovered from the job server, and the fact that the Flink job server
>> currently uses DFS is not set in stone. My interest was more about
>> assumptions regarding the artifact structure, which may or may not allow
>> for a reusable implementation.
>>
>> +1 for the proposal otherwise
>>
>> Thomas
>>
>>
>> On Mon, Oct 21, 2019 at 8:40 PM jincheng sun <sunjincheng...@gmail.com>
>> wrote:
>>
>>> Hi Thomas,
>>>
>>> Thanks for sharing your thoughts. I think improving and solving the
>>> limitations of Beam artifact staging is a good topic (for Beam).
>>>
>>> My understanding is as follows:
>>>
>>> For Beam (data):
>>> Stage 1: BeamClient ------> JobService (data is uploaded to DFS).
>>> Stage 2: JobService (FlinkClient) ------> FlinkJob (the operator
>>> downloads the data from DFS).
>>> Stage 3: Operator ------> Harness (artifact staging service).
>>>
>>> For Flink (data):
>>> Stage 1: FlinkClient (local data is uploaded to the BlobServer via the
>>> distributed cache) ------> Operator (data is downloaded from the
>>> BlobServer). No dependency on DFS.
>>> Stage 2: Operator ------> Harness (for docker we use the artifact
>>> staging service).
>>>
>>> So, I think Beam has to depend on DFS in Stage 1,
>>> and Stage 2 can use the distributed cache if we remove the dependency
>>> on DFS for Beam in Stage 1 (of course we need more detail here). We can
>>> bring up that discussion in a separate Beam dev@ thread; the current
>>> discussion focuses on the Flink 1.10 version of UDF Environment and
>>> Dependency Management for Python. So I recommend voting in the current
>>> ML thread for Flink 1.10, and discussing Beam artifact staging
>>> improvements separately on Beam dev@.
>>>
>>> What do you think?
>>>
>>> Best,
>>> Jincheng
>>>
>>> On Mon, Oct 21, 2019 at 10:25 PM Thomas Weise <t...@apache.org> wrote:
>>>
>>>> Beam artifact staging currently relies on a shared file system and
>>>> there are limitations, for example when running locally with Docker
>>>> and a local FS. It sounds like a distributed-cache-based implementation
>>>> might be a good (better?) option for artifact staging even for the
>>>> Beam Flink runner?
>>>>
>>>> If so, can the implementation you propose be made compatible with the
>>>> Beam artifact staging service so that it can be plugged into the Beam
>>>> Flink runner?
>>>>
>>>> Thanks,
>>>> Thomas
>>>>
>>>>
>>>> On Mon, Oct 21, 2019 at 2:34 AM jincheng sun <sunjincheng...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Max,
>>>>>
>>>>> Sorry for the late reply. Regarding the issues you mentioned above,
>>>>> I'm glad to share my thoughts:
>>>>>
>>>>>> For process-based execution we use Flink's cache distribution
>>>>>> instead of Beam's artifact staging.
>>>>>
>>>>> In the current design, we use Flink's cache distribution to upload
>>>>> users' files from the client to the cluster in both docker mode and
>>>>> process mode. That is, Flink's cache distribution and Beam's artifact
>>>>> staging service work together in docker mode.
>>>>>
>>>>>> Do we want to implement two different ways of staging artifacts? It
>>>>>> seems sensible to use the same artifact staging functionality also
>>>>>> for the process-based execution.
>>>>>
>>>>> I agree that the implementation would be simpler if we used the same
>>>>> artifact staging functionality for process-based execution as well.
>>>>> However, it is not the best choice for performance, as it would
>>>>> introduce an additional network transmission: in process mode the
>>>>> TaskManager and the Python worker share the same environment, so the
>>>>> user files in Flink's distributed cache can be accessed by the Python
>>>>> worker directly. We do not need the staging service in this case.
>>>>>
>>>>>> Apart from being simpler, this would also allow the process-based
>>>>>> execution to run in other environments than the Flink TaskManager
>>>>>> environment.
>>>>>
>>>>> IMHO, this case is more like docker mode, and we can share or reuse
>>>>> the code of Beam's docker mode. Furthermore, in this case the Python
>>>>> worker is launched by the operator, so it is always in the same
>>>>> environment as the operator.
>>>>>
>>>>> Thanks again for your feedback; it is valuable for finding the best
>>>>> final architecture.
>>>>>
>>>>> Feel free to correct me if there is anything incorrect.
>>>>>
>>>>> Best,
>>>>> Jincheng
>>>>>
>>>>> On Wed, Oct 16, 2019 at 4:23 PM Maximilian Michels <m...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> I'm also late to the party here :) When I saw the first draft, I was
>>>>>> wondering how exactly the design doc would tie in with Beam. Thanks
>>>>>> for the update.
>>>>>>
>>>>>> A couple of comments in this regard:
>>>>>>
>>>>>>> Flink has provided a distributed cache mechanism and allows users
>>>>>>> to upload their files using the "registerCachedFile" method in
>>>>>>> ExecutionEnvironment/StreamExecutionEnvironment. The Python files
>>>>>>> users specify through "add_python_file", "set_python_requirements"
>>>>>>> and "add_python_archive" are also uploaded through this method
>>>>>>> eventually.
>>>>>>
>>>>>> For process-based execution we use Flink's cache distribution
>>>>>> instead of Beam's artifact staging.
>>>>>>
>>>>>>> The Apache Beam Portability Framework already supports artifact
>>>>>>> staging that works out of the box with the Docker environment. We
>>>>>>> can use the artifact staging service defined in Apache Beam to
>>>>>>> transfer the dependencies from the operator to the Python SDK
>>>>>>> harness running in the docker container.
>>>>>>
>>>>>> Do we want to implement two different ways of staging artifacts? It
>>>>>> seems sensible to use the same artifact staging functionality also
>>>>>> for the process-based execution. Apart from being simpler, this
>>>>>> would also allow the process-based execution to run in other
>>>>>> environments than the Flink TaskManager environment.
>>>>>>
>>>>>> Thanks,
>>>>>> Max
>>>>>>
>>>>>> On 15.10.19 11:13, Wei Zhong wrote:
>>>>>>> Hi Thomas,
>>>>>>>
>>>>>>> Thanks a lot for your suggestion!
>>>>>>>
>>>>>>> As you can see from the "Goals" section, this FLIP focuses on
>>>>>>> dependency management in process mode. However, the APIs and design
>>>>>>> proposed in this FLIP also apply to docker mode. So it makes sense
>>>>>>> to me to also describe how this design integrates with the artifact
>>>>>>> staging service of Apache Beam in docker mode. I have updated the
>>>>>>> design doc and am looking forward to your feedback.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Wei
>>>>>>>
>>>>>>>> On Oct 15, 2019, at 01:54, Thomas Weise <t...@apache.org> wrote:
>>>>>>>>
>>>>>>>> Sorry for joining the discussion late.
>>>>>>>>
>>>>>>>> The Beam environment already supports artifact staging; it works
>>>>>>>> out of the box with the Docker environment. I think it would be
>>>>>>>> helpful to explain in the FLIP how this proposal relates to what
>>>>>>>> Beam offers and how it would be integrated.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Thomas
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Oct 14, 2019 at 8:09 AM Jeff Zhang <zjf...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> +1
>>>>>>>>>
>>>>>>>>> On Mon, Oct 14, 2019 at 10:55 PM Hequn Cheng
>>>>>>>>> <chenghe...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> +1
>>>>>>>>>>
>>>>>>>>>> Good job, Wei!
>>>>>>>>>>
>>>>>>>>>> Best, Hequn
>>>>>>>>>>
>>>>>>>>>> On Mon, Oct 14, 2019 at 2:54 PM Dian Fu <dian0511...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Wei,
>>>>>>>>>>>
>>>>>>>>>>> +1 (non-binding). Thanks for driving this.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Dian
>>>>>>>>>>>
>>>>>>>>>>>> On Oct 14, 2019, at 1:40 PM, jincheng sun
>>>>>>>>>>>> <sunjincheng...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> +1
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Oct 12, 2019 at 8:41 PM Wei Zhong
>>>>>>>>>>>> <weizhong0...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would like to start the vote for FLIP-78 [1], which was
>>>>>>>>>>>>> discussed and reached consensus in the discussion thread [2].
>>>>>>>>>>>>>
>>>>>>>>>>>>> The vote will be open for at least 72 hours. I'll try to
>>>>>>>>>>>>> close it by 2019-10-16 18:00 UTC, unless there is an
>>>>>>>>>>>>> objection or not enough votes.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Wei
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]
>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-78%3A+Flink+Python+UDF+Environment+and+Dependency+Management
>>>>>>>>>>>>> [2]
>>>>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Python-UDF-Environment-and-Dependency-Management-td33514.html
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best Regards
>>>>>>>>>
>>>>>>>>> Jeff Zhang
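[Editor's note] The performance argument in this thread -- that process mode can skip artifact staging because the Python worker shares the TaskManager's filesystem, while docker mode needs an extra hop through Beam's artifact staging service -- can be summarized in a small self-contained sketch. This is an editor-added illustration, not Flink or Beam source code; the function and type names in it are hypothetical.

```python
# Toy model of the two dependency-distribution paths discussed above.
# All names are hypothetical; only registerCachedFile refers to a real
# Flink API (on ExecutionEnvironment/StreamExecutionEnvironment).
from dataclasses import dataclass
from typing import List


@dataclass
class StagingPlan:
    steps: List[str]          # one entry per data-transfer hop
    extra_network_hops: int   # hops beyond Flink's own cache distribution


def plan_dependency_distribution(execution_mode: str) -> StagingPlan:
    """Return the transfer path for user files (e.g. Python UDF deps)."""
    # In both modes, Flink's distributed cache first ships the files the
    # client registered (e.g. via registerCachedFile) to the TaskManager.
    base = [
        "client ------> BlobServer (distributed cache upload)",
        "BlobServer ------> TaskManager (distributed cache download)",
    ]
    if execution_mode == "process":
        # Worker and operator share the environment: direct file access,
        # no staging service and no additional network transmission.
        return StagingPlan(base + ["worker reads cached files directly"], 0)
    if execution_mode == "docker":
        # The container is isolated, so one extra transfer moves the
        # files from the operator into the SDK harness.
        return StagingPlan(
            base + ["operator ------> SDK harness (artifact staging service)"],
            1,
        )
    raise ValueError(f"unknown execution mode: {execution_mode}")
```

Under this (simplified) model, reusing the staging service in process mode would add a transfer that the shared filesystem makes unnecessary, which is the trade-off Jincheng describes.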