Re: [SURVEY] How many people implement Flink job based on the interface Program?

Flavio Pompermaier Tue, 23 Jul 2019 03:55:56 -0700

I agree, but you have to know which jar a job is contained in. When you
upload a jar to our application, you immediately know the qualified name
of the job class and which jar it belongs to. I think that when you
upload a jar to Flink, Flink should list all the jobs available inside it
(IMHO). That could be a single main class (as it is now) or multiple
classes.
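
To make that concrete, here is a minimal sketch (purely illustrative, not
Flink's API) of how a server could list the jobs in an uploaded jar. It
assumes the comma-separated 'Main-classes' manifest entry mentioned further
down in this thread and falls back to the standard Main-Class attribute.
The class name JobLister and its methods are hypothetical:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

final class JobLister {

    // Hypothetical attribute name: not part of the JAR spec or of Flink.
    static final String MAIN_CLASSES = "Main-classes";

    /** Returns the job classes declared in the manifest, or an empty list. */
    static List<String> listJobs(Manifest manifest) {
        List<String> jobs = new ArrayList<>();
        if (manifest == null) {
            return jobs; // jar without a manifest: nothing to list
        }
        Attributes attrs = manifest.getMainAttributes();
        String value = attrs.getValue(MAIN_CLASSES);
        if (value == null) {
            // Fall back to the standard single entry point.
            value = attrs.getValue(Attributes.Name.MAIN_CLASS);
        }
        if (value != null) {
            for (String name : value.split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty()) {
                    jobs.add(trimmed);
                }
            }
        }
        return jobs;
    }

    /** Convenience overload that opens the jar and reads its manifest. */
    static List<String> listJobs(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return listJobs(jar.getManifest());
        }
    }
}
```

With something like this, the single-main-class case and the
multiple-classes case go through the same code path, and no class in the
jar needs to be loaded or inspected just to build the list.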


On Tue, Jul 23, 2019 at 12:13 PM Jeff Zhang <zjf...@gmail.com> wrote:

> IIUC, the list of jobs contained in the jar means the jobs you defined in
> the pipeline. In that case I don't think it is Flink's responsibility to
> maintain the job list; it is the job scheduler that defines the pipeline,
> so the job scheduler should maintain the job list.
>
>
>
> On Tue, Jul 23, 2019 at 5:23 PM Flavio Pompermaier <pomperma...@okkam.it> wrote:
>
>> The jobs are related to each other in the sense that we have a
>> configurable pipeline with optional steps that you can enable or disable
>> (and thus we build a single big jar).
>> Because of this, our application's REST service also acts as a job
>> scheduler and uses the job server as a proxy towards Flink: when one step
>> ends (this is what is signalled back from Flink to the application REST
>> service after env.execute()), our application tells the job server to
>> execute the next job of the pipeline on the cluster.
>> Of course this is a "dirty" solution (we should use a workflow scheduler
>> such as Airflow or Luigi), but we wanted to keep things as simple as
>> possible for the moment.
>> In the future, if our customers ever want this part improved, we will
>> probably integrate our application with a dedicated job scheduler like
>> the ones listed above. I don't know whether any of them are integrated
>> with Flink nowadays; when we started coding our frontend application (2
>> years ago), none of them were.
>>
>> Best,
>> Flavio
>>
>> On Tue, Jul 23, 2019 at 10:40 AM Jeff Zhang <zjf...@gmail.com> wrote:
>>
>>> Thanks Flavio,
>>>
>>> I get most of your points except one
>>>
>>>    - Get the list of jobs contained in jar (ideally this is true for
>>>    every engine beyond Spark or Flink)
>>>
>>> Just curious to know how you submit jobs via the REST API: if there are
>>> multiple jobs in one jar, do you upload the jar once and then submit
>>> jobs multiple times?
>>> And is there any relationship between the jobs in the same jar?
>>>
>>>
>>>
>>> On Tue, Jul 23, 2019 at 4:01 PM Flavio Pompermaier <pomperma...@okkam.it> wrote:
>>>
>>>> Hi Jeff, the thing about the manifest is really about having a way to
>>>> list multiple main classes in the jar (without needing to inspect every
>>>> Java class or forcing a 1-to-1 mapping between jar and job, as is the
>>>> case now).
>>>> My requirements were driven by the UI we're using in our framework:
>>>>
>>>>    - Get the list of jobs contained in jar (ideally this is true
>>>>    for every engine beyond Spark or Flink)
>>>>    - Get the list of required/optional parameters for each job
>>>>    - Besides whether it is optional, each parameter should include a
>>>>    help description, a type (to validate the input param), a default
>>>>    value and a set of choices (when there's a limited number of
>>>>    options available)
>>>>    - Obviously the job server should be able to
>>>>    submit/run/cancel/monitor a job and upload/delete the uploaded jars
>>>>    - The job server should not depend on any target platform
>>>>    dependency (Spark or Flink) beyond the REST client: at the moment
>>>>    the REST client requires a lot of core libs (indeed because it needs
>>>>    to submit the job graph/plan)
>>>>    - In our vision, the Flink client should be something like Apache
>>>>    Livy (https://livy.apache.org/)
>>>>    - One of the biggest limitations we face when running a Flink job
>>>>    from the REST API is the fact that the job can't do anything after
>>>>    env.execute(), while we need to call an external service to signal
>>>>    that the job has ended, plus some other details
>>>>
>>>> Best,
>>>> Flavio
>>>>
>>>> On Tue, Jul 23, 2019 at 3:44 AM Jeff Zhang <zjf...@gmail.com> wrote:
>>>>
>>>>> Hi Flavio,
>>>>>
>>>>> Based on the discussion in the tickets you mentioned above, the
>>>>> program-class attribute was a mistake and the community intends to
>>>>> replace it with main-class.
>>>>>
>>>>> Deprecating the Program interface is part of the work on the new Flink
>>>>> client API.
>>>>> IIUC, your requirements are not that complicated, and we can implement
>>>>> them in the new Flink client API. How about listing your requirements so
>>>>> we can discuss how to support them there? BTW, I guess most of your
>>>>> requirements come from your Flink job server. It would be helpful if you
>>>>> could provide more info about it. Thanks
>>>>>
>>>>>
>>>>> On Mon, Jul 22, 2019 at 8:59 PM Flavio Pompermaier <pomperma...@okkam.it> wrote:
>>>>>
>>>>>> Hi Tison,
>>>>>> we use a modified version of the Program interface to enable a web UI
>>>>>> to properly detect and run the Flink jobs contained in a jar, together
>>>>>> with their parameters.
>>>>>> As stated in [1], we detect multiple main classes per jar by handling
>>>>>> an extra comma-separated manifest entry (i.e. 'Main-classes').
>>>>>>
>>>>>> As mentioned in the discussion on the dev ML, our revised Program
>>>>>> interface looks like this:
>>>>>>
>>>>>> import java.util.List;
>>>>>> import java.util.Set;
>>>>>>
>>>>>> public interface FlinkJob {
>>>>>>   String getDescription();
>>>>>>   List<FlinkJobParameter> getParameters();
>>>>>>   boolean isStreamingOrBatch();
>>>>>> }
>>>>>>
>>>>>> public class FlinkJobParameter {
>>>>>>   private String paramName;
>>>>>>   private String paramType = "string"; // used to validate the input
>>>>>>   private String paramDesc;            // help description
>>>>>>   private String paramDefaultValue;
>>>>>>   private Set<String> choices;         // allowed values, when limited
>>>>>>   private boolean mandatory;
>>>>>> }
>>>>>>
>>>>>> I've also opened some JIRA issues related to this topic:
>>>>>>
>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-10864
>>>>>> [2] https://issues.apache.org/jira/browse/FLINK-10862
>>>>>> [3] https://issues.apache.org/jira/browse/FLINK-10879
>>>>>>
>>>>>> Best,
>>>>>> Flavio
>>>>>>
>>>>>>
>>>>>> On Mon, Jul 22, 2019 at 1:46 PM Zili Chen <wander4...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi guys,
>>>>>>>
>>>>>>> We want to have an accurate idea of how many people are implementing
>>>>>>> Flink jobs based on the interface Program, and how they actually
>>>>>>> implement them.
>>>>>>>
>>>>>>> The reason I ask for the survey is this thread [1], where we noticed
>>>>>>> that this codepath is stale and less useful than it should be. As the
>>>>>>> interface is marked @PublicEvolving, it was originally intended to
>>>>>>> serve as a user-facing interface. Thus, before deprecating or dropping
>>>>>>> it, we'd like to see whether there are users implementing their jobs
>>>>>>> based on this interface (org.apache.flink.api.common.Program) and, if
>>>>>>> there are any, we are curious about how it is used.
>>>>>>>
>>>>>>> If few or no Flink users rely on this interface, we would
>>>>>>> propose deprecating or dropping it.
>>>>>>>
>>>>>>> I really appreciate your time and your insight.
>>>>>>>
>>>>>>> Best,
>>>>>>> tison.
>>>>>>>
>>>>>>> [1]
>>>>>>> https://lists.apache.org/thread.html/7ffc9936a384b891dbcf0a481d26c6d13b2125607c200577780d1e18@%3Cdev.flink.apache.org%3E
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Best Regards
>>>>>
>>>>> Jeff Zhang
>>>>>
>>>>
>>>>
>>>>
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>>>
>>
>>
>>
>
> --
> Best Regards
>
> Jeff Zhang
>
