Thanks for updating the SEP wiki. The revised design looks clear to me. Some config name might be simplified, e.g. job.config.loader.properties.path. Overall it looks good.
Thanks, Xinyu On Fri, Dec 6, 2019 at 1:52 PM Ke Wu <ke.wu...@gmail.com> wrote: > As we are revamping our config loading logic in SEP-23: Simplify Job > Runner, we will introduce backward incompatible changes on the job > submission logic, i.e. deprecating the usage of --config-factory and > --config-path, which reads a full config upon job submission, instead, we > will only provide job submission related config through --config and > ConfigLoader will load the complete config on AM instead. > > Please take a look and chime in your thoughts. > > Best, > Ke > > > On Dec 2, 2019, at 4:42 PM, Ke Wu <ke.wu...@gmail.com> wrote: > > > > Hi Xinyu, > > > > Please see the response in line: > > > >> 1. After this change, seems the original config-factory and config-path > >> are only used to supply parameters for submitting job. Is that the > case? > >> Which configs are still needed in the submission? > > > > Yes, only configs related to job submission is needed. > > > > job.name, job.factory.class & yarn.package.path are the minimum three > configs needed for the job submission, which may be supplied by --config > instead. > > > >> 2. For backward compatibility, does it still work if the user doesn't > >> specify the new ConfigLoader in the command line? The > >> PropertiesConfigLoader class seems requiring the path of the config > after > >> exploding the tgz. > > > > If the user does not specify config loader in the config, then it will > work in the previous flow, where runner publishes configs in coordinator > stream and job coordinator/application master will pick it up by reading > from Kafka. So this is a backward compatible change. > > > >> 3. If the final plan is to remove the original config factory/path, how > >> do we pass the parameters needed for Yarn submission, e.g. job name, > id, > >> and tgz path? > > > > We can either pass them by --config or introduce delicate command line > arguments for it in CommandLine.scala. > > > > > > Let me know if you have any further questions. > > > > Best, > > Ke > > > >> On Nov 27, 2019, at 11:02 AM, Xinyu Liu <xinyuliu...@gmail.com> wrote: > >> > >> Thanks a lot for putting out the design for simplifying the job > submission > >> process. The motivation makes sense to me that most of the planning and > >> config generation should be done after submitting to the cluster, > instead > >> of during the submission, which can happen in a local sandbox without > the > >> access to the resources needed for planning. It also improves the > process > >> from the security stand of the view. > >> > >> A few questions regarding to the interface changes: > >> > >> 1. After this change, seems the original config-factory and config-path > >> are only used to supply parameters for submitting job. Is that the > case? > >> Which configs are still needed in the submission? > >> 2. For backward compatibility, does it still work if the user doesn't > >> specify the new ConfigLoader in the command line? The > >> PropertiesConfigLoader class seems requiring the path of the config > after > >> exploding the tgz. > >> 3. If the final plan is to remove the original config factory/path, how > >> do we pass the parameters needed for Yarn submission, e.g. job name, > id, > >> and tgz path? > >> > >> Thanks, > >> Xinyu > >> > >> On Fri, Nov 15, 2019 at 3:00 PM Ke Wu <ke.wu...@gmail.com> wrote: > >> > >>> We created SEP-23: Simplify Job Runner, which simplifies job runner by > >>> moving config retrieval and planning to AM. > >>> > >>> Please find out the SEP wiki below: > >>> > >>> > https://cwiki.apache.org/confluence/display/SAMZA/SEP-23%3A+Simplify+Job+Runner > >>> > >>> Please take a look and chime in your thoughts. > >>> > >>> Thanks, > >>> Ke > >>> > > > >