Hi Till, thanks for the feedback and suggestion. I think it makes sense to support only flink-dist-*.jar as a first step. As you suggested, the config option could be "yarn.submission.automatic-flink-dist-upload", with a default of true. Users could use "-yt/--yarnship" to specify an HDFS path that contains flink-dist-*.jar and set the above config option to "false" to disable uploading flink-dist-*.jar.
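The behaviour discussed above can be sketched as a small decision function. This is only an illustration of the proposal, not Flink code: the option name is the one floated in this thread, and the dictionary-based config and the helper names `contains_flink_dist` and `should_upload_local_dist` are hypothetical.

```python
from fnmatch import fnmatch

# Option name as proposed in this thread; an assumption, not a shipped Flink option.
AUTO_UPLOAD_KEY = "yarn.submission.automatic-flink-dist-upload"


def contains_flink_dist(file_names):
    """Return True if any shipped file looks like a flink-dist jar.

    The pattern is deliberately loose so that it also matches names such as
    flink-dist_2.11-1.9.1.jar from the original question.
    """
    return any(fnmatch(name, "flink-dist*.jar") for name in file_names)


def should_upload_local_dist(config, shipped_file_names):
    """Decide whether the client should upload its local flink-dist jar.

    Raises ValueError when upload is disabled but no flink-dist jar was
    shipped, because the containers would have nothing to start from.
    """
    auto_upload = str(config.get(AUTO_UPLOAD_KEY, "true")).lower() == "true"
    if auto_upload:
        return True
    if not contains_flink_dist(shipped_file_names):
        raise ValueError(
            "automatic upload is disabled but no flink-dist jar was shipped"
        )
    return False
```

Failing fast when the jar is missing is a design choice of this sketch; the thread does not say how the client should react in that case.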
Best,
Yang

On Mon, Apr 20, 2020 at 8:07 PM Till Rohrmann <trohrm...@apache.org> wrote:

> Thanks for the clarification Yang. Now it makes sense to me.
>
> If it makes things easier, then I would still go first with the dead simple solution to turn automatic upload of the local dist off via a configuration option before trying to implement a smart solution which relies on pattern matching or something else. For example, users might specify a remote location which is not accessible from the client. Then one could not figure out which files are already uploaded. The smart solution could be a follow-up step then.
>
> Cheers,
> Till
>
> On Mon, Apr 20, 2020 at 1:09 PM Yang Wang <danrtsey...@gmail.com> wrote:
>
>> Hi Till,
>>
>> Sorry for not giving a detailed explanation of the optimization. Actually, the optimization contains the following two parts.
>> * Use remote, already-uploaded jars to avoid unnecessary uploading (e.g. flink-dist-*.jar, user jars, dependencies). This could be done by enriching "-yt/--yarnship" to support remote ship files.
>> * Use the "PUBLIC" or "PRIVATE" visibility of YARN local resources to avoid unnecessary downloading. When a local resource is public, once it is downloaded by a YARN NodeManager, it can be reused by all the applications on the same NodeManager.
>>
>>> Why do we need to specify the visibility of the remote files? Won't the visibility be specified when uploading these files?
>>
>> It is mostly for users who want to eliminate the unnecessary downloading so that the container can be launched faster. "PRIVATE" means the remote jars can be shared by the applications submitted by the current user. "PUBLIC" means the remote jars can be shared by all the Flink applications. And "APPLICATION" means they can only be shared by the containers of the current application on the same NodeManager.
>>
>> For the implementation, I think we could do it step by step.
>> * Enrich "-yt/--yarnship" to support HDFS directories
>> * Add a new config option to control whether to avoid the unnecessary uploading
>> * Enrich "-yt/--yarnship" to specify local resource visibility
>>
>> Best,
>> Yang
>>
>> On Mon, Apr 20, 2020 at 5:26 PM Till Rohrmann <trohrm...@apache.org> wrote:
>>
>>> Shall we say for the first version we only can deactivate the upload of local files instead of doing some optimizations? I guess my problem is that I don't fully understand the optimizations yet. Maybe we introduce a power user config option `yarn.submission.automatic-flink-dist-upload` or so.
>>>
>>> Why do we need to specify the visibility of the remote files? Won't the visibility be specified when uploading these files?
>>>
>>> Apart from that, the proposal looks good to me.
>>>
>>> Cheers,
>>> Till
>>>
>>> On Mon, Apr 20, 2020 at 5:38 AM Yang Wang <danrtsey...@gmail.com> wrote:
>>>
>>>> Hi tison,
>>>>
>>>> I think I get your concerns and points.
>>>>
>>>> Taking both FLINK-13938[1] and FLINK-14964[2] into account, I will proceed in the following steps.
>>>> * Enrich "-yt/--yarnship" to support HDFS directories
>>>> * Enrich "-yt/--yarnship" to specify local resource visibility. It is "APPLICATION" by default. It could also be configured to "PUBLIC", which means shared by all applications, or "PRIVATE", which means shared by applications of the same user.
>>>> * Add a new config option to control whether to optimize the submission (default is false). When configured to true, the Flink client will try to filter the jars and files by name and size to avoid unnecessary uploading.
>>>>
>>>> A very rough submission command could be issued as follows.
>>>> ./bin/flink run -m yarn-cluster -d -yt hdfs://myhdfs/flink/release/flink-1.11:PUBLIC,hdfs://myhdfs/user/someone/mylib \
>>>>   -yD yarn.submission-optimization.enable=true examples/streaming/WindowJoin.jar
>>>>
>>>> cc @Rong Rong <walter...@gmail.com>, since you also helped to review the old PR of FLINK-13938, maybe you could also share some thoughts.
>>>>
>>>> [1]. https://issues.apache.org/jira/browse/FLINK-13938
>>>> [2]. https://issues.apache.org/jira/browse/FLINK-14964
>>>>
>>>> Best,
>>>> Yang
>>>>
>>>> On Sat, Apr 18, 2020 at 12:12 PM tison <wander4...@gmail.com> wrote:
>>>>
>>>>> Hi Yang,
>>>>>
>>>>> Name filtering & scheme special handling make sense to me. We can enrich later if there is a requirement, without breaking the interface.
>>>>>
>>>>> For #1, from my perspective your first proposal is
>>>>>
>>>>> having an option that specifies a remote flink/lib, then we turn off auto uploading of the local flink/lib and register that path as local resources
>>>>>
>>>>> It seems we are adding another special logic here for handling one kind of thing... What I propose is that we do these two steps explicitly separated:
>>>>>
>>>>> 1. an option that turns off auto uploading of the local flink/lib
>>>>> 2. a general option that registers remote files as local resources
>>>>>
>>>>> The remaining point is that you propose we handle flink/lib with PUBLIC visibility and other files with APPLICATION visibility; either a composite configuration or name filtering to specially handle libs makes sense, though.
>>>>>
>>>>> YarnClusterDescriptor already has a lot of special handling logic which introduces a number of config options and keys, which should have been configured in a few common options and validated at runtime.
>>>>>
>>>>> Best,
>>>>> tison.
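The "path[:VISIBILITY]" syntax in Yang's rough command and the "filter by name and size" idea can be sketched as follows. This is a minimal illustration under assumptions, not the Flink implementation: the function names `parse_ship_spec` and `files_to_upload` are hypothetical, the syntax is only what the thread proposes, and file listings are modelled as plain name-to-size dictionaries.

```python
# Visibility levels named in this thread (they mirror YARN local resource visibility).
VISIBILITIES = {"APPLICATION", "PRIVATE", "PUBLIC"}


def parse_ship_spec(spec, default="APPLICATION"):
    """Split a comma-separated '-yt' style value into (path, visibility) pairs."""
    result = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the last ':' only, so the ':' in 'hdfs://' or in a
        # 'host:port' authority is never mistaken for a visibility suffix.
        path, sep, suffix = entry.rpartition(":")
        if sep and suffix.upper() in VISIBILITIES:
            result.append((path, suffix.upper()))
        else:
            result.append((entry, default))
    return result


def files_to_upload(local_files, remote_files):
    """Return local file names with no remote file of the same name and size.

    Both arguments map file name -> size in bytes.
    """
    return sorted(
        name
        for name, size in local_files.items()
        if remote_files.get(name) != size
    )
```

Matching on name and size is cheap but can be fooled by a rebuilt jar of identical length; a checksum comparison would be stricter at the cost of reading the files.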
>>>>>
>>>>> On Fri, Apr 17, 2020 at 11:42 PM Yang Wang <danrtsey...@gmail.com> wrote:
>>>>>
>>>>>> Hi tison,
>>>>>>
>>>>>> For #3, if you mean registering a remote HDFS file as a local resource, we should make "-yt/--yarnship" support remote directories. I think it is the right direction.
>>>>>>
>>>>>> For #1, if users could ship remote directories, then they could also specify something like "-yt hdfs://hdpdev/flink/release/flink-1.x, hdfs://hdpdev/user/someone/mylib". Do you mean we add an option for whether to try to avoid unnecessary uploading? Maybe we could filter by names and file size. I think this is a good suggestion, and we do not need to introduce a new config option "-ypl".
>>>>>>
>>>>>> For #2, for flink-dist, #1 could already solve the problem. We do not need to support remote schemes. It will confuse users when we only support HDFS, not S3, OSS, etc.
>>>>>>
>>>>>> Best,
>>>>>> Yang
>>>>>>
>>>>>> On Fri, Apr 17, 2020 at 8:05 PM tison <wander4...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Yang,
>>>>>>>
>>>>>>> I agree that these two pieces of work would benefit from a single assignee. My concerns are as below.
>>>>>>>
>>>>>>> 1. Both shared libs & remote flink dist/libs are remote ship files. I don't think we have to implement multiple code paths/configurations.
>>>>>>> 2. So, for concept clarification, there are
>>>>>>> (1) an option to disable shipping local libs
>>>>>>> (2) flink-dist supports multiple schemes, at least, as we said, "hdfs://"
>>>>>>> (3) an option for registering remote shipfiles with path & visibility. I think the new configuration system helps.
>>>>>>>
>>>>>>> The reason we have to specially handle (2) instead of including it in (3) is that when shipping flink-dist to the TM container, we specially detect flink-dist. Of course we can merge it into general ship files and validate that the shipfiles finally contain flink-dist, which is an alternative.
>>>>>>>
>>>>>>> The *most important* difference is between (1) and (3), in that we don't have an option for only remote libs. Does this clarification satisfy your proposal?
>>>>>>>
>>>>>>> Best,
>>>>>>> tison.
>>>>>>>
>>>>>>> On Fri, Apr 17, 2020 at 7:49 PM Till Rohrmann <trohrm...@apache.org> wrote:
>>>>>>>
>>>>>>>> Hi Yang,
>>>>>>>>
>>>>>>>> from what I understand it sounds reasonable to me. Could you sync with Tison on FLINK-14964 on how to proceed? I'm not super deep into these issues but they seem to be somewhat related and Tison already did some implementation work.
>>>>>>>>
>>>>>>>> I'd say it would be awesome if we could include this kind of improvement in the release.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Till
>>>>>>>>
>>>>>>>> On Thu, Apr 16, 2020 at 4:43 AM Yang Wang <danrtsey...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi All, thanks a lot for reviving this discussion.
>>>>>>>>>
>>>>>>>>> I think we could unify FLINK-13938 and FLINK-14964 since they have a similar purpose: avoiding unnecessary uploading and downloading of jars in YARN deployments. The difference is that FLINK-13938 aims to support the Flink system lib directory only, while FLINK-14964 is trying to support arbitrary pre-uploaded jars (including user and system jars).
>>>>>>>>>
>>>>>>>>> So I suggest implementing this feature as follows.
>>>>>>>>> 1. Upload the Flink lib directory or user libs to HDFS, e.g. "hdfs://hdpdev/flink/release/flink-1.x" "hdfs://hdpdev/user/someone/mylib"
>>>>>>>>> 2. Use the -ypl argument to specify the shared lib; multiple directories could be specified
>>>>>>>>> 3. YarnClusterDescriptor will use the pre-uploaded jars to avoid unnecessary uploading, both for system and user jars
>>>>>>>>> 4. YarnClusterDescriptor needs to set the system jars to public visibility so that the distributed cache in the YARN NodeManager can be reused by multiple applications. This is to avoid unnecessary downloading, especially for the "flink-dist-*.jar". For the user shared lib, the visibility is still set to "APPLICATION" level.
>>>>>>>>>
>>>>>>>>> For our past internal use case, the shared lib helped accelerate the submission a lot. It also helps to reduce the pressure on HDFS when we want to launch many applications together.
>>>>>>>>>
>>>>>>>>> @tison @Till Rohrmann <trohrm...@apache.org> @Hailu, Andreas <andreas.ha...@gs.com> If you think the suggestion makes sense, I will try to find some time to work on this and hope it can catch up with the release-1.1 cycle.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Yang
>>>>>>>>>
>>>>>>>>> On Thu, Apr 16, 2020 at 8:47 AM Hailu, Andreas [Engineering] <andreas.ha...@gs.com> wrote:
>>>>>>>>>
>>>>>>>>>> Okay, I’ll continue to watch the JIRAs. Thanks for the update, Till.
>>>>>>>>>>
>>>>>>>>>> // ah
>>>>>>>>>>
>>>>>>>>>> From: Till Rohrmann <trohrm...@apache.org>
>>>>>>>>>> Sent: Wednesday, April 15, 2020 10:51 AM
>>>>>>>>>> To: Hailu, Andreas [Engineering] <andreas.ha...@ny.email.gs.com>
>>>>>>>>>> Cc: Yang Wang <danrtsey...@gmail.com>; tison <wander4...@gmail.com>; user@flink.apache.org
>>>>>>>>>> Subject: Re: Flink Conf "yarn.flink-dist-jar" Question
>>>>>>>>>>
>>>>>>>>>> Hi Andreas,
>>>>>>>>>>
>>>>>>>>>> it looks as if FLINK-13938 and FLINK-14964 won't make it into the 1.10.1 release because the community is about to start the release process.
>>>>>>>>>> Since FLINK-13938 is a new feature it will be shipped with a major release. There is still a bit of time until the 1.11 feature freeze and if Yang Wang has time to finish this PR, then we could ship it.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Till
>>>>>>>>>>
>>>>>>>>>> On Wed, Apr 15, 2020 at 3:23 PM Hailu, Andreas [Engineering] <andreas.ha...@gs.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Yang, Tison,
>>>>>>>>>>
>>>>>>>>>> Do we know when some solution for 13938 and 14964 will arrive? Do you think it will be in a 1.10.x version?
>>>>>>>>>>
>>>>>>>>>> // ah
>>>>>>>>>>
>>>>>>>>>> From: Hailu, Andreas [Engineering]
>>>>>>>>>> Sent: Friday, March 20, 2020 9:19 AM
>>>>>>>>>> To: 'Yang Wang' <danrtsey...@gmail.com>
>>>>>>>>>> Cc: tison <wander4...@gmail.com>; user@flink.apache.org
>>>>>>>>>> Subject: RE: Flink Conf "yarn.flink-dist-jar" Question
>>>>>>>>>>
>>>>>>>>>> Hi Yang,
>>>>>>>>>>
>>>>>>>>>> This is good to know. As a stopgap measure until a solution between 13938 and 14964 arrives, we can automate the application staging directory cleanup from our client should the process fail. It’s not ideal, but will at least begin to manage our users’ quota. I’ll continue to watch the two tickets. Thank you.
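The stopgap cleanup described above could start from a selection step like the one below. It is a sketch under assumptions rather than a tested tool: the function name `stale_staging_dirs` is hypothetical, the directory names are assumed to come from a listing of /user/<user>/.flink (for example via `hdfs dfs -ls`), and the active application IDs from something like `yarn application -list`. The actual deletion (for example `hdfs dfs -rm -r`) is left to the caller.

```python
import re

# Staging directories are named after the YARN application ID,
# e.g. application_1583031705852_117863.
APP_DIR_PATTERN = re.compile(r"^application_\d+_\d+$")


def stale_staging_dirs(dir_names, active_app_ids):
    """Return staging directory names whose application is no longer active.

    Entries that do not look like an application ID are never selected,
    so unrelated files in the same parent directory are left alone.
    """
    active = set(active_app_ids)
    return [
        name
        for name in dir_names
        if APP_DIR_PATTERN.match(name) and name not in active
    ]
```

Only directories of applications that are not currently running are selected, so a job that is still starting up keeps its staging files.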
>>>>>>>>>>
>>>>>>>>>> // ah
>>>>>>>>>>
>>>>>>>>>> From: Yang Wang <danrtsey...@gmail.com>
>>>>>>>>>> Sent: Monday, March 16, 2020 9:37 PM
>>>>>>>>>> To: Hailu, Andreas [Engineering] <andreas.ha...@ny.email.gs.com>
>>>>>>>>>> Cc: tison <wander4...@gmail.com>; user@flink.apache.org
>>>>>>>>>> Subject: Re: Flink Conf "yarn.flink-dist-jar" Question
>>>>>>>>>>
>>>>>>>>>> Hi Hailu,
>>>>>>>>>>
>>>>>>>>>> Sorry for the late response. If the Flink cluster (e.g. the Yarn application) is stopped directly by `yarn application -kill`, then the staging directory will be left behind, since the jobmanager does not have any chance to clean up the staging directory. It may also happen when the jobmanager crashes and reaches the attempts limit of Yarn.
>>>>>>>>>>
>>>>>>>>>> For FLINK-13938, yes, it is trying to use the Yarn public cache to accelerate the container launch.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Yang
>>>>>>>>>>
>>>>>>>>>> On Tue, Mar 10, 2020 at 4:38 AM Hailu, Andreas <andreas.ha...@gs.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Also may I ask what causes these application ID directories to be left behind? Is it a job failure, or can they persist even if the application succeeds? I’d like to know so that I can implement my own cleanup in the interim to prevent exceeding user disk space quotas.
>>>>>>>>>>
>>>>>>>>>> // ah
>>>>>>>>>>
>>>>>>>>>> From: Hailu, Andreas [Engineering]
>>>>>>>>>> Sent: Monday, March 9, 2020 1:20 PM
>>>>>>>>>> To: 'Yang Wang' <danrtsey...@gmail.com>
>>>>>>>>>> Cc: tison <wander4...@gmail.com>; user@flink.apache.org
>>>>>>>>>> Subject: RE: Flink Conf "yarn.flink-dist-jar" Question
>>>>>>>>>>
>>>>>>>>>> Hi Yang,
>>>>>>>>>>
>>>>>>>>>> Yes, a combination of these two would be very helpful for us. We have a single shaded binary which we use to run all of the jobs on our YARN cluster. If we could designate a single location in HDFS for that as well, we could also greatly benefit from FLINK-13938.
>>>>>>>>>>
>>>>>>>>>> It sounds like a general public cache solution is what’s being called for?
>>>>>>>>>>
>>>>>>>>>> // ah
>>>>>>>>>>
>>>>>>>>>> From: Yang Wang <danrtsey...@gmail.com>
>>>>>>>>>> Sent: Sunday, March 8, 2020 10:52 PM
>>>>>>>>>> To: Hailu, Andreas [Engineering] <andreas.ha...@ny.email.gs.com>
>>>>>>>>>> Cc: tison <wander4...@gmail.com>; user@flink.apache.org
>>>>>>>>>> Subject: Re: Flink Conf "yarn.flink-dist-jar" Question
>>>>>>>>>>
>>>>>>>>>> Hi Hailu, tison,
>>>>>>>>>>
>>>>>>>>>> I created a very similar ticket before to accelerate Flink submission on Yarn[1]. However, we did not reach a consensus in the PR. Maybe it's time to revive the discussion and try to find a common solution for both tickets[1][2].
>>>>>>>>>>
>>>>>>>>>> [1]. https://issues.apache.org/jira/browse/FLINK-13938
>>>>>>>>>> [2]. https://issues.apache.org/jira/browse/FLINK-14964
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Yang
>>>>>>>>>>
>>>>>>>>>> On Sat, Mar 7, 2020 at 11:21 AM Hailu, Andreas <andreas.ha...@gs.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Tison, thanks for the reply. I’ve replied to the ticket. I’ll be watching it as well.
>>>>>>>>>>
>>>>>>>>>> // ah
>>>>>>>>>>
>>>>>>>>>> From: tison <wander4...@gmail.com>
>>>>>>>>>> Sent: Friday, March 6, 2020 1:40 PM
>>>>>>>>>> To: Hailu, Andreas [Engineering] <andreas.ha...@ny.email.gs.com>
>>>>>>>>>> Cc: user@flink.apache.org
>>>>>>>>>> Subject: Re: Flink Conf "yarn.flink-dist-jar" Question
>>>>>>>>>>
>>>>>>>>>> FLINK-13938 seems a bit different from your requirement. The one that totally matches is FLINK-14964 <https://issues.apache.org/jira/browse/FLINK-14964>. I'd appreciate it if you could share your opinion on the JIRA ticket.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> tison.
>>>>>>>>>>
>>>>>>>>>> On Sat, Mar 7, 2020 at 2:35 AM tison <wander4...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Yes, your requirement is exactly what the community has taken into consideration. We currently have an open JIRA ticket for the specific feature[1], and work on loosening the constraint on the flink-jar scheme to support DFS locations should happen.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> tison.
>>>>>>>>>>
>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-13938
>>>>>>>>>>
>>>>>>>>>> On Sat, Mar 7, 2020 at 2:03 AM Hailu, Andreas <andreas.ha...@gs.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> We noticed that every time an application runs, it uploads the flink-dist artifact to the /user/<user>/.flink HDFS directory. This causes a user disk space quota issue as we submit thousands of apps to our cluster an hour. We had a similar problem with our Spark applications where it uploaded the Spark Assembly package for every app. Spark provides an argument to use a location in HDFS for its applications to leverage so they don’t need to upload them for every run, and that was our solution (see “spark.yarn.jar” configuration if interested.)
>>>>>>>>>>
>>>>>>>>>> Looking at the Resource Orchestration Frameworks page <https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#yarn-flink-dist-jar>, I see there might be a similar concept through a “yarn.flink-dist-jar” configuration option. I wanted to place the flink-dist package we’re using in a location in HDFS and configure our jobs to point to it, e.g.
>>>>>>>>>>
>>>>>>>>>> yarn.flink-dist-jar: hdfs:////user/delp/.flink/flink-dist_2.11-1.9.1.jar
>>>>>>>>>>
>>>>>>>>>> Am I correct in that this is what I’m looking for? I gave this a try with some jobs today, and based on what I’m seeing in the launch_container.sh in our YARN application, it still looks like it’s being uploaded:
>>>>>>>>>>
>>>>>>>>>> export _FLINK_JAR_PATH="hdfs://d279536/user/delp/.flink/application_1583031705852_117863/flink-dist_2.11-1.9.1.jar"
>>>>>>>>>>
>>>>>>>>>> How can I confirm? Or is this perhaps not the config I’m looking for?
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Andreas
>>>>>>>>>>
>>>>>>>>>> ------------------------------
>>>>>>>>>>
>>>>>>>>>> Your Personal Data: We may collect and process information about you that may be subject to data protection laws.
>>>>>>>>>> For more information about how we use and disclose your personal data, how we protect your information, our legal basis to use your information, your rights and who you can contact, please refer to: www.gs.com/privacy-notices