Hi Hailu, tison,

I created a very similar ticket earlier to accelerate Flink submission on Yarn[1]. However, we did not reach a consensus in the PR. Maybe it's time to revive the discussion and try to find a common solution for both tickets[1][2].
[1] https://issues.apache.org/jira/browse/FLINK-13938
[2] https://issues.apache.org/jira/browse/FLINK-14964

Best,
Yang

On Sat, Mar 7, 2020 at 11:21 AM, Hailu, Andreas <andreas.ha...@gs.com> wrote:

> Hi Tison, thanks for the reply. I’ve replied to the ticket. I’ll be
> watching it as well.
>
> *// *ah
>
> *From:* tison <wander4...@gmail.com>
> *Sent:* Friday, March 6, 2020 1:40 PM
> *To:* Hailu, Andreas [Engineering] <andreas.ha...@ny.email.gs.com>
> *Cc:* user@flink.apache.org
> *Subject:* Re: Flink Conf "yarn.flink-dist-jar" Question
>
> FLINK-13938 seems a bit different from your requirement. The one that
> matches exactly is FLINK-14964
> <https://issues.apache.org/jira/browse/FLINK-14964>.
> I'd appreciate it if you could share your opinion on the JIRA ticket.
>
> Best,
> tison.
>
> On Sat, Mar 7, 2020 at 2:35 AM, tison <wander4...@gmail.com> wrote:
>
> Yes, your requirement is exactly what the community has taken into
> consideration. We currently have an open JIRA ticket for this specific
> feature[1], and work on loosening the constraint on the flink-jar schema
> to support a DFS location should happen.
>
> Best,
> tison.
>
> [1] https://issues.apache.org/jira/browse/FLINK-13938
>
> On Sat, Mar 7, 2020 at 2:03 AM, Hailu, Andreas <andreas.ha...@gs.com> wrote:
>
> Hi,
>
> We noticed that every time an application runs, it uploads the flink-dist
> artifact to the /user/<user>/.flink HDFS directory.
> This causes a user disk space quota issue, as we submit thousands of apps
> to our cluster an hour. We had a similar problem with our Spark
> applications, where the Spark Assembly package was uploaded for every app.
> Spark provides an argument to use a location in HDFS for its applications
> to leverage so they don’t need to upload it for every run, and that was
> our solution (see the “spark.yarn.jar” configuration if interested).
>
> Looking at the Resource Orchestration Frameworks page
> <https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#yarn-flink-dist-jar>,
> I see there might be a similar concept through the “yarn.flink-dist-jar”
> configuration option. I wanted to place the flink-dist package we’re using
> in a location in HDFS and configure our jobs to point to it, e.g.
>
> yarn.flink-dist-jar: hdfs:////user/delp/.flink/flink-dist_2.11-1.9.1.jar
>
> Am I correct that this is what I’m looking for? I gave this a try with
> some jobs today, and based on what I’m seeing in the launch_container.sh in
> our YARN application, it still looks like it’s being uploaded:
>
> export _FLINK_JAR_PATH="hdfs://d279536/user/delp/.flink/application_1583031705852_117863/flink-dist_2.11-1.9.1.jar"
>
> How can I confirm? Or is this perhaps not the config I’m looking for?
>
> Best,
> Andreas
>
> ------------------------------
>
> Your Personal Data: We may collect and process information about you that
> may be subject to data protection laws.
> For more information about how we use and disclose your personal data,
> how we protect your information, our legal basis to use your information,
> your rights and who you can contact, please refer to:
> www.gs.com/privacy-notices
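
On the "How can I confirm?" question in the quoted message: one rough check, sketched here in Python, is to read the _FLINK_JAR_PATH export from launch_container.sh and see whether the path contains a per-application directory (application_<timestamp>_<sequence>), as it does in the export line quoted in this thread. The helper names are hypothetical, and the heuristic is an assumption based only on the path shown in this thread, not on documented Flink behavior.

```python
import re

# YARN application IDs have the form application_<clusterTimestamp>_<sequence>.
APP_ID = re.compile(r"application_\d+_\d+")
# Matches the export line quoted from launch_container.sh in this thread.
EXPORT = re.compile(r'export\s+_FLINK_JAR_PATH="([^"]+)"')


def flink_jar_path(launch_script: str):
    """Return the value of _FLINK_JAR_PATH from the script text, or None if absent."""
    match = EXPORT.search(launch_script)
    return match.group(1) if match else None


def looks_uploaded_per_app(jar_path: str) -> bool:
    """Heuristic (assumption): an application ID directory in the path
    suggests the jar was staged separately for this run."""
    return APP_ID.search(jar_path) is not None


# The export line from the quoted message.
script = (
    'export _FLINK_JAR_PATH="hdfs://d279536/user/delp/.flink/'
    'application_1583031705852_117863/flink-dist_2.11-1.9.1.jar"'
)
path = flink_jar_path(script)
print(path, looks_uploaded_per_app(path))
```

If the configured shared location were being used directly, one would expect the path to match the configured value rather than a per-application directory; this check only inspects the string and does not query HDFS.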