Re: Change for submitting to yarn in 1.3.1

2015-05-25 Thread Chester Chen
I put the design requirements and description in the commit comment, so I will close the PR. Please refer to the following commit: https://github.com/AlpineNow/spark/commit/5b336bbfe92eabca7f4c20e5d49e51bb3721da4d On Mon, May 25, 2015 at 3:21 PM, Chester Chen wrote: > All, > I have created a

Re: Change for submitting to yarn in 1.3.1

2015-05-25 Thread Chester Chen
All, I have created a PR just to help document the use case, requirements and design. It is unlikely to get merged in, so it is only used to illustrate the problems we are trying to solve and the approaches we took. https://github.com/apache/spark/pull/6398 Hope this helps

Re: Change for submitting to yarn in 1.3.1

2015-05-22 Thread Marcelo Vanzin
Hi Kevin, One thing that might help you in the meantime, while we work on a better interface for all this... On Thu, May 21, 2015 at 5:21 PM, Kevin Markey wrote: > Making *yarn.Client* private has prevented us from moving from Spark > 1.0.x to Spark 1.2 or 1.3 despite many alluring new features

Re: Change for submitting to yarn in 1.3.1

2015-05-21 Thread Koert Kuipers
we also launch jobs programmatically, both on standalone mode and yarn-client mode. in standalone mode it always worked, in yarn-client mode we ran into some issues and were forced to use spark-submit, but i still have on my todo list to move back to a normal java launch without spark-submit at som

Re: Change for submitting to yarn in 1.3.1

2015-05-21 Thread Nathan Kronenfeld
Thanks, Marcelo > Instantiating SparkContext directly works. Well, sorta: it has > limitations. For example, see discussions about Spark not really liking > multiple contexts in the same JVM. It also does not work in "cluster" > deploy mode. > > That's fine - when one is doing something out of s

Re: Change for submitting to yarn in 1.3.1

2015-05-21 Thread Marcelo Vanzin
Hi Nathan, On Thu, May 21, 2015 at 7:30 PM, Nathan Kronenfeld < nkronenfeld@uncharted.software> wrote: > > >> In researching and discussing these issues with Cloudera and others, >> we've been told that only one mechanism is supported for starting Spark >> jobs: the *spark-submit* scripts. >> > >

Re: Change for submitting to yarn in 1.3.1

2015-05-21 Thread Nathan Kronenfeld
> In researching and discussing these issues with Cloudera and others, we've > been told that only one mechanism is supported for starting Spark jobs: the > *spark-submit* scripts. > Is this new? We've been submitting jobs directly from a programmatically created spark context (instead of through s

Re: Change for submitting to yarn in 1.3.1

2015-05-21 Thread Marcelo Vanzin
Hi Kevin, I read through your e-mail and I see two main things you're talking about. - You want a public YARN "Client" class and don't really care about anything else. In your message you already mention why that's not a good idea. It's much better to have a standardized submission API. As you no

Re: Change for submitting to yarn in 1.3.1

2015-05-21 Thread Kevin Markey
This is an excellent discussion.  As mentioned in an earlier email, we agree with a number of Chester's suggestions, but we have yet other concerns.  I've researched this further in the past several days, and I've queried my team.  This email attempts to

Re: Change for submitting to yarn in 1.3.1

2015-05-15 Thread Marcelo Vanzin
Hi Chester, Writing a design / requirements doc sounds great. One comment though: On Thu, May 14, 2015 at 11:18 PM, Chester At Work wrote: > For #5 yes, it's about the command line args. These args are > the input for the spark jobs. Seems a bit too much to create a file just to > sp

Re: Change for submitting to yarn in 1.3.1

2015-05-14 Thread Chester At Work
Marcelo, Thanks for the comments. All my requirements are from our work over the last year in yarn-cluster mode, so I am biased on the yarn side. It's true some of the tasks might be accomplished with a separate yarn API call, but the API just does not seem to be that natural any more if w

Re: Change for submitting to yarn in 1.3.1

2015-05-14 Thread Marcelo Vanzin
Hi Chester, Thanks for the feedback. A few of those are great candidates for improvements to the launcher library. On Wed, May 13, 2015 at 5:44 AM, Chester At Work wrote: > 1) client should not be private ( unless alternative is provided) so > we can call it directly. > Patrick already to

Re: Change for submitting to yarn in 1.3.1

2015-05-13 Thread Chester @work
Patrick, Thanks for responding. Yes, many of them are feature requests, not private client related. These are the things I have been working with since last year. I have been trying to push the PR for these changes. If the new Launcher lib is the way to go, we will try to work with new APIs. T

Re: Change for submitting to yarn in 1.3.1

2015-05-13 Thread Patrick Wendell
Hey Chester, Thanks for sending this. It's very helpful to have this list. The reason we made the Client API private was that it was never intended to be used by third parties programmatically and we don't intend to support it in its current form as a stable API. We thought the fact that it was f

Re: Change for submitting to yarn in 1.3.1

2015-05-13 Thread Chester At Work
Patrick, There are several things we need, some of them already mentioned in the mailing list before. I haven't looked at the SparkLauncher code, but here are a few things we need from our perspective for Spark Yarn Client 1) client should not be private ( unless alternative is provid

Re: Change for submitting to yarn in 1.3.1

2015-05-12 Thread Patrick Wendell
Hey Kevin and Ron, So is the main shortcoming of the launcher library the inability to get an app ID back from YARN? Or are there other issues here that fundamentally regress things for you? It seems like adding a way to get back the appID would be a reasonable addition to the launcher. - Patric

Re: Change for submitting to yarn in 1.3.1

2015-05-12 Thread Marcelo Vanzin
On Tue, May 12, 2015 at 11:34 AM, Kevin Markey wrote: > I understand that SparkLauncher was supposed to address these issues, but > it really doesn't. Yarn already provides indirection and an arm's length > transaction for starting Spark on a cluster. The launcher introduces yet > another layer

Re: Change for submitting to yarn in 1.3.1

2015-05-12 Thread Kevin Markey
We have the same issue. As a result, we are stuck back on 1.0.2. Not being able to programmatically interface directly with the Yarn client to obtain the application id is a show stopper for us, which is a real shame given the Yarn enhancements in 1.2, 1.3, and 1.4. I understand that SparkLaun

Re: Change for submitting to yarn in 1.3.1

2015-05-11 Thread Mridul Muralidharan
That works when it is launched from same process - which is unfortunately not our case :-) - Mridul On Sun, May 10, 2015 at 9:05 PM, Manku Timma wrote: > sc.applicationId gives the yarn appid. > > On 11 May 2015 at 08:13, Mridul Muralidharan wrote: >> >> We had a similar requirement, and as a s

Re: Change for submitting to yarn in 1.3.1

2015-05-10 Thread Manku Timma
sc.applicationId gives the yarn appid. On 11 May 2015 at 08:13, Mridul Muralidharan wrote: > We had a similar requirement, and as a stopgap, I currently use a > suboptimal impl specific workaround - parsing it out of the > stdout/stderr (based on log config). > A better means to get to this is i

Re: Change for submitting to yarn in 1.3.1

2015-05-10 Thread Mridul Muralidharan
We had a similar requirement, and as a stopgap, I currently use a suboptimal, implementation-specific workaround: parsing it out of the stdout/stderr (based on log config). A better means to get to this is indeed required! Regards, Mridul On Sun, May 10, 2015 at 7:33 PM, Ron's Yahoo! wrote: > Hi, > I u
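
For readers unfamiliar with the stopgap described above, here is a minimal sketch of pulling a YARN application id out of captured stdout/stderr text. It is not the code used by anyone in this thread. It relies only on the fact that YARN application ids have the form application_<clusterTimestamp>_<sequence>; the function name `find_yarn_app_id` and the sample log line are hypothetical, and real log output depends on the log configuration.

```python
import re
from typing import Optional

# YARN application ids look like application_<clusterTimestamp>_<sequence>.
APP_ID_PATTERN = re.compile(r"\bapplication_\d+_\d+\b")


def find_yarn_app_id(log_text: str) -> Optional[str]:
    """Return the first YARN application id found in log_text, or None.

    Hypothetical helper for illustration: it scans whatever text was
    captured from the submitting process's stdout/stderr.
    """
    match = APP_ID_PATTERN.search(log_text)
    return match.group(0) if match else None


if __name__ == "__main__":
    # Made-up log line; the actual wording depends on the logging setup.
    sample = "INFO Client: Submitted application application_1431300000000_0042"
    print(find_yarn_app_id(sample))
```

As the message notes, this kind of parsing is fragile: it breaks if the log level or format changes, which is why the thread asks for a supported way to get the id back.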

Change for submitting to yarn in 1.3.1

2015-05-10 Thread Ron's Yahoo!
Hi, I used to submit my Spark yarn applications by using the org.apache.spark.deploy.yarn.Client API so I could get the application id after I submit it. The following is the code that I have, but after upgrading to 1.3.1, the yarn Client class was made into a private class. Is there a particular r