RE: Official Stance on Not Using Spark Submit

2016-10-13 Thread assaf.mendelson
:14 PM To: Mendelson, Assaf Subject: Re: Official Stance on Not Using Spark Submit Just folks who don't want to use spark-submit, no real use-cases I've seen yet. I didn't know about SparkLauncher myself and I don't think there are any official docs on that or launching

Re: Official Stance on Not Using Spark Submit

2016-10-10 Thread Ofir Manor
Funny, someone from my team talked to me about that idea yesterday. We use SparkLauncher, but it just calls spark-submit, which calls other scripts that start a new Java program that tries to submit (in our case in cluster mode - the driver is started in the Spark cluster) and exits. That makes it a chall
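
For readers following the thread, the chain described above (a program that ends up shelling out to spark-submit) can be sketched in Python. This is only an illustration, not how SparkLauncher is implemented: the helper names build_spark_submit_command and launch, the jar path, and the class name are all hypothetical, and the sketch assumes spark-submit is on the PATH. The flags used (--master, --deploy-mode, --class, --conf) are standard spark-submit options.

```python
import shlex
import subprocess
from typing import Dict, List, Optional


def build_spark_submit_command(
    app_resource: str,
    master: str,
    deploy_mode: str = "client",
    main_class: Optional[str] = None,
    conf: Optional[Dict[str, str]] = None,
    app_args: Optional[List[str]] = None,
    spark_submit: str = "spark-submit",  # assumed to be on the PATH
) -> List[str]:
    """Assemble the argv list for a spark-submit invocation (hypothetical helper)."""
    cmd = [spark_submit, "--master", master, "--deploy-mode", deploy_mode]
    if main_class:
        cmd += ["--class", main_class]
    # Sort the conf keys so the generated command is deterministic.
    for key, value in sorted((conf or {}).items()):
        cmd += ["--conf", f"{key}={value}"]
    # The application jar or script comes after all spark-submit options,
    # followed by the application's own arguments.
    cmd.append(app_resource)
    cmd += list(app_args or [])
    return cmd


def launch(cmd: List[str]) -> subprocess.Popen:
    """Start spark-submit as a child process without waiting for it to finish."""
    return subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)


if __name__ == "__main__":
    example = build_spark_submit_command(
        app_resource="/path/to/app.jar",   # hypothetical path
        master="yarn",
        deploy_mode="cluster",
        main_class="com.example.Main",     # hypothetical class
        conf={"spark.executor.memory": "2g"},
        app_args=["--input", "data.csv"],
    )
    # Print the command instead of running it, so the sketch works without Spark.
    print(" ".join(shlex.quote(part) for part in example))
```

Keeping command construction separate from the process launch means the argument list can be inspected or unit-tested without a Spark installation; calling launch(example) would actually start the child process.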

Re: Official Stance on Not Using Spark Submit

2016-10-10 Thread Russell Spitzer
Just folks who don't want to use spark-submit, no real use-cases I've seen yet. I didn't know about SparkLauncher myself and I don't think there are any official docs on that or launching spark as an embedded library for tests. On Mon, Oct 10, 2016 at 11:09 AM Matei Zaharia wrote: > What are th

Re: Official Stance on Not Using Spark Submit

2016-10-10 Thread Matei Zaharia
What are the main use cases you've seen for this? Maybe we can add a page to the docs about how to launch Spark as an embedded library. Matei > On Oct 10, 2016, at 10:21 AM, Russell Spitzer > wrote: > > I actually had not seen SparkLauncher before, that looks pretty great :) > > On Mon, Oct

Re: Official Stance on Not Using Spark Submit

2016-10-10 Thread Russell Spitzer
I actually had not seen SparkLauncher before, that looks pretty great :) On Mon, Oct 10, 2016 at 10:17 AM Russell Spitzer wrote: > I'm definitely only talking about non-embedded uses here as I also use > embedded Spark (cassandra, and kafka) to run tests. This is almost always > safe since every

Re: Official Stance on Not Using Spark Submit

2016-10-10 Thread Russell Spitzer
I'm definitely only talking about non-embedded uses here, as I also use embedded Spark (Cassandra and Kafka) to run tests. This is almost always safe since everything is in the same JVM. It's only once we launch against a real distributed environment that we end up with issues. Since PySpark uses

Re: Official Stance on Not Using Spark Submit

2016-10-10 Thread Sean Owen
I have also 'embedded' a Spark driver without much trouble. It isn't that it can't work. The Launcher API is probably the recommended way to do that though. spark-submit is the way to go for non-programmatic access. If you're not doing one of those things and it is not working, yeah I think peopl

Re: Official Stance on Not Using Spark Submit

2016-10-10 Thread Marcin Tustin
I've done this for some PySpark stuff. I didn't find it especially problematic. On Mon, Oct 10, 2016 at 12:58 PM, Reynold Xin wrote: > How are they using it? Calling some main function directly? > > > On Monday, October 10, 2016, Russell Spitzer > wrote: > >> I've seen a variety of users attemp

Re: Official Stance on Not Using Spark Submit

2016-10-10 Thread Reynold Xin
How are they using it? Calling some main function directly? On Monday, October 10, 2016, Russell Spitzer wrote: > I've seen a variety of users attempting to work around using Spark Submit > with at best middling levels of success. I think it would be helpful if the > project had a clear statemen