This seems related: [SPARK-10123][DEPLOY] Support specifying deploy mode from configuration
FYI

On Wed, Dec 16, 2015 at 7:31 AM, Saiph Kappa <[email protected]> wrote:
> Hi,
>
> I have a client application running on host0 that is launching multiple
> drivers on multiple remote standalone Spark clusters (each cluster is
> running on a single machine):
>
> «
> ...
>
> List("host1", "host2", "host3").foreach(host => {
>
>   val sparkConf = new SparkConf()
>   sparkConf.setAppName("App")
>
>   sparkConf.set("spark.driver.memory", "4g")
>   sparkConf.set("spark.executor.memory", "4g")
>   sparkConf.set("spark.driver.maxResultSize", "4g")
>   sparkConf.set("spark.serializer",
>     "org.apache.spark.serializer.KryoSerializer")
>   sparkConf.set("spark.executor.extraJavaOptions", " -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC " +
>     "-XX:+AggressiveOpts -XX:FreqInlineSize=300 -XX:MaxInlineSize=300 ")
>
>   sparkConf.setMaster(s"spark://$host:7077")
>
>   val rawStreams = (1 to source.parallelism).map(_ =>
>     ssc.textFileStream("/home/user/data/")).toArray
>   val rawStream = ssc.union(rawStreams)
>   rawStream.count.map(c => s"Received $c records.").print()
>
> })
> ...
>
> »
>
> The problem is that I'm getting an error message saying that the directory
> "/home/user/data/" does not exist.
> In fact, this directory only exists on host1, host2, and host3, and not on
> host0.
> But since I'm launching the drivers on host1..3, I thought the data would be
> fetched from those machines.
>
> I'm also trying to avoid using the spark-submit script, and I couldn't find
> a configuration parameter for specifying the deploy mode.
>
> Is there any way to specify the deploy mode through a configuration
> parameter?
>
>
> Thanks.
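If the change from that JIRA is available in the Spark version in use, the configuration-only route would presumably look something like the sketch below. This is only an illustration of the idea, assuming a `spark.submit.deployMode` property as described in the JIRA title; with it set to "cluster", the driver runs on the cluster host, so a driver-side path like "/home/user/data/" would resolve there rather than on host0:

```scala
import org.apache.spark.SparkConf

// Sketch only: assumes the spark.submit.deployMode property from
// [SPARK-10123] exists in the Spark version being used.
val sparkConf = new SparkConf()
  .setAppName("App")
  .setMaster("spark://host1:7077")
  // Ask for the driver to be launched inside the cluster rather than in
  // the client process on host0.
  .set("spark.submit.deployMode", "cluster")
```

Without that property (or an equivalent), the deploy mode can only be chosen via spark-submit's --deploy-mode flag, as far as I know.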
