As far as I can tell, it is not in the 1.6.0 RC. You can comment on the JIRA to request a backport to 1.6.1.
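For reference, the fix adds a spark.submit.deployMode property that the submission machinery reads in place of the --deploy-mode flag. A minimal sketch, assuming a build that contains SPARK-10123 (property name taken from the PR; note it is consulted at submit time, not by an already-running SparkContext):

    import org.apache.spark.SparkConf

    // Assumes a post-SPARK-10123 build; the property is read by the
    // submission path, so it replaces the --deploy-mode CLI flag.
    val sparkConf = new SparkConf()
      .setAppName("App")
      .setMaster("spark://host1:7077")
      .set("spark.submit.deployMode", "cluster")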
Cheers

On Thu, Dec 17, 2015 at 5:28 AM, Saiph Kappa <saiph.ka...@gmail.com> wrote:

> I am not sure how the process works and whether patches are applied to all
> upcoming versions of Spark. Is it likely that the fix is available in this
> build (spark 1.6.0, 17-Dec-2015 09:02)?
> http://people.apache.org/~pwendell/spark-nightly/spark-master-bin/latest/
>
> Thanks!
>
> On Wed, Dec 16, 2015 at 9:22 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Since both Scala and Java files are involved in the PR, I don't see an
>> easy way around it without building Spark yourself.
>>
>> Cheers
>>
>> On Wed, Dec 16, 2015 at 10:18 AM, Saiph Kappa <saiph.ka...@gmail.com> wrote:
>>
>>> Exactly, but it's only fixed for the next Spark version. Is there any
>>> workaround for version 1.5.2?
>>>
>>> On Wed, Dec 16, 2015 at 4:36 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>
>>>> This seems related:
>>>> [SPARK-10123][DEPLOY] Support specifying deploy mode from configuration
>>>>
>>>> FYI
>>>>
>>>> On Wed, Dec 16, 2015 at 7:31 AM, Saiph Kappa <saiph.ka...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have a client application running on host0 that launches multiple
>>>>> drivers on multiple remote standalone Spark clusters (each cluster
>>>>> runs on a single machine):
>>>>>
>>>>> «
>>>>> ...
>>>>>
>>>>> List("host1", "host2", "host3").foreach(host => {
>>>>>
>>>>>   val sparkConf = new SparkConf()
>>>>>   sparkConf.setAppName("App")
>>>>>
>>>>>   sparkConf.set("spark.driver.memory", "4g")
>>>>>   sparkConf.set("spark.executor.memory", "4g")
>>>>>   sparkConf.set("spark.driver.maxResultSize", "4g")
>>>>>   sparkConf.set("spark.serializer",
>>>>>     "org.apache.spark.serializer.KryoSerializer")
>>>>>   sparkConf.set("spark.executor.extraJavaOptions",
>>>>>     " -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC " +
>>>>>     "-XX:+AggressiveOpts -XX:FreqInlineSize=300 -XX:MaxInlineSize=300 ")
>>>>>
>>>>>   sparkConf.setMaster(s"spark://$host:7077")
>>>>>
>>>>>   val rawStreams = (1 to source.parallelism).map(_ =>
>>>>>     ssc.textFileStream("/home/user/data/")).toArray
>>>>>   val rawStream = ssc.union(rawStreams)
>>>>>   rawStream.count.map(c => s"Received $c records.").print()
>>>>>
>>>>> })
>>>>> ...
>>>>>
>>>>> »
>>>>>
>>>>> The problem is that I'm getting an error message saying that the
>>>>> directory "/home/user/data/" does not exist.
>>>>> In fact, this directory only exists on host1, host2, and host3, not on
>>>>> host0. But since I'm launching the drivers on host1..3, I thought the
>>>>> data would be fetched from those machines.
>>>>>
>>>>> I'm also trying to avoid using the spark-submit script, and couldn't
>>>>> find the configuration parameter to specify the deploy mode.
>>>>>
>>>>> Is there any way to specify the deploy mode through a configuration
>>>>> parameter?
>>>>>
>>>>> Thanks.
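P.S. For anyone who needs to stay on 1.5.2 until the fix lands: one possible workaround is the public SparkLauncher API (available since Spark 1.4), which lets you request cluster deploy mode programmatically so each driver runs on the remote cluster where /home/user/data/ exists. An untested sketch; the SPARK_HOME path, jar path, and main class below are placeholders:

    import org.apache.spark.launcher.SparkLauncher

    // Sketch for 1.5.2 using SparkLauncher instead of calling the
    // spark-submit script yourself. Under the hood it still spawns a
    // spark-submit process, but deploy mode is set from code.
    List("host1", "host2", "host3").foreach { host =>
      val process = new SparkLauncher()
        .setSparkHome("/opt/spark")                  // placeholder SPARK_HOME
        .setAppResource("/path/to/app-assembly.jar") // placeholder app jar
        .setMainClass("com.example.App")             // placeholder driver class
        .setMaster(s"spark://$host:7077")
        .setDeployMode("cluster") // driver runs inside the remote cluster
        .launch()
      process.waitFor()
    }

With the driver on host1..3, textFileStream monitors the directory on those machines rather than on host0, which should avoid the "directory does not exist" error.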