Hi Annabel,

I am using Spark in standalone mode (deployed using the ec2 scripts
packaged with Spark).
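
For reference, the launch looked roughly like this (key pair, identity
file, and cluster name are placeholders; -s 3 matches the three workers
from my original question):

    ec2/spark-ec2 -k <keypair> -i <key-file> -s 3 launch demo-cluster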

Cheers,
Michael

On 08.01.2016 00:43, Annabel Melongo wrote:
> Michael,
>
> I don't know what your environment is, but if it's Cloudera, you should
> be able to see the link to your master in Hue.
>
> Thanks
>
>
> On Thursday, January 7, 2016 5:03 PM, Michael Pisula
> <michael.pis...@tngtech.com> wrote:
>
>
> I had tried several parameters, including --total-executor-cores, with
> no effect.
> As for the port, I tried 7077, but if I remember correctly I got an
> error suggesting to try 6066 instead, and that worked fine (apart from
> the issue described here). As far as I can tell, 6066 is the standalone
> master's REST submission port, which cluster-mode submissions use,
> while 7077 is the legacy submission port.
>
> Each worker has two cores. I also tried increasing the number of cores,
> again with no effect. I was able to increase the number of cores the
> job used on one worker, but it would not use any other worker (and it
> would not start if the number of cores the job requested was higher
> than the number available on one worker).
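>
> For reference, the property form of those settings would look like this
> (the numbers here are just example values):
>     --conf spark.cores.max=4 --conf spark.executor.cores=2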
>
> On 07.01.2016 22:51, Igor Berman wrote:
>> Read about *--total-executor-cores*.
>> Not sure why you specify port 6066 in the master URL... usually it's 7077.
>> Verify in the master UI (usually port 8080) how many cores are there
>> (depends on other configs, but usually workers connect to the master
>> with all their cores).
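>>
>> e.g. something like this, reusing your command from below (host and
>> core count are placeholders):
>>     spark/bin/spark-submit --class demo.spark.StaticDataAnalysis \
>>       --master spark://<host>:7077 \
>>       --total-executor-cores 6 \
>>       demo/Demo-1.0-SNAPSHOT-all.jar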
>>
>> On 7 January 2016 at 23:46, Michael Pisula
>> <michael.pis...@tngtech.com <mailto:michael.pis...@tngtech.com>> wrote:
>>
>>     Hi,
>>
>>     I start the cluster using the spark-ec2 scripts, so the cluster
>>     is in standalone mode.
>>     Here is how I submit my job:
>>     spark/bin/spark-submit --class demo.spark.StaticDataAnalysis
>>     --master spark://<host>:6066 --deploy-mode cluster
>>     demo/Demo-1.0-SNAPSHOT-all.jar
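>>
>>     For client mode I used the same command minus --deploy-mode
>>     cluster, against the standard master port (reconstructed from
>>     memory, so the exact form may differ):
>>     spark/bin/spark-submit --class demo.spark.StaticDataAnalysis \
>>         --master spark://<host>:7077 demo/Demo-1.0-SNAPSHOT-all.jar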
>>
>>     Cheers,
>>     Michael
>>
>>
>>     On 07.01.2016 22:41, Igor Berman wrote:
>>>     Share how you submit your job, and on what kind of cluster
>>>     (YARN, standalone).
>>>
>>>     On 7 January 2016 at 23:24, Michael Pisula
>>>     <michael.pis...@tngtech.com <mailto:michael.pis...@tngtech.com>>
>>>     wrote:
>>>
>>>         Hi there,
>>>
>>>         I ran a simple batch application on a Spark cluster on EC2.
>>>         Despite having 3 worker nodes, I could not get the
>>>         application to run on more than one node, regardless of
>>>         whether I submitted the application in cluster or client
>>>         mode.
>>>         I also tried manually increasing the number of partitions in
>>>         the code, with no effect. I also pass the master into the
>>>         application.
>>>         I verified on the nodes themselves that only one node was
>>>         active while the job was running.
>>>         I pass enough data to make the job take 6 minutes to
>>>         process.
>>>         The job is simple enough: it reads data from two S3 files,
>>>         joins records on a shared field, filters out some records,
>>>         and writes the result back to S3.
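>>>
>>>         In outline, the job looks like this (paths, the key field,
>>>         and the filter predicate are placeholders; 24 is one of the
>>>         partition counts I tried):
>>>
>>>         package demo.spark
>>>
>>>         import org.apache.spark.{SparkConf, SparkContext}
>>>
>>>         object StaticDataAnalysis {
>>>           def main(args: Array[String]): Unit = {
>>>             // master is passed into the application, as mentioned above
>>>             val conf = new SparkConf()
>>>               .setAppName("StaticDataAnalysis")
>>>               .setMaster(args(0))
>>>             val sc = new SparkContext(conf)
>>>
>>>             // key both inputs on the shared field (placeholder parsing)
>>>             val left = sc.textFile("s3n://<bucket>/<fileA>")
>>>               .map(line => (line.split(",")(0), line))
>>>             val right = sc.textFile("s3n://<bucket>/<fileB>")
>>>               .map(line => (line.split(",")(0), line))
>>>
>>>             left.join(right, 24)  // explicit number of partitions
>>>               .filter { case (_, (l, r)) => l.nonEmpty && r.nonEmpty }  // placeholder predicate
>>>               .saveAsTextFile("s3n://<bucket>/<result>")
>>>
>>>             sc.stop()
>>>           }
>>>         }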
>>>
>>>         I tried all kinds of things, but could not make it work. I
>>>         did find similar questions, but had already tried the
>>>         solutions that worked in those cases.
>>>         I would be really happy about any pointers.
>>>
>>>         Cheers,
>>>         Michael
>>>
>>>

-- 
Michael Pisula * michael.pis...@tngtech.com * +49-174-3180084
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
Sitz: Unterföhring * Amtsgericht München * HRB 135082
