Hi Igor

Thanks. The reason I am using cluster mode is that the streaming app must
run forever. I am using client mode for my pyspark work.
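
For reference, a rough sketch of the two launch styles (simplified; it reuses
$SPARK_ROOT, $MASTER_URL, and $jarPath from the thread below, not the exact
commands):

        # streaming app: cluster mode, so the driver runs on one of the workers
        $SPARK_ROOT/bin/spark-submit --master $MASTER_URL --deploy-mode cluster \
                --class "com.pws.spark.streaming.IngestDriver" $jarPath

        # pyspark shell: client mode (the default), so the driver stays on the
        # submitting machine
        $SPARK_ROOT/bin/pyspark --master $MASTER_URL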

Andy

From:  Igor Berman <igor.ber...@gmail.com>
Date:  Friday, November 20, 2015 at 6:22 AM
To:  Andrew Davidson <a...@santacruzintegration.com>
Cc:  "user @spark" <user@spark.apache.org>
Subject:  Re: newbie: unable to use all my cores and memory

> You've asked for 2 total cores, plus 1 for the driver (since you are running in
> cluster mode, the driver runs on one of the slaves).
> Change total cores to 3*2.
> Change submit mode to client - you'll get full utilization.
> (Btw, it's not advisable to use all the cores of a slave, since there are OS
> processes and other processes...)
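> 
> A minimal sketch of that change, reusing the variables from the message below
> (assumed, not the exact command):
> 
>         # 3 slaves * 2 cores = 6; client mode keeps the driver off the workers
>         $SPARK_ROOT/bin/spark-submit \
>                 --class "com.pws.spark.streaming.IngestDriver" \
>                 --master $MASTER_URL \
>                 --total-executor-cores 6 \
>                 --deploy-mode client \
>                 $jarPath --clusterMode $*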
> 
> On 20 November 2015 at 02:02, Andy Davidson <a...@santacruzintegration.com>
> wrote:
>> I am having a heck of a time figuring out how to utilize my cluster
>> effectively. I am using the standalone cluster manager. I have a master
>> and 3 slaves. Each machine has 2 cores.
>> 
>> I am trying to run a streaming app in cluster mode and pyspark at the same
>> time.
>> 
>> t1) On my console I see
>> 
>>         * Alive Workers: 3
>>         * Cores in use: 6 Total, 0 Used
>>         * Memory in use: 18.8 GB Total, 0.0 B Used
>>         * Applications: 0 Running, 15 Completed
>>         * Drivers: 0 Running, 2 Completed
>>         * Status: ALIVE
>> 
>> t2) I start my streaming app
>> 
>> $SPARK_ROOT/bin/spark-submit \
>>         --class "com.pws.spark.streaming.IngestDriver" \
>>         --master $MASTER_URL \
>>         --total-executor-cores 2 \
>>         --deploy-mode cluster \
>>         $jarPath --clusterMode  $*
>> 
>> t3) on my console I see
>> 
>>         * Alive Workers: 3
>>         * Cores in use: 6 Total, 3 Used
>>         * Memory in use: 18.8 GB Total, 13.0 GB Used
>>         * Applications: 1 Running, 15 Completed
>>         * Drivers: 1 Running, 2 Completed
>>         * Status: ALIVE
>> 
>> Looks like pyspark should be able to use the 3 remaining cores and 5.8 GB
>> of memory.
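>> (Assuming the driver's core and memory are counted in the "3 Used" and
>> "13.0 GB Used" figures above: 6 - 3 = 3 cores free, and 18.8 GB - 13.0 GB =
>> 5.8 GB free.)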
>> 
>> t4) I start pyspark
>> 
>>         export PYSPARK_PYTHON=python3.4
>>         export PYSPARK_DRIVER_PYTHON=python3.4
>>         export IPYTHON_OPTS="notebook --no-browser --port=7000
>> --log-level=WARN"
>> 
>>         $SPARK_ROOT/bin/pyspark --master $MASTER_URL --total-executor-cores 3
>> --executor-memory 2g
>> 
>> t5) on my console I see
>> 
>>         * Alive Workers: 3
>>         * Cores in use: 6 Total, 4 Used
>>         * Memory in use: 18.8 GB Total, 15.0 GB Used
>>         * Applications: 2 Running, 18 Completed
>>         * Drivers: 1 Running, 2 Completed
>>         * Status: ALIVE
>> 
>> 
>> I have 2 unused cores and a lot of memory left over. My pyspark
>> application is only getting 1 core. If the streaming app is not running,
>> pyspark is assigned 2 cores, each on a different worker. I have tried
>> various combinations of --executor-cores and --total-executor-cores.
>> Any idea how to get pyspark to use more cores and memory?
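>> 
>> For example, one combination along these lines (illustrative values only, not
>> the exact commands tried):
>> 
>>         # capping cores per executor so executors can spread across workers
>>         $SPARK_ROOT/bin/pyspark --master $MASTER_URL \
>>                 --total-executor-cores 3 --executor-cores 1 --executor-memory 1g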
>> 
>> 
>> Kind regards
>> 
>> Andy
>> 
>> P.S. Using different values I have wound up with pyspark status ==
>> "waiting". I think this is because there are not enough cores available?
>> 
>> 
>> 
>> 
> 

