I confess I'm confused. OMPI allows you to "oversubscribe" a node without any 
modification of job allocations. Just ask it to launch however many processes 
you want - it will ignore the allocated number of slots and do it.

When it detects oversubscription, it sets sched_yield appropriately - i.e., each 
individual process will run a little slower than it otherwise would, but will 
play nicer when it comes to "sharing" the available CPUs. Likewise, it won't 
bind your processes to specific cores, which is what you want in this scenario.
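
If you ever want to force that yield behavior explicitly (for testing, say), the 
knob in the 1.3 series should be the mpi_yield_when_idle MCA parameter - just a 
sketch, so please confirm it is present on your build with "ompi_info --param mpi all":

  # ask idle processes to yield the CPU instead of spinning
  mpirun --mca mpi_yield_when_idle 1 -np N ./test_vel.out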

So all you have to do is change your mpirun line to use -np N, where N is the 
actual number of processes you want. Or, if you prefer, you can use

 -npernode M

to tell mpirun to launch M processes on each node. Either will work.
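
Applied to the script you posted, that's a one-line change - a sketch, assuming 
(purely for illustration) that you want 16 processes on the 8 slots SGE 
allocated; the -pe orte 8 line can stay as it is:

  # launch 16 processes even though SGE only allocated 8 slots
  /opt/mpi/openmpi/1.3.3/gnu/bin/mpirun -np 16 ./test_vel.out

  # or cap the per-node count instead (2 here is just a placeholder)
  /opt/mpi/openmpi/1.3.3/gnu/bin/mpirun -npernode 2 ./test_vel.out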


On Dec 26, 2011, at 10:32 AM, Santosh Ansumali wrote:

> Thanks for the response. Maybe I am wrong, but my argument is as
> follows: our tests show that a 100^3 grid per core performs 10 times
> faster (normalised in the proper units) than 200^3. Neither of these sizes
> fits in cache. The 100^3 run benefits from its smaller size,
> where the compiler guesses the access pattern slightly better.
> So, instead of running one large job of 200^3 per core, if I
> oversubscribe the core with smaller jobs of size comparable to 100^3,
> the large saving from better memory access should compensate for thread
> competition.
> Best,
> Santosh
> On Mon, Dec 26, 2011 at 10:31 PM, Matthieu Brucher
> <matthieu.bruc...@gmail.com> wrote:
>> Hi,
>> 
>> If your problem is memory bound and you don't use the whole memory
>> capacity of one node, it means that you are limited by your memory
>> bandwidth. In this case, oversubscribing the cores with more processes will lead to
>> worse behavior, as all the processes will fight for the same memory bandwidth.
>> 
>> Just my opinion.
>> 
>> Matthieu Brucher
>> 
>> 2011/12/23 Santosh Ansumali <ansum...@gmail.com>
>>> 
>>>  Dear All,
>>>        We are running a PDE solver which is memory bound. Due to
>>> cache-related issues, a smaller number of grid points per core leads to
>>> better performance for this code.  Thus, though the available memory per
>>> core is more than 2 GB, we get good performance by using
>>> less than 1 GB per core.
>>> 
>>>  I want to know whether oversubscribing the cores can potentially
>>> improve the performance of such a code.  My thinking is that if I
>>> oversubscribe the cores, each thread will be using less than 1 GB, so
>>> cache-related problems will be less severe.  Is this logic correct, or
>>> will cache conflicts make performance deteriorate further?
>>>      In case over-subscription can help, how should I modify the
>>> submission file (using Sun Grid Engine) to enable over-subscription of
>>> the cores?
>>> My current submission file is written as follows:
>>> #!/bin/bash
>>> #$ -N first
>>> #$ -S /bin/bash
>>> #$ -cwd
>>> #$ -e $JOB_ID.$JOB_NAME.ERROR
>>> #$ -o $JOB_ID.$JOB_NAME.OUTPUT
>>> #$ -P faculty_prj
>>> #$ -p 0
>>> #$ -pe orte 8
>>> /opt/mpi/openmpi/1.3.3/gnu/bin/mpirun -np $NSLOTS ./test_vel.out
>>> 
>>> Is it possible to allow over-subscription by modifying the submission file
>>> itself? Or do I need to change the hostfiles somehow?
>>> Thanks for your help!
>>> Best Regards
>>> Santosh Ansumali,
>>> Faculty Fellow,
>>> Engineering Mechanics Unit
>>> Jawaharlal Nehru Centre for Advanced Scientific Research (JNCASR)
>>>  Jakkur, Bangalore-560 064, India
>>> Tel: + 91 80 22082938
>>> 
>> 
>> 
>> 
>> 
>> --
>> Information System Engineer, Ph.D.
>> Blog: http://matt.eifelle.com
>> LinkedIn: http://www.linkedin.com/in/matthieubrucher
>> 
> 
> 
> 
> -- 
> Santosh Ansumali,
> Faculty Fellow,
> Engineering Mechanics Unit
> Jawaharlal Nehru Centre for Advanced Scientific Research (JNCASR)
>  Jakkur, Bangalore-560 064, India
> Tel: + 91 80 22082938
> 