Hi,

>>>>> tyr fd1026 179 cat host_sunpc0_1 
>>>>> sunpc0 slots=4
>>>>> sunpc1 slots=4
>>>>> 
>>>>> 
>>>>> tyr fd1026 180 mpiexec -report-bindings -hostfile host_sunpc0_1 -np 4 \
>>>>> -cpus-per-proc 2 -bind-to-core hostname
>>>> 
>>>> And this will of course not work. In your hostfile, you told us there
>>>> are FOUR slots on each host. Since the default is to map by slot, we
>>>> correctly mapped all four processes to the first node. We then tried
>>>> to bind 2 cores for each process, resulting in 8 cores - which is
>>>> more than you have.

Would it be possible to change this behaviour? Mapping by slot should
remain the default, but the number of processes mapped to a node could
depend on the other options on the command line. If a node has n slots
and the user requests m cpus-per-proc (wouldn't -slots-per-proc be a
more appropriate name?), you can map at most n/m processes (rounded
down) to that node without oversubscribing. If the command line contains
-npersocket and/or -npernode as well, you can map at most
min{n/m, number_of_sockets * npersocket, npernode} processes to a node.
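
To make the proposal concrete, here is a minimal sketch in Python. It is
not Open MPI code: the function names procs_per_node and map_by_slot are
made up for illustration, and the sketch assumes that nodes are filled in
hostfile order and that n/m is rounded down. It uses the hostfile and the
command line quoted above.

```python
def procs_per_node(slots, cpus_per_proc=1, sockets=None,
                   npersocket=None, npernode=None):
    """Upper bound on processes for one node under the proposed rule.

    Hypothetical helper: min{n/m, sockets * npersocket, npernode},
    where limits that were not requested are simply left out.
    """
    limits = [slots // cpus_per_proc]          # n/m, rounded down
    if sockets is not None and npersocket is not None:
        limits.append(sockets * npersocket)    # -npersocket limit
    if npernode is not None:
        limits.append(npernode)                # -npernode limit
    return min(limits)


def map_by_slot(hosts, np, cpus_per_proc=1):
    """Fill nodes in hostfile order, capped by procs_per_node."""
    mapping = {}
    remaining = np
    for name, slots in hosts:
        if remaining == 0:
            break
        count = min(remaining, procs_per_node(slots, cpus_per_proc))
        if count:
            mapping[name] = count
            remaining -= count
    if remaining:
        raise ValueError(f"{remaining} processes would oversubscribe")
    return mapping


# Hostfile host_sunpc0_1: two nodes with four slots each.
hosts = [("sunpc0", 4), ("sunpc1", 4)]

# Equivalent of "-np 4 -cpus-per-proc 2" under the proposed rule.
print(map_by_slot(hosts, np=4, cpus_per_proc=2))
# prints {'sunpc0': 2, 'sunpc1': 2}
```

With this rule the four processes are spread over both nodes, two per
node, instead of all four landing on sunpc0 and asking for eight cores.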

If you have two quad-core processors with two hardware threads per core,
you can set "sockets=2" (in openmpi-1.6.x) and "slots=16" so that you
can map up to 16 processes without oversubscribing. Since a hardware
thread isn't as good as a core, I can restrict the number of processes
per socket with "-npersocket 4", and if I want to run a multithreaded
process which should use all cores of both sockets, I can even restrict
the number of processes with "-npernode 1". If -cpus-per-proc is
requested, it would then work without any additional options, because
with the minimum function you would no longer map the processes by slot
first and only afterwards determine how many slots they need and whether
enough slots are available.
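
The numbers for this node can be checked with a few lines of Python.
This is only arithmetic under the assumptions above (16 slots, 2
sockets, limits combined with the minimum function); the variable names
are made up for illustration.

```python
# Assumed node: 2 sockets x 4 cores x 2 hardware threads = 16 slots.
slots, sockets = 16, 2

by_slots = slots // 1                          # no further options
with_npersocket = min(by_slots, sockets * 4)   # -npersocket 4
with_npernode = min(by_slots, 1)               # -npernode 1
with_cpus_per_proc = slots // 2                # -cpus-per-proc 2 alone

print(by_slots, with_npersocket, with_npernode, with_cpus_per_proc)
# prints 16 8 1 8
```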


Kind regards

Siegmar
