Hi,

>>>>> tyr fd1026 179 cat host_sunpc0_1
>>>>> sunpc0 slots=4
>>>>> sunpc1 slots=4
>>>>>
>>>>> tyr fd1026 180 mpiexec -report-bindings -hostfile host_sunpc0_1 -np 4 \
>>>>>   -cpus-per-proc 2 -bind-to-core hostname
>>>>
>>>> And this will of course not work. In your hostfile, you told us there
>>>> are FOUR slots on each host. Since the default is to map by slot, we
>>>> correctly mapped all four processes to the first node. We then tried
>>>> to bind 2 cores for each process, resulting in 8 cores - which is
>>>> more than you have.
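(For reference: one way the quoted command could avoid oversubscribing, assuming the same hostfile and the -npernode option of openmpi-1.6.x, would be to cap the processes per node so that 2 processes x 2 cpus fit into the 4 slots of each node. A sketch, not a tested recommendation:

```shell
# Hypothetical workaround for the failing command quoted above: with
# -npernode 2, each node gets 2 processes x 2 cpus-per-proc = 4 cores,
# which fits the 4 slots per node from the hostfile.
# Requires an Open MPI installation to actually run.
mpiexec -report-bindings -hostfile host_sunpc0_1 -np 4 \
        -npernode 2 -cpus-per-proc 2 -bind-to-core hostname
```
)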
Is it possible to adapt this behaviour? The default should remain mapping by slot, but the number of processes mapped to a node could be adapted depending on other options on the command line. If you know that a node has n slots and a user requests m cpus-per-proc (wouldn't "-slots-per-proc" be a more appropriate name?), you can only map n/m processes on that node without oversubscribing. If the command line contains "-npersocket" and/or "-npernode" as well, you can map min{n/m, number_of_sockets * npersocket, npernode} processes on a node.

If you have two quad-core processors with two hardware threads per core, you can set "sockets=2" (in openmpi-1.6.x) and "slots=16", so that you can map up to 16 processes without oversubscribing. Since a hardware thread isn't as good as a core, I can restrict the number of processes per socket with "-npersocket 4", and if I want to run a multithreaded process that should use all cores of both sockets, I can even restrict the number of processes with "-npernode 1".

With the minimum function, "-cpus-per-proc" would also work without any additional options, because you would no longer map processes by slot first and only afterwards determine how many slots they need and whether enough slots are available.

Kind regards

Siegmar
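P.S.: The proposed minimum function could be sketched like this (a hypothetical helper, just to illustrate the arithmetic; "max_procs" and its arguments are my own names, not anything in Open MPI):

```shell
#!/bin/sh
# Sketch of the proposed per-node process limit:
#   min{ n/m, number_of_sockets * npersocket, npernode }
# where n = slots on the node and m = cpus-per-proc.
max_procs() {
    slots=$1; cpus_per_proc=$2; sockets=$3; npersocket=$4; npernode=$5
    # n/m: how many processes fit into the slots without oversubscribing
    # (integer division, i.e. floor(n/m))
    limit=$((slots / cpus_per_proc))
    # cap by the socket limit, if it is smaller
    socket_cap=$((sockets * npersocket))
    if [ "$socket_cap" -lt "$limit" ]; then limit=$socket_cap; fi
    # cap by the per-node limit, if it is smaller
    if [ "$npernode" -lt "$limit" ]; then limit=$npernode; fi
    echo "$limit"
}

# Example from above: 16 slots (2 sockets x 4 cores x 2 hwthreads),
# cpus-per-proc 1, npersocket 4, npernode 1 -> only 1 process per node.
max_procs 16 1 2 4 1   # prints 1
```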