*On 12/06/2015 11:07 AM, Carl Ponder wrote:*
I'm trying to run a multi-node job, but I want to map all of the processes to cores on socket #0 only. I'm having a hard time figuring out how to do this. The obvious combinations

    mpirun -n 8 -npernode 4 -report-bindings ...
    mpirun -n 8 -npernode 4 --map-by core -report-bindings ...
    mpirun -n 8 -npernode 4 -cpu-set S0 -report-bindings ...
    mpirun -n 8 --map-by ppr:4:node,ppr:4:socket -report-bindings ...
    mpirun -n 8 -npernode 4 -bind-to slot=0:0,2,4,6 -report-bindings ...
    mpirun -n 8 -npernode 4 -bind-to slot=0:0,0:2,0:4,0:6 -report-bindings ...
    mpirun -n 8 -npernode 4 --map-by core:PE=2 -bind-to core -report-bindings ...

are all reported as having conflicting resource requirements.
*On 12/06/2015 11:28 AM, Ralph Castain wrote:*
You want "-bind-to socket -slot-list=0,2,4,6". Or, if you want each process bound to a single core on the socket, change “socket” to “core” in the above.
So far I can't get this to work. Using the above form

    mpirun -n 8 -bind-to socket --slot-list 0,2,4,6 -report-bindings ...

it says that it's a mis-specification:

    Conflicting directives for binding policy are causing the policy
    to be redefined:
      New policy:   socket
      Prior policy: SOCKET
    Please check that only one policy is defined.

If I treat the socket-binding as redundant and just use this

    mpirun -n 8 --slot-list 0,2,4,6 -report-bindings ...

it looks like it's ignoring slots 0,2,4,6 available on the second node:

    A rank is missing its location specification:
      Rank:      0
      Rank file: (null)
    All processes must have their location specified in the rank file.
    Either add an entry to the file, or provide a default slot_list to
    use for any unspecified ranks.

(One question is whether it is interacting with Torque correctly.) Trying to force it to split the processes across nodes

    mpirun -n 8 -npernode 4 --slot-list 0,2,4,6 -report-bindings ....

gives

    Conflicting directives for mapping policy are causing the policy
    to be redefined:
      New policy:   RANK_FILE
      Prior policy: UNKNOWN
    Please check that only one policy is defined.

Do you know what to do here? I'm using OpenMPI 1.10.1.

Thanks,
Carl
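
Since the second error message refers to a rank file, one alternative that might be worth trying is an explicit rankfile that pins every rank to a core on socket 0. The sketch below is untested in this thread and rests on several assumptions: the helper name make_rankfile is hypothetical, node01 and node02 are placeholder host names (under Torque the real names would come from the job's node list), each socket 0 is assumed to have at least four cores, and the socket:core indices are assumed to follow Open MPI's rankfile numbering rather than the 0,2,4,6 numbering used with --slot-list above.

```shell
#!/bin/sh
# Hypothetical helper: print an Open MPI rankfile that places
# <cores_per_node> consecutive ranks on socket 0 of each listed host.
# Usage: make_rankfile <cores_per_node> <host> [<host> ...]
make_rankfile() {
    cores_per_node=$1
    shift
    rank=0
    for host in "$@"; do
        core=0
        while [ "$core" -lt "$cores_per_node" ]; do
            # Rankfile line format: rank <N>=<host> slot=<socket>:<core>
            echo "rank ${rank}=${host} slot=0:${core}"
            rank=$((rank + 1))
            core=$((core + 1))
        done
    done
}

# node01 and node02 are placeholders, not hosts named in this thread.
make_rankfile 4 node01 node02 > rankfile.txt
```

The resulting file could then be passed with something like "mpirun -n 8 -rf rankfile.txt -report-bindings ...", leaving out -npernode and --slot-list so that only one mapping policy is specified. Whether this avoids the conflicts reported above on 1.10.1 has not been verified here.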