Could the slot list need to be 0,1,2,3 instead? It depends on how the cores are numbered on your nodes. You can get more information about _what it does_ by using --report-bindings (when/if it succeeds).
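As a rough sketch of how one might check which CPU ids sit on socket 0: the helper below (socket_cpus is a made-up name, not part of Open MPI) filters the output of `lscpu -p=CPU,SOCKET` from util-linux. It assumes a Linux node with lscpu installed, and the ids it prints are the kernel's logical CPU numbers, which may or may not be the numbering --slot-list expects, so treat the result as a hint and confirm with --report-bindings.

```shell
# Hypothetical helper: print the CPU ids that belong to one socket as a
# comma-separated list. Expects "CPU,SOCKET" lines on stdin, which is the
# format produced by `lscpu -p=CPU,SOCKET`.
socket_cpus() {
    awk -F, -v s="$1" '
        /^#/    { next }                                   # skip lscpu comment header
        $2 == s { out = out (out == "" ? "" : ",") $1 }    # collect matching CPU ids
        END     { print out }
    '
}

# Usage on a compute node (assumes util-linux lscpu is available):
#   lscpu -p=CPU,SOCKET | socket_cpus 0
```

If this prints 0,1,2,3 rather than 0,2,4,6 on your nodes, that would at least be consistent with the guess above.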
2015-12-07 16:18 GMT+01:00 Carl Ponder <cpon...@nvidia.com>:

> *On 12/06/2015 11:07 AM, Carl Ponder wrote:*
>
> I'm trying to run a multi-node job but I want to map all of the processes
> to cores on socket #0 only.
> I'm having a hard time figuring out how to do this, the obvious
> combinations
>
>     mpirun -n 8 -npernode 4 -report-bindings ...
>     mpirun -n 8 -npernode 4 --map-by core -report-bindings ...
>     mpirun -n 8 -npernode 4 -cpu-set S0 -report-bindings ...
>     mpirun -n 8 --map-by ppr:4:node,ppr:4:socket -report-bindings ...
>     mpirun -n 8 -npernode 4 -bind-to slot=0:0,2,4,6 -report-bindings ...
>     mpirun -n 8 -npernode 4 -bind-to slot=0:0,0:2,0:4,0:6 -report-bindings ...
>     mpirun -n 8 -npernode 4 --map-by core:PE=2 -bind-to core -report-bindings ...
>
> all are reported as having conflicting resource requirements.
>
> *On 12/06/2015 11:28 AM, Ralph Castain wrote:*
>
> You want "-bind-to socket -slot-list=0,2,4,6"
> Or if you want each process bound to a single core on the socket, then
> change “socket” to “core” in the above
>
> So far I can't get this to work. Using the above form
>
>     mpirun -n 8 -bind-to socket --slot-list 0,2,4,6 -report-bindings ...
>
> it says that it's a mis-specification:
>
>     Conflicting directives for binding policy are causing the policy
>     to be redefined:
>
>       New policy: socket
>       Prior policy: SOCKET
>
>     Please check that only one policy is defined.
>
> If I treat the socket-binding as redundant and just use this
>
>     mpirun -n 8 --slot-list 0,2,4,6 -report-bindings ...
>
> it looks like it's ignoring slots 0,2,4,6 available on the second node:
>
>     A rank is missing its location specification:
>
>       Rank: 0
>       Rank file: (null)
>
>     All processes must have their location specified in the rank file. Either
>     add an entry to the file, or provide a default slot_list to use for
>     any unspecified ranks.
>
> (One question is whether it is interacting with Torque correctly).
> Trying to force it to split the processes across nodes
>
>     mpirun -n 8 -npernode 4 --slot-list 0,2,4,6 -report-bindings ....
>
> gives
>
>     Conflicting directives for mapping policy are causing the policy
>     to be redefined:
>
>       New policy: RANK_FILE
>       Prior policy: UNKNOWN
>
>     Please check that only one policy is defined.
>
> Do you know what to do here? I'm using OpenMPI 1.10.1.
> Thanks,
>
> Carl
>
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/12/28139.php

--
Kind regards Nick