Try adding --cpu-set a,b,c,... where a,b,c,... are the core IDs of your second socket. I'm working on a cleaner option, as this has come up before.
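For example, assuming lstopo on these nodes numbers the second socket's cores 8-15 (the actual IDs on your hardware may differ, so check lstopo's output first), something along these lines should keep all eight ranks on Socket #1:

  $ mpirun -np 8 --cpu-set 8,9,10,11,12,13,14,15 --bind-to core -report-bindings ./helloWorld.exe

If hwloc is installed, hwloc-calc can also print the list for you (e.g., hwloc-calc --intersect pu socket:1; just mind that hwloc reports logical indexes by default). With the right set, -report-bindings should show the B markers in the second bracket group.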
> On Dec 21, 2015, at 5:29 AM, Matt Thompson <fort...@gmail.com> wrote:
>
> Dear Open MPI Gurus,
>
> I'm currently trying to do something with Open MPI 1.8.8 that I'm pretty sure is possible, but I'm just not smart enough to figure out. Namely, I'm seeing some odd GPU timings, and I think it's because I was dumb and assumed the GPU was on the PCI bus next to Socket #0, since some older GPU nodes I ran on were like that.
>
> But a trip through lspci and lstopo has shown me that the GPU is actually on Socket #1. These are dual-socket Sandy Bridge nodes, and I'd like to run some tests with 8 processes per node where those processes all land on Socket #1.
>
> So what I'm trying to figure out is how to have Open MPI bind processes like that. My first thought, as always, is to run a helloworld job with -report-bindings on. I can manage to do this:
>
> (1061) $ mpirun -np 8 -report-bindings -map-by core ./helloWorld.exe
> [borg01z205:16306] MCW rank 4 bound to socket 0[core 4[hwt 0]]: [././././B/././.][./././././././.]
> [borg01z205:16306] MCW rank 5 bound to socket 0[core 5[hwt 0]]: [./././././B/./.][./././././././.]
> [borg01z205:16306] MCW rank 6 bound to socket 0[core 6[hwt 0]]: [././././././B/.][./././././././.]
> [borg01z205:16306] MCW rank 7 bound to socket 0[core 7[hwt 0]]: [./././././././B][./././././././.]
> [borg01z205:16306] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././.][./././././././.]
> [borg01z205:16306] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././././.][./././././././.]
> [borg01z205:16306] MCW rank 2 bound to socket 0[core 2[hwt 0]]: [././B/././././.][./././././././.]
> [borg01z205:16306] MCW rank 3 bound to socket 0[core 3[hwt 0]]: [./././B/./././.][./././././././.]
> Process 7 of 8 is on borg01z205
> Process 5 of 8 is on borg01z205
> Process 2 of 8 is on borg01z205
> Process 3 of 8 is on borg01z205
> Process 4 of 8 is on borg01z205
> Process 6 of 8 is on borg01z205
> Process 0 of 8 is on borg01z205
> Process 1 of 8 is on borg01z205
>
> Great... but wrong socket! Is there a way to tell it to use Socket 1 instead?
>
> Note I'll be running under SLURM, so I will only have 8 processes per node and it shouldn't need to use Socket 0.
> --
> Matt Thompson
> Man Among Men
> Fulcrum of History