Try adding --cpu-set a,b,c,... where a,b,c,... are the core IDs of your second 
socket. I'm working on a cleaner option, as this has come up before.
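
For example, on these dual-socket nodes, if socket 1's cores are numbered 8-15 
(an assumption on my part; check lstopo first, since some BIOSes interleave core 
IDs across sockets), it would look something like:

  mpirun -np 8 --cpu-set 8,9,10,11,12,13,14,15 --bind-to core -report-bindings ./helloWorld.exe

With -report-bindings on, the B's should then land in the second bracket group.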


> On Dec 21, 2015, at 5:29 AM, Matt Thompson <fort...@gmail.com> wrote:
> 
> Dear Open MPI Gurus,
> 
> I'm currently trying to do something with Open MPI 1.8.8 that I'm pretty sure 
> is possible, but I'm just not smart enough to figure out. Namely, I'm seeing 
> some odd GPU timings, and I think it's because I was dumb and assumed the GPU 
> was on the PCI bus next to Socket #0, as some older GPU nodes I ran on were 
> laid out that way. 
> 
> But, a trip through lspci and lstopo has shown me that the GPU is actually on 
> Socket #1. These are dual-socket Sandy Bridge nodes, and I'd like to do some 
> tests where I run 8 processes per node and those processes all land on 
> Socket #1.
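
(A quick way to double-check that locality: on a hwloc 1.x install, running

  lstopo --whole-io

should show the PCI devices nested under the socket/NUMANode they attach to; an 
NVIDIA GPU shows up with vendor ID 10de. The --whole-io flag is the hwloc 1.x 
spelling; newer hwloc releases changed the IO-filtering options.)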
> 
> So, what I'm trying to figure out is how to have Open MPI bind processes like 
> that. My first thought, as always, is to run a helloworld job with 
> -report-bindings on. I can manage to do this:
> 
> (1061) $ mpirun -np 8 -report-bindings -map-by core ./helloWorld.exe
> [borg01z205:16306] MCW rank 4 bound to socket 0[core 4[hwt 0]]: [././././B/././.][./././././././.]
> [borg01z205:16306] MCW rank 5 bound to socket 0[core 5[hwt 0]]: [./././././B/./.][./././././././.]
> [borg01z205:16306] MCW rank 6 bound to socket 0[core 6[hwt 0]]: [././././././B/.][./././././././.]
> [borg01z205:16306] MCW rank 7 bound to socket 0[core 7[hwt 0]]: [./././././././B][./././././././.]
> [borg01z205:16306] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././.][./././././././.]
> [borg01z205:16306] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././././.][./././././././.]
> [borg01z205:16306] MCW rank 2 bound to socket 0[core 2[hwt 0]]: [././B/././././.][./././././././.]
> [borg01z205:16306] MCW rank 3 bound to socket 0[core 3[hwt 0]]: [./././B/./././.][./././././././.]
> Process    7 of    8 is on borg01z205
> Process    5 of    8 is on borg01z205
> Process    2 of    8 is on borg01z205
> Process    3 of    8 is on borg01z205
> Process    4 of    8 is on borg01z205
> Process    6 of    8 is on borg01z205
> Process    0 of    8 is on borg01z205
> Process    1 of    8 is on borg01z205
> 
> Great...but wrong socket! Is there a way to tell it to use Socket 1 instead? 
> 
> Note that I'll be running under SLURM, so I will only have 8 processes per 
> node and it shouldn't need to use Socket 0 at all.
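
(Since this is under SLURM: if your Open MPI was built with PMI support, another 
option is to let srun do the pinning directly, e.g.

  srun -n 8 --cpu_bind=map_cpu:8,9,10,11,12,13,14,15 ./helloWorld.exe

where map_cpu binds tasks to the listed CPUs in rank order. The 8-15 numbering 
is the same assumption as above; verify against lstopo.)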
> -- 
> Matt Thompson
> Man Among Men
> Fulcrum of History
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/12/28190.php
