Dear Open MPI Gurus,

I'm currently trying to do something with Open MPI 1.8.8 that I'm pretty sure is possible, but I'm just not smart enough to figure out how. Namely, I'm seeing some odd GPU timings, and I think it's because I was dumb and assumed the GPU was on the PCI bus next to Socket #0, as it was on some older GPU nodes I ran on.
But a trip through lspci and lstopo has shown me that the GPU is actually on Socket #1. These are dual-socket Sandy Bridge nodes, and I'd like to do some tests where I run 8 processes per node and those processes all land on Socket #1.

So, what I'm trying to figure out is how to have Open MPI bind processes like that. My first thought, as always, is to run a helloworld job with -report-bindings on. I can manage to do this:

(1061) $ mpirun -np 8 -report-bindings -map-by core ./helloWorld.exe
[borg01z205:16306] MCW rank 4 bound to socket 0[core 4[hwt 0]]: [././././B/././.][./././././././.]
[borg01z205:16306] MCW rank 5 bound to socket 0[core 5[hwt 0]]: [./././././B/./.][./././././././.]
[borg01z205:16306] MCW rank 6 bound to socket 0[core 6[hwt 0]]: [././././././B/.][./././././././.]
[borg01z205:16306] MCW rank 7 bound to socket 0[core 7[hwt 0]]: [./././././././B][./././././././.]
[borg01z205:16306] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././.][./././././././.]
[borg01z205:16306] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././././.][./././././././.]
[borg01z205:16306] MCW rank 2 bound to socket 0[core 2[hwt 0]]: [././B/././././.][./././././././.]
[borg01z205:16306] MCW rank 3 bound to socket 0[core 3[hwt 0]]: [./././B/./././.][./././././././.]
Process 7 of 8 is on borg01z205
Process 5 of 8 is on borg01z205
Process 2 of 8 is on borg01z205
Process 3 of 8 is on borg01z205
Process 4 of 8 is on borg01z205
Process 6 of 8 is on borg01z205
Process 0 of 8 is on borg01z205
Process 1 of 8 is on borg01z205

Great...but wrong socket! Is there a way to tell it to use Socket 1 instead?

Note I'll be running under SLURM, so I will only have 8 processes per node, so it shouldn't need to use Socket 0.

--
Matt Thompson
Man Among Men
Fulcrum of History
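For checking binding reports like the one above mechanically rather than by eye, here is a minimal Python sketch. It assumes the -report-bindings line format shown in the output above: one bracketed group per socket in the trailing map, with 'B' marking a bound core. The function name sockets_by_rank is a hypothetical choice for illustration, not part of Open MPI.

```python
import re

# Matches lines of the form seen in the output above, e.g.
# [host:pid] MCW rank 4 bound to socket 0[core 4[hwt 0]]: [././././B/././.][./././././././.]
# Assumption: the trailing map has one [...] group per socket.
BINDING_RE = re.compile(r"MCW rank (\d+) bound to .*?: ((?:\[[^\]]*\])+)\s*$")


def sockets_by_rank(report_text):
    """Map each MPI rank to the list of socket indices holding a bound core."""
    result = {}
    for line in report_text.splitlines():
        match = BINDING_RE.search(line)
        if not match:
            continue  # skip program output and anything else that is not a binding line
        rank = int(match.group(1))
        # Split the map into per-socket groups; 'B' marks a bound position.
        groups = re.findall(r"\[([^\]]*)\]", match.group(2))
        result[rank] = [i for i, group in enumerate(groups) if "B" in group]
    return result


# One line copied from the report above.
sample = (
    "[borg01z205:16306] MCW rank 4 bound to socket 0[core 4[hwt 0]]: "
    "[././././B/././.][./././././././.]"
)
print(sockets_by_rank(sample))  # prints {4: [0]}
```

A run that really landed on Socket #1 would show every rank mapped to [1], so a one-line check such as all(s == [1] for s in sockets_by_rank(text).values()) could be used to confirm a candidate set of mpirun options.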