I hope this isn't too basic a question, but is there a document somewhere that describes how the selection of BTL components (e.g., openib, tcp) happens when mpirun/mpiexec is launched? I know the selection can be influenced by conf files, parameters, and environment variables. But absent those, how does it choose which components to use?
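For reference, these are the mechanisms I'm aware of for steering that choice, all setting the same "btl" MCA parameter (the conf-file paths below are the ones I believe are checked by default; adjust the prefix for your install), plus the verbosity knob I've been using to try to watch the selection happen:

  # 1. Conf file, per-user or system-wide:
  #      $HOME/.openmpi/mca-params.conf
  #      <prefix>/etc/openmpi-mca-params.conf
  btl = ^openib

  # 2. Environment variable form of the same parameter:
  $ export OMPI_MCA_btl=^openib

  # 3. Command-line form:
  $ mpirun --mca btl ^openib -np 2 hello_world

  # To list which BTL components a given build can see at all:
  $ ompi_info | grep "MCA btl"

  # To trace component opening/selection at launch:
  $ mpirun --mca btl_base_verbose 100 -np 2 hello_world

None of that tells me what the default selection logic does when nothing is set, though, which is what I'm after.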
I'm trying to diagnose an issue involving OpenMPI, OFED, and an OS upgrade. I'm hoping that a better understanding of how components are selected will help me figure out what changed with the OS upgrade.

Here's a longer explanation. We recently upgraded our HPC cluster from RHEL 6.2 to 6.6. We have several versions of OpenMPI available from a central NFS store. Our cluster has some nodes with IB hardware, and some without. On the old OS image, we did not install any of the OFED components on the non-IB nodes, and OpenMPI was somehow able to figure out that it shouldn't even try the openib btl, without any runtime warnings. We got the speeds we were expecting when running osu_bw tests from the OMB test suite, both on the IB nodes (about 3800 MB/s for 4x QDR IB) and on the non-IB nodes (about 115 MB/s for 1GbE).

Since the OS upgrade, we have started to get warnings like this on non-IB nodes without OFED installed:

> $ mpirun -np 2 hello_world
> [m7stage-1-1:09962] mca: base: component_find: unable to open
> /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_ofud: librdmacm.so.1: cannot
> open shared object file: No such file or directory (ignored)
> [m7stage-1-1:09961] mca: base: component_find: unable to open
> /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_ofud: librdmacm.so.1: cannot
> open shared object file: No such file or directory (ignored)
> [m7stage-1-1:09961] mca: base: component_find: unable to open
> /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_openib: librdmacm.so.1:
> cannot open shared object file: No such file or directory (ignored)
> [m7stage-1-1:09962] mca: base: component_find: unable to open
> /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_openib: librdmacm.so.1:
> cannot open shared object file: No such file or directory (ignored)
> Hello from process # 0 of 2 on host m7stage-1-1
> Hello from process # 1 of 2 on host m7stage-1-1

Obviously these are references to software components associated with OFED. We can install OFED on the non-IB nodes, but then we get warnings more like this:

> $ mpirun -np 2 hello_world
> librdmacm: Fatal: no RDMA devices found
> librdmacm: Fatal: no RDMA devices found
> --------------------------------------------------------------------------
> [[63448,1],0]: A high-performance Open MPI point-to-point messaging module
> was unable to find any relevant network interfaces:
>
> Module: OpenFabrics (openib)
> Host: m7stage-1-1
>
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------
> Hello from process # 0 of 2 on host m7stage-1-1
> Hello from process # 1 of 2 on host m7stage-1-1
> [m7stage-1-1:18753] 1 more process has sent help message
> help-mpi-btl-base.txt / btl:no-nics
> [m7stage-1-1:18753] Set MCA parameter "orte_base_help_aggregate" to 0 to see
> all help / error messages

We can obviously work around this by using "--mca btl ^openib" or similar on the non-IB nodes, and we're pursuing that option. But I'm struggling to understand why OpenMPI on a non-IB node without OFED installed can no longer figure out on its own that it shouldn't use the openib btl. Hence my request for more information about how that decision is made; maybe the answer will point me to what changed.

Thanks,

--
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
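P.S. In case it's useful to anyone comparing the two OS images: the quick check I've been using to see whether a given plugin can even be loaded on a node is to inspect its shared-library dependencies directly. On a non-IB node without OFED installed, I'd expect output roughly like this (abbreviated; the exact library list will vary):

  $ ldd /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_openib.so
          ...
          librdmacm.so.1 => not found
          ...

which would be consistent with the "cannot open shared object file" messages above.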