Sure, I’ll ask the machine admins to update and let you know how it goes. In the meantime, I was just wondering whether anyone has run this little program with an up-to-date Open MPI and whether it worked. If so, I’ll know the problem is with our setup.

Thanks,
-Steve
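P.S. For concreteness, here is a minimal sketch of the kind of MPI I/O test I have in mind (not my exact program; the file name, offsets, and data are placeholders):

-----
#include <mpi.h>
#include <stdio.h>

/* Minimal MPI I/O sketch: every rank writes one int to its own offset
 * in a shared file, then the file is closed collectively. */
int main(int argc, char *argv[]) {
  int rank;
  MPI_File fh;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  /* Collective open; "testfile" is a placeholder name. */
  MPI_File_open(MPI_COMM_WORLD, "testfile",
                MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

  /* Each rank writes its rank number at byte offset rank*sizeof(int). */
  MPI_File_write_at(fh, (MPI_Offset)(rank * sizeof(int)), &rank, 1,
                    MPI_INT, MPI_STATUS_IGNORE);

  MPI_File_close(&fh);
  if (rank == 0) printf("done\n");
  MPI_Finalize();
  return 0;
}
-----

Compiled with mpicc and launched with mpiexec -n 4 ./a.out, as in the output quoted below.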
> On Jun 5, 2020, at 7:45 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
>
> You cited Open MPI v2.1.1. That's a pretty ancient version of Open MPI.
>
> Any chance you can upgrade to Open MPI 4.0.x?
>
>
>> On Jun 5, 2020, at 7:24 PM, Stephen Siegel <sie...@udel.edu> wrote:
>>
>>
>>
>>> On Jun 5, 2020, at 6:55 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
>>>
>>> On Jun 5, 2020, at 6:35 PM, Stephen Siegel via users <users@lists.open-mpi.org> wrote:
>>>>
>>>> [ilyich:12946] 3 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
>>>> [ilyich:12946] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>>>
>>> It looks like your output somehow doesn't include the actual error message.
>>
>> You’re right, on this first machine I did not include all of the output. It is:
>>
>> siegel@ilyich:~/372/code/mpi/io$ mpiexec -n 4 ./a.out
>> --------------------------------------------------------------------------
>> [[171,1],0]: A high-performance Open MPI point-to-point messaging module
>> was unable to find any relevant network interfaces:
>>
>> Module: OpenFabrics (openib)
>> Host: ilyich
>>
>> Another transport will be used instead, although this may result in
>> lower performance.
>>
>> NOTE: You can disable this warning by setting the MCA parameter
>> btl_base_warn_component_unused to 0.
>> --------------------------------------------------------------------------
>>
>> So, I’ll ask my people to look into how they configured this.
>>
>> However, on the second machine which uses SLURM it consistently hangs on this example, although many other examples using MPI I/O work fine.
>>
>> -Steve
>>
>>> That error message was sent to stderr, so you may not have captured it if you only did "mpirun ... > foo.txt". The actual error message template is this:
>>>
>>> -----
>>> %s: A high-performance Open MPI point-to-point messaging module
>>> was unable to find any relevant network interfaces:
>>>
>>> Module: %s
>>> Host: %s
>>>
>>> Another transport will be used instead, although this may result in
>>> lower performance.
>>>
>>> NOTE: You can disable this warning by setting the MCA parameter
>>> btl_base_warn_component_unused to 0.
>>> -----
>>>
>>> This is not actually an error -- just a warning. It typically means that your Open MPI has support for HPC-class networking, Open MPI saw some evidence of HPC-class networking on the nodes on which your job ran, but ultimately didn't use any of those HPC-class networking interfaces for some reason and therefore fell back to TCP.
>>>
>>> I.e., your program ran correctly, but it may have run slower than it could have if it were able to use HPC-class networks.
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
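P.P.S. If I read the NOTE in the warning correctly, it can also be silenced for a single run by setting that MCA parameter on the mpiexec command line, e.g.:

  mpiexec --mca btl_base_warn_component_unused 0 -n 4 ./a.out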