> On Jun 5, 2020, at 6:55 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> 
> wrote:
> 
> On Jun 5, 2020, at 6:35 PM, Stephen Siegel via users 
> <users@lists.open-mpi.org> wrote:
>> 
>> [ilyich:12946] 3 more processes have sent help message help-mpi-btl-base.txt 
>> / btl:no-nics
>> [ilyich:12946] Set MCA parameter "orte_base_help_aggregate" to 0 to see all 
>> help / error messages
> 
> It looks like your output somehow doesn't include the actual error message.

You’re right; on this first machine I did not include all of the output.  It is:

siegel@ilyich:~/372/code/mpi/io$ mpiexec -n 4 ./a.out
--------------------------------------------------------------------------
[[171,1],0]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:

Module: OpenFabrics (openib)
  Host: ilyich

Another transport will be used instead, although this may result in
lower performance.

NOTE: You can disable this warning by setting the MCA parameter
btl_base_warn_component_unused to 0.
--------------------------------------------------------------------------

So, I’ll ask my people to look into how they configured this.
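
(For what it’s worth, if I’m reading the MCA-parameter syntax correctly, the warning itself can be silenced per run as the NOTE suggests, e.g.

  mpiexec --mca btl_base_warn_component_unused 0 -n 4 ./a.out

or by exporting OMPI_MCA_btl_base_warn_component_unused=0, though I’d rather understand why openib finds no usable interfaces in the first place.)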

However, on the second machine, which uses SLURM, this example consistently 
hangs, although many other examples using MPI I/O work fine.
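
To narrow that down, my plan (assuming I have the MCA names right) is to re-run 
the hanging case with the openib BTL left out, e.g.

  mpiexec --mca btl self,vader,tcp -n 4 ./a.out

and, if it still hangs, attach gdb to one of the stuck ranks on a compute node 
(gdb -p <pid>, then "bt") to see where it is blocked. I’ll report back with 
whatever backtraces I get.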

-Steve




>  That error message was sent to stderr, so you may not have captured it if 
> you only did "mpirun ... > foo.txt".  The actual error message template is 
> this:
> 
> -----
> %s: A high-performance Open MPI point-to-point messaging module
> was unable to find any relevant network interfaces:
> 
> Module: %s
>  Host: %s
> 
> Another transport will be used instead, although this may result in
> lower performance.
> 
> NOTE: You can disable this warning by setting the MCA parameter
> btl_base_warn_component_unused to 0.
> -----
> 
> This is not actually an error -- just a warning.  It typically means that 
> your Open MPI has support for HPC-class networking, Open MPI saw some 
> evidence of HPC-class networking on the nodes on which your job ran, but 
> ultimately didn't use any of those HPC-class networking interfaces for some 
> reason and therefore fell back to TCP.
> 
> I.e., your program ran correctly, but it may have run slower than it could 
> have if it were able to use HPC-class networks.
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> 
