Hello, Markus,

The openib BTL component is not thread-safe. It disables itself when the
thread support level is MPI_THREAD_MULTIPLE. See this rant from one of my
colleagues:

http://www.open-mpi.org/community/lists/devel/2012/10/11584.php

A message is shown but only if the library was compiled with developer-level
debugging.

Open MPI guys, could the debug-level message in
btl_openib_component.c:btl_openib_component_init() be replaced by a help
text, e.g. the same way that the help text about the amount of registerable
memory not being enough is shown? It looks like the case of openib being
disabled for no apparent reason when MPI_THREAD_MULTIPLE is in effect is not
isolated to our users. Or could you at least put an explicit statement
somewhere in the FAQ that openib is not only not thread-safe, but that it
also disables itself in a multithreaded environment?

Kind regards,
Hristo
--
Hristo Iliev, Ph.D. -- High Performance Computing
RWTH Aachen University, Center for Computing and Communication
Rechen- und Kommunikationszentrum der RWTH Aachen
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241 80 24367 -- Fax/UMS: +49 241 80 624367

> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org]
> On Behalf Of Markus Wittmann
> Sent: Wednesday, November 07, 2012 1:14 PM
> To: us...@open-mpi.org
> Subject: [OMPI users] Problems with btl openib and MPI_THREAD_MULTIPLE
> 
> Hello,
> 
> I've compiled Open MPI 1.6.3 with --enable-mpi-thread-multiple --with-tm
> --with-openib --enable-opal-multi-threads.
> 
> When I use, for example, the pingpong benchmark from the Intel MPI
> Benchmarks, which calls MPI_Init, the btl openib is used and everything
> works fine.
> 
> When the benchmark instead calls MPI_Init_thread with
> MPI_THREAD_MULTIPLE as the requested threading level, the btl openib fails
> to load but gives no hint as to the reason:
> 
> mpirun -v -n 2 -npernode 1 -gmca btl_base_verbose 200 ./imb-tm-openmpi-ts pingpong
> 
> ...
> [l0519:08267] select: initializing btl component openib
> [l0519:08267] select: init of component openib returned failure
> [l0519:08267] select: module openib unloaded
> ...
> 
> The question now is: is it just the support for MPI_THREAD_MULTIPLE that
> is currently missing in the openib module, or are there other errors
> occurring, and if so, how can they be identified?
> 
> Attached are the config.log from the Open MPI build, the ompi_info output
> and the output of the IMB pingpong benchmark.
> 
> The system used consisted of two nodes with:
> 
>   - OpenFabrics 1.5.3
>   - CentOS release 5.8 (Final)
>   - Linux Kernel 2.6.18-308.11.1.el5 x86_64
>   - OpenSM 3.3.3
> 
> [l0519] src > ibv_devinfo
> hca_id: mlx4_0
>         transport:                      InfiniBand (0)
>         fw_ver:                         2.7.000
>         node_guid:                      0030:48ff:fff6:31e4
>         sys_image_guid:                 0030:48ff:fff6:31e7
>         vendor_id:                      0x02c9
>         vendor_part_id:                 26428
>         hw_ver:                         0xB0
>         board_id:                       SM_2122000001000
>         phys_port_cnt:                  1
>                 port:   1
>                         state:                  PORT_ACTIVE (4)
>                         max_mtu:                2048 (4)
>                         active_mtu:             2048 (4)
>                         sm_lid:                 48
>                         port_lid:               278
>                         port_lmc:               0x00
> 
> Thanks for the help in advance.
> 
> Regards,
> Markus
> 
> 
> --
> Markus Wittmann, HPC Services
> Friedrich-Alexander-Universität Erlangen-Nürnberg Regionales
> Rechenzentrum Erlangen (RRZE) Martensstrasse 1, 91058 Erlangen, Germany
> Tel.: +49 9131 85-20104
> markus.wittm...@fau.de
> http://www.rrze.fau.de/hpc/
