Hello Markus,

The openib BTL component is not thread-safe. It disables itself when the thread support level is MPI_THREAD_MULTIPLE. See this rant from one of my colleagues:

http://www.open-mpi.org/community/lists/devel/2012/10/11584.php

A message is shown, but only if the library was compiled with developer-level debugging.

Open MPI guys, could the debug-level message in btl_openib_component.c:btl_openib_component_init() be replaced by a help text, e.g. shown the same way as the help text about the amount of registerable memory not being enough? It looks like the case of openib being disabled for no apparent reason when MPI_THREAD_MULTIPLE is in effect is not isolated to our users only. Or could you at least put an explicit statement somewhere in the FAQ that openib is not only not thread-safe, but that it also disables itself in a multithreaded environment?

Kind regards,
Hristo
--
Hristo Iliev, Ph.D. -- High Performance Computing
RWTH Aachen University, Center for Computing and Communication
Rechen- und Kommunikationszentrum der RWTH Aachen
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241 80 24367 -- Fax/UMS: +49 241 80 624367

> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org]
> On Behalf Of Markus Wittmann
> Sent: Wednesday, November 07, 2012 1:14 PM
> To: us...@open-mpi.org
> Subject: [OMPI users] Problems with btl openib and MPI_THREAD_MULTIPLE
>
> Hello,
>
> I've compiled Open MPI 1.6.3 with --enable-mpi-thread-multiple -with-tm
> -with-openib --enable-opal-multi-threads.
>
> When I use, for example, the pingpong benchmark from the Intel MPI
> Benchmarks, which calls MPI_Init, the btl openib is used and everything
> works fine.
>
> When the benchmark instead calls MPI_Init_thread with
> MPI_THREAD_MULTIPLE as the requested threading level, the btl openib fails
> to load but gives no further hints as to the reason:
>
> mpirun -v -n 2 -npernode 1 -gmca btl_base_verbose 200 ./imb-tm-openmpi-ts pingpong
>
> ...
> [l0519:08267] select: initializing btl component openib
> [l0519:08267] select: init of component openib returned failure
> [l0519:08267] select: module openib unloaded
> ...
>
> The question now is whether just the support for MPI_THREAD_MULTIPLE is
> currently missing in the openib module, or whether other errors are
> occurring and, if so, how to identify them.
>
> Attached are the config.log from the Open MPI build, the ompi_info output,
> and the output of the IMB pingpong benchmark.
>
> The system used consisted of two nodes with:
>
> - OpenFabrics 1.5.3
> - CentOS release 5.8 (Final)
> - Linux Kernel 2.6.18-308.11.1.el5 x86_64
> - OpenSM 3.3.3
>
> [l0519] src > ibv_devinfo
> hca_id: mlx4_0
>     transport:          InfiniBand (0)
>     fw_ver:             2.7.000
>     node_guid:          0030:48ff:fff6:31e4
>     sys_image_guid:     0030:48ff:fff6:31e7
>     vendor_id:          0x02c9
>     vendor_part_id:     26428
>     hw_ver:             0xB0
>     board_id:           SM_2122000001000
>     phys_port_cnt:      1
>         port: 1
>             state:      PORT_ACTIVE (4)
>             max_mtu:    2048 (4)
>             active_mtu: 2048 (4)
>             sm_lid:     48
>             port_lid:   278
>             port_lmc:   0x00
>
> Thanks for the help in advance.
>
> Regards,
> Markus
>
>
> --
> Markus Wittmann, HPC Services
> Friedrich-Alexander-Universität Erlangen-Nürnberg
> Regionales Rechenzentrum Erlangen (RRZE)
> Martensstrasse 1, 91058 Erlangen, Germany
> Tel.: +49 9131 85-20104
> markus.wittm...@fau.de
> http://www.rrze.fau.de/hpc/
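
Since the failure only shows up as a few "select:" lines in the btl_base_verbose output, it may help to scan a captured log for components whose init failed. The following is a minimal sketch in Python, not part of Open MPI; it assumes the log lines look exactly like the three quoted above, and the function name failed_components is made up for illustration.

```python
import re

# Matches the "[host:pid] select: ..." lines quoted in this thread.
# Assumption: other verbose messages may be worded differently and are ignored.
SELECT_RE = re.compile(
    r"^\[(?P<host>[^:\]]+):(?P<pid>\d+)\]\s+select:\s+(?P<msg>.*)$"
)
FAILED_RE = re.compile(r"init of component (?P<name>\S+) returned failure")


def failed_components(log_text):
    """Return {component name: sorted list of hosts} for failed inits."""
    failures = {}
    for line in log_text.splitlines():
        m = SELECT_RE.match(line.strip())
        if not m:
            continue  # not a "select:" line
        f = FAILED_RE.search(m.group("msg"))
        if f:
            failures.setdefault(f.group("name"), set()).add(m.group("host"))
    return {name: sorted(hosts) for name, hosts in failures.items()}


if __name__ == "__main__":
    sample = """\
[l0519:08267] select: initializing btl component openib
[l0519:08267] select: init of component openib returned failure
[l0519:08267] select: module openib unloaded
"""
    for name, hosts in failed_components(sample).items():
        print(f"{name}: init failed on {', '.join(hosts)}")
```

This only reports that a component's init failed, not why; as noted in the reply, the reason itself is only printed by a build with developer-level debugging.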