Pasha,
see the attached file.
I have traced how MPI_IPROBE is called and also managed to significantly
reduce the number of calls to it. Unfortunately, this only shifted the
time into other routines. Basically, the code runs through a number of
timesteps, and after each timestep all slave nodes wait for the master
to give the go-ahead for the next step; this is where a lot of time is
being spent. It is either a load imbalance, just waiting for all
MPI_BSENDs to complete, or something else.
I am kind of stuck right now and will have to do some more
investigation. It is strange that this works so much better with Scali MPI.
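The step-synchronization pattern described above, where slaves send their results and then poll with MPI_IPROBE for the master's go-ahead, might look roughly like the sketch below. This is only an illustration of the pattern, not the actual code; all tag names and the per-step work are invented, and MPI_Send stands in for MPI_Bsend (which would additionally need a user buffer attached via MPI_Buffer_attach):

```c
#include <mpi.h>

#define TAG_RESULT 1
#define TAG_GO     2

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int nsteps = 10;  /* illustrative step count */
    for (int step = 0; step < nsteps; step++) {
        if (rank == 0) {
            /* Master: collect one result per slave, then release them
               all for the next step. Any slave that finishes early sits
               in its polling loop until every result has arrived. */
            int dummy, go = step + 1;
            for (int src = 1; src < size; src++)
                MPI_Recv(&dummy, 1, MPI_INT, MPI_ANY_SOURCE, TAG_RESULT,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            for (int dst = 1; dst < size; dst++)
                MPI_Send(&go, 1, MPI_INT, dst, TAG_GO, MPI_COMM_WORLD);
        } else {
            /* Slave: send this step's result, then spin on MPI_Iprobe
               waiting for the go-ahead. This polling loop is where the
               MPI_IPROBE time shows up in the profile. */
            int result = rank * step, go, flag = 0;
            MPI_Status st;
            MPI_Send(&result, 1, MPI_INT, 0, TAG_RESULT, MPI_COMM_WORLD);
            while (!flag)
                MPI_Iprobe(0, TAG_GO, MPI_COMM_WORLD, &flag, &st);
            MPI_Recv(&go, 1, MPI_INT, 0, TAG_GO, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }
    }
    MPI_Finalize();
    return 0;
}
```

With this structure, any load imbalance shows up as time in the slaves' polling loops, since every slave waits for the slowest one before each new step.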
Regards / Torgny
Pavel Shamis (Pasha) wrote:
However, setting:
-mca btl_openib_eager_limit 65536
gave a 15% improvement, so Open MPI is now down to 326 seconds (from the
previous 376). Still a lot more than Scali MPI's 214 seconds.
Can you please run ibv_devinfo on one of the compute nodes? It would be
interesting to know what kind of IB hardware you have on your cluster.
Pasha
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
--
---------------------------------------------------------
Torgny Faxén
National Supercomputer Center
Linköping University
S-581 83 Linköping
Sweden
Email: fa...@nsc.liu.se
Telephone: +46 13 285798 (office) +46 13 282535 (fax)
http://www.nsc.liu.se
---------------------------------------------------------
hca_id: mlx4_0
        fw_ver:                 2.5.000
        node_guid:              001e:0bff:ff4c:1bf4
        sys_image_guid:         001e:0bff:ff4c:1bf7
        vendor_id:              0x02c9
        vendor_part_id:         25418
        hw_ver:                 0xA0
        board_id:               HP_09D0000001
        phys_port_cnt:          2
                port:   1
                        state:          active (4)
                        max_mtu:        2048 (4)
                        active_mtu:     2048 (4)
                        sm_lid:         1
                        port_lid:       132
                        port_lmc:       0x00
                port:   2
                        state:          down (1)
                        max_mtu:        2048 (4)
                        active_mtu:     2048 (4)
                        sm_lid:         0
                        port_lid:       0
                        port_lmc:       0x00