Re: [OMPI users] "-bind-to numa" of openmpi-1.7.4rc1 dosen't work for our magny cours based 32 core node

2013-12-21 Thread tmishima
Ralph, thanks. I'll try it on Tuesday. Let me confirm one thing: I don't specify "-with-libevent" when I build openmpi. Is there any possibility that it builds with an external libevent automatically? Tetsuya Mishima
> Not entirely sure - add "-mca rmaps_base_verbose 10 --display-map" to your cmd line and

Re: [OMPI users] "-bind-to numa" of openmpi-1.7.4rc1 dosen't work for our magny cours based 32 core node

2013-12-21 Thread Ralph Castain
Not entirely sure - add "-mca rmaps_base_verbose 10 --display-map" to your cmd line and let's see if it finishes the mapping. Unless you specifically built with an external libevent (which I doubt), there is no conflict. The connection issue is unlikely to be a factor here as it works when not

Re: [OMPI users] "-bind-to numa" of openmpi-1.7.4rc1 dosen't work for our magny cours based 32 core node

2013-12-21 Thread tmishima
Thank you, Ralph. Then this problem must depend on our environment. But, at least, the inversion problem is not the cause, because node05 has normal hier order. I cannot connect to our cluster now. On Tuesday, when I'm back in my office, I'll send you a further report. Before that, please let me know y

[OMPI users] Clarification about dual-rail capabilities (sharing)

2013-12-21 Thread Filippo Spiga
Dear Open MPI users, in my institution a cluster with dual-rail IB has recently been deployed. Each compute node has two physical single-port Mellanox Connect-IB MT27600 cards (mlx5_0, mlx5_1). By running bandwidth tests (OSU 4.2 benchmark) using MVAPICH2, I can achieve from one node to another (1 MP

Re: [OMPI users] "-bind-to numa" of openmpi-1.7.4rc1 dosen't work for our magny cours based 32 core node

2013-12-21 Thread Ralph Castain
It seems to be working fine for me:
[rhc@bend001 tcp]$ mpirun -np 2 -host bend001 -report-bindings -mca rmaps_lama_bind 1c -mca rmaps lama hostname
bend001
[bend001:17005] MCW rank 1 bound to socket 0[core 1[hwt 0-1]]: [../BB/../../../..][../../../../../..]
[bend001:17005] MCW rank 0 bound to so