[OMPI users] problems with OpenMPI 4.0.3

2020-05-29 Thread Alberto Morillas, Angelines via users

Good morning,

We have a cluster with two types of InfiniBand cards.

The first one:

lspci | grep -i mella
5e:00.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]

mstflint -d 5e:00.0 q
Image type:            FS3
FW Version:            12.24.1000
FW Release Date:       26.11.2018
Product Version:       12.24.1000
Rom Info:              type=PXE version=3.5.603 cpu=AMD64
Description:           UID                GuidsNumber
Base GUID:             506b4b03001be9fa   4
Base MAC:              506b4b1be9fa       4
Image VSD:             N/A
Device VSD:            N/A
PSID:                  DEL2180110032
Security Attributes:   N/A

# ibv_devinfo
hca_id: mlx5_0
    transport:          InfiniBand (0)
    fw_ver:             12.24.1000
    node_guid:          506b:4b03:001b:e9fa
    sys_image_guid:     506b:4b03:001b:e9fa
    vendor_id:          0x02c9
    vendor_part_id:     4115
    hw_ver:             0x0
    board_id:           DEL2180110032
    phys_port_cnt:      1
        port:   1
            state:          PORT_ACTIVE (4)
            max_mtu:        4096 (5)
            active_mtu:     4096 (5)
            sm_lid:         1
            port_lid:       20
            port_lmc:       0x00
            link_layer:     InfiniBand
# ibstat
CA 'mlx5_0'
CA type: MT4115
Number of ports: 1
Firmware version: 12.24.1000
Hardware version: 0
Node GUID: 0x506b4b03001be9fa
System image GUID: 0x506b4b03001be9fa
Port 1:
State: Active
Physical state: LinkUp
Rate: 100
Base lid: 20
LMC: 0
SM lid: 1
Capability mask: 0x2659e848
Port GUID: 0x506b4b03001be9fa
Link layer: InfiniBand


And the other one:

lspci | grep -i mella
06:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]
mstflint -d 06:00.0 q
Image type:            FS4
FW Version:            20.26.4012
FW Release Date:       10.12.2019
Product Version:       20.26.4012
Rom Info:              type=UEFI version=14.19.17 cpu=AMD64
                       type=PXE version=3.5.805 cpu=AMD64
Description:           UID                GuidsNumber
Base GUID:             b8599f0300e4453e   4
Base MAC:              b8599fe4453e       4

ibv_devinfo
hca_id: mlx5_0
    transport:          InfiniBand (0)
    fw_ver:             20.26.4012
    node_guid:          b859:9f03:00e4:453e
    sys_image_guid:     b859:9f03:00e4:453e
    vendor_id:          0x02c9
    vendor_part_id:     4123
    hw_ver:             0x0
    board_id:           LNV16
    phys_port_cnt:      1
        port:   1
            state:          PORT_ACTIVE (4)
            max_mtu:        4096 (5)
            active_mtu:     4096 (5)
            sm_lid:         1
            port_lid:       3
            port_lmc:       0x00
            link_layer:     InfiniBand
ibstat
CA 'mlx5_0'
CA type: MT4123
Number of ports: 1
Firmware version: 20.26.4012
Hardware version: 0
Node GUID: 0xb8599f0300e4453e
System image GUID: 0xb8599f0300e4453e
Port 1:
State: Active
Physical state: LinkUp
Rate: 100
Base lid: 3
LMC: 0
SM lid: 1
Capability mask: 0x2659e848
Port GUID: 0xb8599f0300e4453e
Link layer: InfiniBand


Initially we only had the first type, the ConnectX-4 cards, and we used Open MPI 3.1.

[OMPI users] problems openmpi-4.0.3

2020-05-29 Thread Alberto Morillas, Angelines via users
Good morning,

We have a cluster with two kinds of InfiniBand cards, one ConnectX-4 and the other ConnectX-6.
Open MPI 3.1.3 works fine, but when the ConnectX-6 cards arrived we started to use Open MPI 4.0.3 (which supports ConnectX-6), and the programs that have several parts, first a call to a sequential program and inside it a call to a parallel program (in our case the program is WRF, but we have others like this with the same problem), suddenly stop:

…..
0 S  4556  87383  87361  0  80   0 - 126676 hrtime ?   00:05:25 real.exe
0 S  4556  87384  87361  0  80   0 - 126677 hrtime ?   00:05:33 real.exe
0 S  4556  87385  87361  0  80   0 - 126675 hrtime ?   00:05:28 real.exe
……
WCHAN shows hrtime, so it looks as if the processes are running, but in reality they make no progress.

We don't know whether it could be a problem with Slurm and this version of Open MPI…
Any idea?
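
For anyone debugging a similar hang, a minimal diagnostic sketch (the PIDs and the real.exe name come from the ps listing above; the UCX checks assume Open MPI 4.0.3 was built with UCX support, which is typical for ConnectX-6, so treat this as a hedged suggestion rather than the poster's procedure):

ompi_info | grep -i ucx                                            # was UCX support built in?
mpirun --mca pml ucx --mca pml_base_verbose 10 -np 2 ./real.exe    # make the PML selection explicit and verbose
cat /proc/87383/stack                                              # where is a stuck rank sleeping in the kernel? (needs root)
ps -o pid,wchan:32,stat,cmd -p 87383,87384,87385                   # confirm the wait channel of the stuck ranks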



Angelines Alberto Morillas

Unidad de Arquitectura Informática
Office: 22.1.32
Tel.: +34 91 346 6119
Fax:  +34 91 346 6537

skype: angelines.alberto

CIEMAT
Avenida Complutense, 40
28040 MADRID





[OMPI users] Running mpirun with grid

2020-05-29 Thread Kulshrestha, Vipul via users
Hi,

I need to launch my Open MPI application on grid. My application is designed to
run N processes, where each process would have M threads. I am using Open MPI
version 4.0.1.

% /build/openmpi/openmpi-4.0.1/rhel6/bin/ompi_info | grep grid
 MCA ras: gridengine (MCA v2.1.0, API v2.0.0, Component v4.0.1)

To run it without grid, I run it as (say N = 7, M = 2)
% mpirun -np 7 

The above works well and runs N processes. Based on some earlier advice on this
forum, I have set up the grid submission using a grid job submission script
that modifies the grid slot allocation, so that mpirun launches only 1
application process copy on each host allocated by grid. I have had some partial
success. I think grid is able to start the job and then mpirun also starts to
run, but then it errors out with the errors mentioned below. Strangely, after
giving a message that all the daemons have started, it reports that it was not
able to start one or more daemons.
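
A possibly simpler route than rewriting the slot counts is to let mpirun place one rank per host itself; this is only a hedged sketch (ppr mapping exists in Open MPI 4.x, and ./a.out is a stand-in for the real binary, which was not named in the post):

mpirun --map-by ppr:1:node --bind-to none -np 7 ./a.out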



I have set up a grid submission script that modifies the pe_hostfile, and it
appears that mpirun is able to take it and then is able to use the host
information to start launching the jobs. However, mpirun halts before it can
start all the child processes. I enabled some debug logs but am not able to
figure out a possible cause.
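
One hedged way to get more detail than the plm_base_verbose 1 used below is to raise the launcher and allocation verbosity; these MCA parameters exist in Open MPI 4.x, and ./a.out again stands in for the real application:

mpirun --mca plm_base_verbose 10 --mca ras_base_verbose 10 --mca orte_base_help_aggregate 0 -np 7 ./a.out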



Could somebody look at this and advise on how to resolve this issue?



I have pasted the detailed log as well as my job submission script below.



As a clarification, when I run mpirun without grid, it (mpirun and my
application) works on the same set of hosts without any problems.



Thanks,

Vipul



Job submission script:

#!/bin/sh
#$ -N velsyn
#$ -pe orte2 14
#$ -V -cwd -j y
#$ -o out.txt
#
echo "Got $NSLOTS slots."
echo "tmpdir is $TMPDIR"
echo "pe_hostfile is $PE_HOSTFILE"

cat $PE_HOSTFILE
newhostfile=/testdir/tmp/pe_hostfile

awk '{$2 = $2/2; print}' $PE_HOSTFILE > $newhostfile

export PE_HOSTFILE=$newhostfile
export LD_LIBRARY_PATH=/build/openmpi/openmpi-4.0.1/rhel6/lib

mpirun --merge-stderr-to-stdout --output-filename ./output:nojobid,nocopy --mca routed direct --mca orte_base_help_aggregate 0 --mca plm_base_verbose 1 --bind-to none --report-bindings -np 7 
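
To make the awk step concrete, here is a hypothetical pe_hostfile line before and after the rewrite (the queue name all.q is inferred from the tmpdir shown below, and UNDEFINED is a typical processor-range column; neither appears verbatim in the post):

bos2.wv.org.com 2 all.q@bos2.wv.org.com UNDEFINED      (line in $PE_HOSTFILE)
bos2.wv.org.com 1 all.q@bos2.wv.org.com UNDEFINED      (same line in $newhostfile)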



The out.txt content is:

Got 14 slots.

tmpdir is /tmp/182117160.1.all.q

pe_hostfile is /var/spool/sge/bos2/active_jobs/182117160.1/pe_hostfile

bos2.wv.org.com 2 al...@bos2.wv.org.com  
art8.wv.org.com 2 al...@art8.wv.org.com  
art10.wv.org.com 2 al...@art10.wv.org.com  
hpb7.wv.org.com 2 al...@hpb7.wv.org.com  
bos15.wv.org.com 2 al...@bos15.wv.org.com  
bos1.wv.org.com 2 al...@bos1.wv.org.com  
hpb11.wv.org.com 2 al...@hpb11.wv.org.com  
[bos2:22657] [[8251,0],0] plm:rsh: using "/wv/grid2/sge/bin/lx-amd64/qrsh -inherit -nostdin -V -verbose" for launching
[bos2:22657] [[8251,0],0] plm:rsh: final template argv:

  /grid2/sge/bin/lx-amd64/qrsh -inherit -nostdin -V -verbose  set path = ( /build/openmpi/openmpi-4.0.1/rhel6/bin $path ) ;
  if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ;
  if ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /build/openmpi/openmpi-4.0.1/rhel6/lib ;
  if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH /build/openmpi/openmpi-4.0.1/rhel6/lib:$LD_LIBRARY_PATH ;
  if ( $?DYLD_LIBRARY_PATH == 1 ) set OMPI_have_dllp ;
  if ( $?DYLD_LIBRARY_PATH == 0 ) setenv DYLD_LIBRARY_PATH /build/openmpi/openmpi-4.0.1/rhel6/lib ;
  if ( $?OMPI_have_dllp == 1 ) setenv DYLD_LIBRARY_PATH /build/openmpi/openmpi-4.0.1/rhel6/lib:$DYLD_LIBRARY_PATH ;
  /build/openmpi/openmpi-4.0.1/rhel6/bin/orted -mca orte_report_bindings "1" -mca ess "env" -mca ess_base_jobid "540737536"
  -mca ess_base_vpid "" -mca ess_base_num_procs "7"
  -mca orte_node_regex "bos[1:2],art[1:8],art[2:10],hpb[1:7],bos[2:15],bos[1:1],hpb[2:11]@0(7)"
  -mca orte_hnp_uri "540737536.0;tcp://147.34.116.60:50769" --mca routed "direct" --mca orte_base_help_aggregate "0"
  --mca plm_base_verbose "1" -mca plm "rsh" --tree-spawn -mca orte_parent_uri "540737536.0;tcp://147.34.116.60:50769"
  -mca orte_output_filename "./output:nojobid,nocopy" -mca hwloc_base_binding_policy "none"
  -mca hwloc_base_report_bindings "1" -mca pmix "^s1,s2,cray,isolated"

Starting server daemon at host "art10"

Starting server daemon at host "art8"

Starting server daemon at host "bos1"

Starting server daemon at host "hpb7"

Starting server daemon at host "hpb11"

Starting server daemon at host "bos15"

Server daemon successfully started with task id "1.art8"

Server daemon successfully started with task id "1.bos1"

Server daemon successfully started with task id "1.art10"

Server daemon successfully started with task id "1.bos15"

Server daemon successfully started with task id "1.hpb7"

Server daemon successfully started with task id "1.hpb11"

Unmatched ".


Re: [OMPI users] Running mpirun with grid

2020-05-29 Thread John Hearns via users
Good morning Vipul. I would like to ask some higher-level questions
regarding your HPC cluster.
What are the manufacturers of the cluster nodes?
How many compute nodes?
What network interconnect do you have - gigabit ethernet, 10gig ethernet,
Infiniband, Omnipath?
Which cluster middleware - openHPC? Rocks? Bright? Qlustar?
Which version of grid - there have been MANY versions of this over the
years.
Who installed the cluster?

And now the big question - and everyone on the list will laugh at me for
this...
Would you consider switching to using the Slurm batch queuing system?









On Sat, 30 May 2020 at 00:41, Kulshrestha, Vipul via users <
users@lists.open-mpi.org> wrote:

> [Vipul's original message quoted in full; trimmed, see above]

Re: [OMPI users] Running mpirun with grid

2020-05-29 Thread Gilles Gouaillardet via users
John,

Most of these questions are irrelevant with respect to the resolution
of this problem.

Please use this mailing list only for Open MPI related topics.


Cheers,

Gilles

On Sat, May 30, 2020 at 3:24 PM John Hearns via users
 wrote:
>
> Good morning Vipul. I would like to ask some higher level questions regarding 
> your HPC cluster.
> What are the manufacturers of the cluster nodes.
> How many compute nodes?
> What network interconnect do you have - gigabit ethernet, 10gig ethernet, 
> Infiniband, Omnipath?
> Which cluster middleware - openHPC? Rocks? Bright? Qlustar?
> Which version of grid  - there have been MANY versions of this over the years.
> Who installed the cluster ?
>
> And now the big question - and everyone on the list will laugh at me for 
> this
> Would you consider switching to using the Slurm batch queuing system?
>
>
>
>
>
>
>
>
>
> On Sat, 30 May 2020 at 00:41, Kulshrestha, Vipul via users 
>  wrote:
>> [Vipul's original message quoted in full; trimmed, see above]