[OMPI users] OpenMPI with PSM on True Scale with OmniPath drivers

2018-01-22 Thread William Hay
We have a couple of clusters with Qlogic Infinipath/Intel TrueScale
networking.  While testing a kernel upgrade we find that the Truescale
drivers will no longer build against recent RHEL kernels.  Intel tells
us that the Omnipath drivers will work for True Scale adapters so we
install those.  Basic functionality appears fine; however, we are having
trouble getting Open MPI to work.

Using our existing builds of Open MPI 1.10, jobs receive lots of signal
11 and crash (output attached).

If we modify LD_LIBRARY_PATH to point to the directory containing the
compatibility library provided as part of the OmniPath drivers, it instead
produces complaints about not finding /dev/hfi1_0, which exists on our
cluster with actual OmniPath but not on the clusters with TrueScale
(output also attached).

We had a similar issue with Intel MPI, but there it was possible to get
it to work by passing a -psm option to mpirun.  That, combined with the
mention of PSM2 in the output complaining about /dev/hfi1_0, makes
us think Open MPI is trying to run with PSM2 rather than the original
PSM and failing because PSM2 isn't supported by TrueScale.

We hoped that there would be an MCA parameter or combination of parameters
that would resolve this issue, but while Googling has turned up a few
things that look like they would force the use of PSM over PSM2, none of
them seems to make a difference.

Any suggestions?

William

mpi_pi:16465 terminated with signal 11 at PC=2b213094aa0e SP=7ffc6d5ba5e0.  
Backtrace:

mpi_pi:16470 terminated with signal 11 at PC=2ae8d364fa0e SP=7ffce1c62ee0.  
Backtrace:
/shared/ucl/apps/openmpi/1.10.1/no-verbs/gnu-4.9.2/lib/libmpi.so.12(PMPI_Comm_size+0x3e)[0x2ae8d364fa0e]
/home/ccaawih/openmpi_pi/mpi_pi[0x401522]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x2ae8d4026c05]
/home/ccaawih/openmpi_pi/mpi_pi[0x4013f9]

mpi_pi:16463 terminated with signal 11 at PC=2b368a310a0e SP=7ffd71d817e0.  
Backtrace:

mpi_pi:16466 terminated with signal 11 at PC=2b1a36c91a0e SP=7ffdbf472be0.  
Backtrace:
/shared/ucl/apps/openmpi/1.10.1/no-verbs/gnu-4.9.2/lib/libmpi.so.12(PMPI_Comm_size+0x3e)[0x2b1a36c91a0e]
/home/ccaawih/openmpi_pi/mpi_pi[0x401522]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x2b1a37668c05]
/home/ccaawih/openmpi_pi/mpi_pi[0x4013f9]

mpi_pi:16468 terminated with signal 11 at PC=2ab4a84fba0e SP=7ffe40d69660.  
Backtrace:
/shared/ucl/apps/openmpi/1.10.1/no-verbs/gnu-4.9.2/lib/libmpi.so.12(PMPI_Comm_size+0x3e)[0x2ab4a84fba0e]
/home/ccaawih/openmpi_pi/mpi_pi[0x401522]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x2ab4a8ed2c05]
/home/ccaawih/openmpi_pi/mpi_pi[0x4013f9]
/shared/ucl/apps/openmpi/1.10.1/no-verbs/gnu-4.9.2/lib/libmpi.so.12(PMPI_Comm_size+0x3e)[0x2b213094aa0e]
/home/ccaawih/openmpi_pi/mpi_pi[0x401522]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x2b2131321c05]
/home/ccaawih/openmpi_pi/mpi_pi[0x4013f9]

mpi_pi:16472 terminated with signal 11 at PC=2b373d729a0e SP=7ffce87428e0.  
Backtrace:

mpi_pi:16464 terminated with signal 11 at PC=2b0253fe4a0e SP=7ffdb96f12e0.  
Backtrace:
/shared/ucl/apps/openmpi/1.10.1/no-verbs/gnu-4.9.2/lib/libmpi.so.12(PMPI_Comm_size+0x3e)[0x2b0253fe4a0e]
/home/ccaawih/openmpi_pi/mpi_pi[0x401522]
/shared/ucl/apps/openmpi/1.10.1/no-verbs/gnu-4.9.2/lib/libmpi.so.12(PMPI_Comm_size+0x3e)[0x2b368a310a0e]
/home/ccaawih/openmpi_pi/mpi_pi[0x401522]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x2b368ace7c05]
/home/ccaawih/openmpi_pi/mpi_pi[0x4013f9]
/shared/ucl/apps/openmpi/1.10.1/no-verbs/gnu-4.9.2/lib/libmpi.so.12(PMPI_Comm_size+0x3e)[0x2b373d729a0e]
/home/ccaawih/openmpi_pi/mpi_pi[0x401522]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x2b373e100c05]
/home/ccaawih/openmpi_pi/mpi_pi[0x4013f9]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x2b02549bbc05]
/home/ccaawih/openmpi_pi/mpi_pi[0x4013f9]

mpi_pi:19144 terminated with signal 11 at PC=2ad2bd9aba0e SP=7ffdd91828e0.  
Backtrace:

mpi_pi:16462 terminated with signal 11 at PC=2ac24f9e5a0e SP=7ffcea97b160.  
Backtrace:

mpi_pi:19148 terminated with signal 11 at PC=2b413cc4ca0e SP=7ffce3d51ee0.  
Backtrace:
/shared/ucl/apps/openmpi/1.10.1/no-verbs/gnu-4.9.2/lib/libmpi.so.12(PMPI_Comm_size+0x3e)[0x2b413cc4ca0e]
/home/ccaawih/openmpi_pi/mpi_pi[0x401522]

mpi_pi:16469 terminated with signal 11 at PC=2ae1e8fdda0e SP=7fffa67fe2e0.  
Backtrace:

mpi_pi:16471 terminated with signal 11 at PC=2ac89c0b5a0e SP=7ffe1157ba60.  
Backtrace:
/shared/ucl/apps/openmpi/1.10.1/no-verbs/gnu-4.9.2/lib/libmpi.so.12(PMPI_Comm_size+0x3e)[0x2ac24f9e5a0e]
/home/ccaawih/openmpi_pi/mpi_pi[0x401522]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x2ac2503bcc05]
/home/ccaawih/openmpi_pi/mpi_pi[0x4013f9]
/shared/ucl/apps/openmpi/1.10.1/no-verbs/gnu-4.9.2/lib/libmpi.so.12(PMPI_Comm_size+0x3e)[0x2ad2bd9aba0e]
/home/ccaawih/openmpi_pi/mpi_pi[0x401522]
/shared/ucl/apps/openmpi/1.10.1/no-verbs/gnu-4.9.2/lib/libmpi.so.12(PMPI_Comm_size+0x3e)[0x2ae1e8fdda0e]
/home/ccaawih/openmpi_pi/mpi_pi[0x401522]
/shared/ucl/apps/openmpi/1.10.1/no-verbs/gnu-4.9.2/lib/libmpi.so.12(PMPI_Comm_size+0x3e)[0x2ac89c0b5a0e]
/home/cc

Re: [OMPI users] OpenMPI with PSM on True Scale with OmniPath drivers

2018-01-22 Thread Gilles Gouaillardet
William,

In order to force PSM (aka InfiniPath) you can run

mpirun --mca pml cm --mca mtl psm ...

(Replace psm with psm2 for PSM2, aka OmniPath.)

You can also run

mpirun --mca pml_base_verbose 10 --mca mtl_base_verbose 10 ...

in order to collect some logs.

Bottom line: pml/cm should be selected (instead of pml/ob1), along with the
appropriate mtl.
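As a side note, the same MCA parameters can also be exported as environment variables, which is sometimes easier in batch scripts (a hedged sketch: Open MPI reads any OMPI_MCA_<param> variable at startup):

```shell
# Equivalent to: mpirun --mca pml cm --mca mtl psm ...
export OMPI_MCA_pml=cm
export OMPI_MCA_mtl=psm
echo "pml=$OMPI_MCA_pml mtl=$OMPI_MCA_mtl"
```

With these exported, a plain mpirun invocation should pick pml/cm and the psm MTL without extra flags.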


On top of that, you might need to rebuild Open MPI if some user-level library 
has been changed.

Note Open MPI 1.10 is now legacy, and I strongly encourage you to upgrade to 
2.1.x or 3.0.x


Cheers,

Gilles


William Hay wrote:
>We have a couple of clusters with Qlogic Infinipath/Intel TrueScale
>networking.  While testing a kernel upgrade we find that the Truescale
>drivers will no longer build against recent RHEL kernels.  Intel tells
>us that the Omnipath drivers will work for True Scale adapters so we
>install those.  Basic functionality appears fine however we are having
>trouble getting OpenMPI to work.
>
>Using our existing builds of OpenMPI 1.10 jobs receive lots of signal
>11 and crash (output attached)
>
>If we modify LD_LIBRARY_PATH to point to the directory containing the
>compatibility library provided as part of the OmniPath drivers it instead
>produces complaints about not finding /dev/hfi1_0 which exists on our
>cluster with actual OmniPath but not on the clusters with TrueScale
>(output also attached).
>
>We had a similar issue with Intel MPI but there it was possible to get
>it to work by passing a -psm option to mpirun.  That combined with the
>mention of PSM2 in the output when complaining about /dev/hfi1_0 makes
>us think OpenMPI is trying to run with PSM2 rather than the original
>PSM and failing because that isn't supported by TrueScale.
>
>We hoped that there would be an mca parameter or combination of parameters
>that would resolve this issue but while Googling has turned up a few
>things that look like they would force the use of PSM over PSM2 none of
>them seem to make a difference.
>
>Any suggestions?
>
>William
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] OpenMPI with PSM on True Scale with OmniPath drivers

2018-01-22 Thread Cabral, Matias A
Hi William, 

A couple of other questions: 
- Please share what your Open MPI configure line looks like. 
- Please clarify which compatibility libraries you are referring to. There are 
some that are actually for the opposite case: making True Scale apps/libs run 
on OmniPath. 
- As Gilles mentioned, moving to a newer major Open MPI version is advisable. If 
this is not possible, move to 1.10.7, which has many updates relative to 1.10.1. 

Thanks, 

_MAC


-Original Message-
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Gilles 
Gouaillardet
Sent: Monday, January 22, 2018 3:31 AM
To: Open MPI Users 
Subject: Re: [OMPI users] OpenMPI with PSM on True Scale with OmniPath drivers

William,

In order to force PSM (aka InfiniPath) you can run

mpirun --mca pml cm --mca mtl psm ...

(Replace psm with psm2 for PSM2, aka OmniPath.)

You can also run

mpirun --mca pml_base_verbose 10 --mca mtl_base_verbose 10 ...

in order to collect some logs.

Bottom line: pml/cm should be selected (instead of pml/ob1), along with the 
appropriate mtl.


On top of that, you might need to rebuild Open MPI if some user-level library 
has been changed.

Note Open MPI 1.10 is now legacy, and I strongly encourage you to upgrade to 
2.1.x or 3.0.x


Cheers,

Gilles


William Hay wrote:
>We have a couple of clusters with Qlogic Infinipath/Intel TrueScale 
>networking.  While testing a kernel upgrade we find that the Truescale 
>drivers will no longer build against recent RHEL kernels.  Intel tells 
>us that the Omnipath drivers will work for True Scale adapters so we 
>install those.  Basic functionality appears fine however we are having 
>trouble getting OpenMPI to work.
>
>Using our existing builds of OpenMPI 1.10 jobs receive lots of signal
>11 and crash (output attached)
>
>If we modify LD_LIBRARY_PATH to point to the directory containing the 
>compatibility library provided as part of the OmniPath drivers it 
>instead produces complaints about not finding /dev/hfi1_0 which exists 
>on our cluster with actual OmniPath but not on the clusters with 
>TrueScale (output also attached).
>
>We had a similar issue with Intel MPI but there it was possible to get 
>it to work by passing a -psm option to mpirun.  That combined with the 
>mention of PSM2 in the output when complaining about /dev/hfi1_0 makes 
>us think OpenMPI is trying to run with PSM2 rather than the original 
>PSM and failing because that isn't supported by TrueScale.
>
>We hoped that there would be an mca parameter or combination of 
>parameters that would resolve this issue but while Googling has turned 
>up a few things that look like they would force the use of PSM over 
>PSM2 none of them seem to make a difference.
>
>Any suggestions?
>
>William
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] Installation of openmpi-1.10.7 fails

2018-01-22 Thread Edgar Gabriel
after some further investigation, I am fairly confident that this is not 
an MPI I/O problem.


The input file input_tmp.in is generated by the following sequence of 
instructions (in Modules/open_close_input_file.f90):


---

  IF ( TRIM(input_file_) /= ' ' ) THEn
 !
 ! copy file to be opened into input_file
 !
 input_file = input_file_
 !
  ELSE
 !
 ! if no file specified then copy from standard input
 !
 input_file="input_tmp.in"
 OPEN(UNIT = stdtmp, FILE=trim(input_file), FORM='formatted', &
  STATUS='unknown', IOSTAT = ierr )
 IF ( ierr > 0 ) GO TO 30
 !
 dummy=' '
 WRITE(stdout, '(5x,a)') "Waiting for input..."
 DO WHILE ( TRIM(dummy) .NE. "MAGICALME" )
    READ (stdin,fmt='(A512)',END=20) dummy
    WRITE (stdtmp,'(A)') trim(dummy)
 END DO
 !
20   CLOSE ( UNIT=stdtmp, STATUS='keep' )
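In effect, the Fortran above copies standard input line by line into input_tmp.in until EOF or the sentinel line MAGICALME. A minimal shell sketch of the same behavior (the file name is from the code; the helper name is made up):

```shell
# Copy stdin into input_tmp.in until EOF or the sentinel "MAGICALME",
# mirroring the READ/WRITE loop in open_close_input_file.f90.
copy_stdin_to_tmp() {
  : > input_tmp.in
  while IFS= read -r line; do
    [ "$line" = "MAGICALME" ] && break
    printf '%s\n' "$line" >> input_tmp.in
  done
}
printf 'line1\nline2\nMAGICALME\nignored\n' | copy_stdin_to_tmp
cat input_tmp.in
```

If stdin is cut short by the launcher, the loop simply stops early, producing exactly the kind of cropped input_tmp.in described below.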



Basically, if no input file has been provided, the input file is 
generated by reading from standard input. Since the application is being 
launched with, e.g.,


mpirun -np 64 ../bin/pw.x -npool 64 < nscf.in > nscf.out

the data comes from nscf.in. I simply do not know enough about I/O 
forwarding to be able to tell why we do not see the entire file, but one 
interesting detail is that if I run it in the debugger, input_tmp.in 
is created correctly. However, if I run it using mpirun as shown above, 
the file is cropped, which leads to the error message 
mentioned in this email chain.


Anyway, I would probably need some help here from somebody who knows the 
runtime better than I do about what could go wrong at this point.


Thanks

Edgar




On 1/19/2018 1:22 PM, Vahid Askarpour wrote:

Concerning the following error

     from pw_readschemafile : error #         1
     xml data file not found

The nscf run uses files generated by the scf.in run. So I first run 
scf.in and when it finishes, I run nscf.in. If you have done this and 
still get the above error, then this could be another bug. It does not 
happen for me with intel14/openmpi-1.8.8.


Thanks for the update,

Vahid

On Jan 19, 2018, at 3:08 PM, Edgar Gabriel wrote:


ok, here is what I found out so far; I will have to stop here for 
today, however:


 1. I can in fact reproduce your bug on my systems.

 2. I can confirm that the problem occurs both with romio314 and 
ompio. I *think* the issue is that the input_tmp.in file is 
incomplete. In both cases (ompio and romio) the end of the file looks 
as follows (and it's exactly the same for both libraries):


gabriel@crill-002:/tmp/gabriel/qe-6.2.1/QE_input_files> tail -10 
input_tmp.in

  0.6667  0.5000  0.8333  5.787037e-04
  0.6667  0.5000  0.9167  5.787037e-04
  0.6667  0.5833  0.  5.787037e-04
  0.6667  0.5833  0.0833  5.787037e-04
  0.6667  0.5833  0.1667  5.787037e-04
  0.6667  0.5833  0.2500  5.787037e-04
  0.6667  0.5833  0.  5.787037e-04
  0.6667  0.5833  0.4167  5.787037e-04
  0.6667  0.5833  0.5000  5.787037e-04
  0.6667  0.5833  0.5833  5

which is what I *think* causes the problem.
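One quick way to test the "incomplete file" theory is to check whether the generated file is a strict prefix of the original input, as opposed to having diverging contents. A hedged sketch using cmp, with toy files standing in for nscf.in and input_tmp.in:

```shell
# Toy stand-ins for the original input and the generated copy:
printf 'k1\nk2\nk3\n' > orig.txt
printf 'k1\nk2' > copy.txt          # cut off mid-stream
if cmp -s copy.txt orig.txt; then
  echo "identical"
else
  # Distinguish "cropped" (copy is a prefix) from "corrupted"
  # (bytes diverge) by comparing only the first $(wc -c) bytes.
  head -c "$(wc -c < copy.txt)" orig.txt | cmp -s - copy.txt \
    && echo "cropped" || echo "corrupted"
fi
```

Running this comparison on nscf.in versus input_tmp.in would show whether the copy merely stops early, which is what the tail output above suggests.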

 3. I tried to find where input_tmp.in is generated, but haven't 
completely identified the location. However, I could not find MPI 
file_write(_all) operations anywhere in the code, although there are 
some MPI_file_read(_all) operations.


 4. I can confirm that the behavior with Open MPI 1.8.x is different. 
input_tmp.in looks more complete (at least it doesn't end in the 
middle of a line). The simulation still does not finish for me, but 
the bug reported is slightly different; I might just be missing a 
file or something:



 from pw_readschemafile : error # 1
 xml data file not found

Since I think input_tmp.in is generated from data that is provided in 
nscf.in, it might very well be something in the MPI_File_read(_all) 
operation that causes the issue, but since both ompio and romio are 
affected, there is a good chance that something outside of the control 
of the I/O components is causing the trouble (maybe a datatype issue 
that changed from the 1.8.x series to 3.0.x).


 5. Last but not least, I also wanted to mention that I ran all 
parallel tests that I found in the test suite (run-tests-cp-parallel, 
run-tests-pw-parallel, run-tests-ph-parallel, run-tests-epw-parallel), 
and they all passed with ompio (and romio314, although I only ran a 
subset of the tests with romio314).


Thanks

Edgar

-




On 01/19/2018 11:44 AM, Vahid Askarpour wrote:

Hi Edgar,

Just to let you know that the nscf run with --mca io ompio crashed 
like the other two runs.


Thank you,

Vahid

On Jan 19, 2018, at 12:46 PM, Edgar Gabriel wrote:


ok, thank you for the information. Two short questions and 
requests. I have qe-6.2.1 compiled and running on my system 
(although it is with gcc-6.4 instead of the intel compiler), and I 
am currently runn

Re: [OMPI users] Installation of openmpi-1.10.7 fails

2018-01-22 Thread Edgar Gabriel
well, my final comment on this topic: as somebody suggested earlier in 
this email chain, if you provide the input with the -i argument instead 
of piping it from standard input, things seem to work as far as I can see 
(disclaimer: I do not know what the final outcome should be; I just see 
that the application does not complain about 'end of file while 
reading crystal k points'). So maybe that is the simplest solution.
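The two launch styles differ only in how the input reaches the program; with cat standing in for pw.x, a toy illustration (file names echo the run discussed above):

```shell
# stdin form -- the launcher must forward standard input to the ranks
# (this is where the truncation was observed):
#   mpirun -np 64 ../bin/pw.x -npool 64 < nscf.in > nscf.out
# -i form -- pw.x opens the file itself, bypassing stdin forwarding:
#   mpirun -np 64 ../bin/pw.x -npool 64 -i nscf.in > nscf.out
printf 'crystal k points\n' > nscf.in   # toy stand-in input
cat < nscf.in > out_stdin.txt           # stdin-style delivery
cat nscf.in > out_direct.txt            # direct-file delivery
cmp -s out_stdin.txt out_direct.txt && echo identical
```

With cat both paths are trivially identical; the point is that the -i form removes the launcher's stdin forwarding from the picture entirely.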


Thanks

Edgar


On 1/22/2018 1:17 PM, Edgar Gabriel wrote:


after some further investigation, I am fairly confident that this is 
not an MPI I/O problem.


The input file input_tmp.in is generated by the following sequence of 
instructions (in Modules/open_close_input_file.f90):


---

   IF ( TRIM(input_file_) /= ' ' ) THEn
  !
  ! copy file to be opened into input_file
  !
  input_file = input_file_
  !
   ELSE
  !
  ! if no file specified then copy from standard input
  !
  input_file="input_tmp.in"
  OPEN(UNIT = stdtmp, FILE=trim(input_file), FORM='formatted', &
   STATUS='unknown', IOSTAT = ierr )
  IF ( ierr > 0 ) GO TO 30
  !
  dummy=' '
  WRITE(stdout, '(5x,a)') "Waiting for input..."
  DO WHILE ( TRIM(dummy) .NE. "MAGICALME" )
     READ (stdin,fmt='(A512)',END=20) dummy
     WRITE (stdtmp,'(A)') trim(dummy)
  END DO
  !
20   CLOSE ( UNIT=stdtmp, STATUS='keep' )



Basically, if no input file has been provided, the input file is 
generated by reading from standard input. Since the application is 
being launched with, e.g.,


mpirun -np 64 ../bin/pw.x -npool 64 < nscf.in > nscf.out

the data comes from nscf.in. I simply do not know enough about I/O 
forwarding to be able to tell why we do not see the entire file, but 
one interesting detail is that if I run it in the debugger, 
input_tmp.in is created correctly. However, if I run it using mpirun 
as shown above, the file is cropped, which leads to the 
error message mentioned in this email chain.


Anyway, I would probably need some help here from somebody who knows 
the runtime better than I do about what could go wrong at this point.


Thanks

Edgar




On 1/19/2018 1:22 PM, Vahid Askarpour wrote:

Concerning the following error

     from pw_readschemafile : error #         1
     xml data file not found

The nscf run uses files generated by the scf.in run. So I first run 
scf.in and when it finishes, I run nscf.in. If you have done this and 
still get the above error, then this could be another bug. It does 
not happen for me with intel14/openmpi-1.8.8.


Thanks for the update,

Vahid

On Jan 19, 2018, at 3:08 PM, Edgar Gabriel wrote:


ok, here is what I found out so far; I will have to stop here for 
today, however:


 1. I can in fact reproduce your bug on my systems.

 2. I can confirm that the problem occurs both with romio314 and 
ompio. I *think* the issue is that the input_tmp.in file is 
incomplete. In both cases (ompio and romio) the end of the file 
looks as follows (and it's exactly the same for both libraries):


gabriel@crill-002:/tmp/gabriel/qe-6.2.1/QE_input_files> tail -10 
input_tmp.in

  0.6667  0.5000  0.8333  5.787037e-04
  0.6667  0.5000  0.9167  5.787037e-04
  0.6667  0.5833  0.  5.787037e-04
  0.6667  0.5833  0.0833  5.787037e-04
  0.6667  0.5833  0.1667  5.787037e-04
  0.6667  0.5833  0.2500  5.787037e-04
  0.6667  0.5833  0.  5.787037e-04
  0.6667  0.5833  0.4167  5.787037e-04
  0.6667  0.5833  0.5000  5.787037e-04
  0.6667  0.5833  0.5833  5

which is what I *think* causes the problem.

 3. I tried to find where input_tmp.in is generated, but haven't 
completely identified the location. However, I could not find MPI 
file_write(_all) operations anywhere in the code, although there are 
some MPI_file_read(_all) operations.


 4. I can confirm that the behavior with Open MPI 1.8.x is 
different. input_tmp.in looks more complete (at least it doesn't end 
in the middle of a line). The simulation still does not finish for 
me, but the bug reported is slightly different; I might just be 
missing a file or something:



 from pw_readschemafile : error # 1
 xml data file not found

Since I think input_tmp.in is generated from data that is provided 
in nscf.in, it might very well be something in the 
MPI_File_read(_all) operation that causes the issue, but since both 
ompio and romio are affected, there is a good chance that something 
outside of the control of the I/O components is causing the trouble 
(maybe a datatype issue that changed from the 1.8.x series to 3.0.x).


 5. Last but not least, I also wanted to mention that I ran all 
parallel tests that I found in the test suite 
(run-tests-cp-parallel, run-tests-pw-parallel, 
run-tests-ph-parallel, run-tests-epw-parallel), and they all passed 
with ompio (and romio314, although I only ran a subset of