[OMPI users] a question about MPI dynamic process management

2017-03-22 Thread gzzh...@buaa.edu.cn
Hi team: I have a question about MPI dynamic process management and hope you can provide some help. First of all, the MPI program runs on multiple nodes; the group of MPI_COMM_WORLD was split into subgroups by node, and sub-communicators were created respectively so that MPI proces

Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes

2017-03-22 Thread Howard Pritchard
Forgot: you probably need an equal sign after the btl arg. Howard Pritchard wrote on Wed., 22 March 2017 at 18:11: > Hi Goetz > > Thanks for trying these other versions. Looks like a bug. Could you post > the config.log output from your 2.1.0 build to the list? > > Also could you try running

Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes

2017-03-22 Thread Howard Pritchard
Hi Goetz, Thanks for trying these other versions. Looks like a bug. Could you post the config.log output from your 2.1.0 build to the list? Also, could you try running the job with this extra command-line arg to see if the problem goes away? mpirun --mca btl ^vader (rest of your args) H
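For reference, the workaround suggested here excludes the vader shared-memory BTL at launch time. A minimal sketch, assuming the reporter's IMB run; the process count and the IMB-MPI1 binary name are illustrative and not part of the quoted command:

    # The caret (^) tells Open MPI to use every BTL component except vader.
    mpirun --mca btl ^vader -np 1024 ./IMB-MPI1
    # (The follow-up message suggests an equal sign after the btl argument,
    #  i.e. --mca btl=^vader, may be needed.)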

Re: [OMPI users] migrating to the MPI_F08 module

2017-03-22 Thread Tom Rosmond
Gilles, Yes, I found that definition about 5 minutes after I posted the question. Thanks for the response. Tom On 03/22/2017 03:47 PM, Gilles Gouaillardet wrote: Tom, what if you use type(mpi_datatype) :: mpiint Cheers, Gilles On Thursday, March 23, 2017, Tom Rosmond

Re: [OMPI users] migrating to the MPI_F08 module

2017-03-22 Thread Gilles Gouaillardet
Tom, what if you use type(mpi_datatype) :: mpiint Cheers, Gilles On Thursday, March 23, 2017, Tom Rosmond wrote: > > Hello; > > I am converting some Fortran 90/95 programs from the 'mpif.h' include file > to the 'mpi_f08' model and have encountered a problem. Here is a simple > test program t

[OMPI users] migrating to the MPI_F08 module

2017-03-22 Thread Tom Rosmond
Hello; I am converting some Fortran 90/95 programs from the 'mpif.h' include file to the 'mpi_f08' model and have encountered a problem. Here is a simple test program that demonstrates it: program testf08 !

[OMPI users] Help with Open MPI 2.1.0 and PGI 16.10: Configure and C++

2017-03-22 Thread Matt Thompson
All, I'm hoping one of you knows what I might be doing wrong here. I'm trying to use Open MPI 2.1.0 with PGI 16.10 (Community Edition) on macOS. Now, I built it a la: http://www.pgroup.com/userforum/viewtopic.php?p=21105#21105 and found that it built, but the resulting mpifort, etc. were just not
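For context, a typical configure line for building Open MPI with the PGI compilers looks roughly like the sketch below; the prefix path is a placeholder and the flags are not taken from the linked forum post.

    # Build Open MPI 2.1.0 with the PGI 16.10 compilers (pgcc/pgc++/pgfortran).
    ./configure CC=pgcc CXX=pgc++ FC=pgfortran \
                --prefix=$HOME/opt/openmpi-2.1.0-pgi
    make -j4 && make install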

Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes

2017-03-22 Thread Götz Waschk
On Wed, Mar 22, 2017 at 7:46 PM, Howard Pritchard wrote: > Hi Goetz, > > Would you mind testing against the 2.1.0 release or the latest from the > 1.10.x series (1.10.6)? Hi Howard, after sending my mail I tested both 1.10.6 and 2.1.0 and received the same error. I have also tested o

Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes

2017-03-22 Thread Howard Pritchard
Hi Goetz, Would you mind testing against the 2.1.0 release or the latest from the 1.10.x series (1.10.6)? Thanks, Howard 2017-03-22 6:25 GMT-06:00 Götz Waschk : > Hi everyone, > > I'm testing a new machine with 32 nodes of 32 cores each using the IMB > benchmark. It is working fine with 512 p

Re: [OMPI users] OpenMPI-2.1.0 problem with executing orted when using SGE

2017-03-22 Thread r...@open-mpi.org
Sorry folks - for some reason (probably timing for getting 2.1.0 out), the fix for this got pushed to v2.1.1 - see the PR here: https://github.com/open-mpi/ompi/pull/3163 > On Mar 22, 2017, at 7:49 AM, Reuti wrote: > >> >> On 22.03.2017 at 15:31

Re: [OMPI users] OpenMPI-2.1.0 problem with executing orted when using SGE

2017-03-22 Thread Reuti
> On 22.03.2017 at 15:31, Heinz-Ado Arnolds wrote > : > > Dear Reuti, > > Thanks a lot, you're right! But why did the default behavior change but not > the value of this parameter: > > 2.1.0: MCA plm rsh: parameter "plm_rsh_agent" (current value: "ssh : rsh", > data source: default, level:

Re: [OMPI users] "Warning :: opal_list_remove_item" with openmpi-2.1.0rc4

2017-03-22 Thread Gilles Gouaillardet
Roland, the easiest way is to use an external hwloc that is configured with --disable-nvml. Another option is to hack the embedded hwloc configure.m4 and pass --disable-nvml to the embedded hwloc configure; note this requires running autogen.sh and hence needs recent autotools. I guess Open
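A sketch of the first option described above; the prefix paths are placeholders, and the --with-hwloc syntax should be checked against the Open MPI version being built:

    # From the hwloc source tree: build an external hwloc with NVML disabled.
    ./configure --prefix=$HOME/opt/hwloc --disable-nvml
    make -j4 && make install

    # From the Open MPI source tree: use the external hwloc instead of the
    # embedded copy.
    ./configure --prefix=$HOME/opt/openmpi --with-hwloc=$HOME/opt/hwloc
    make -j4 && make install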

Re: [OMPI users] OpenMPI-2.1.0 problem with executing orted when using SGE

2017-03-22 Thread Heinz-Ado Arnolds
Dear Reuti, Thanks a lot, you're right! But why did the default behavior change but not the value of this parameter: 2.1.0: MCA plm rsh: parameter "plm_rsh_agent" (current value: "ssh : rsh", data source: default, level: 2 user/detail, type: string, synonyms: pls_rsh_agent, orte_rsh_agent)
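For reference, the parameter dump quoted above can be reproduced with ompi_info; the grep pattern below is illustrative.

    # Show the rsh launcher's parameters, including plm_rsh_agent, at the
    # highest verbosity level.
    ompi_info --param plm rsh --level 9 | grep -i rsh_agent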

Re: [OMPI users] OpenMPI-2.1.0 problem with executing orted when using SGE

2017-03-22 Thread Reuti
Hi, > On 22.03.2017 at 10:44, Heinz-Ado Arnolds wrote > : > > Dear users and developers, > > first of all, many thanks for all the great work you have done for OpenMPI! > > Up to OpenMPI-1.10.6 the mechanism for starting orted was to use SGE/qrsh: > mpirun -np 8 --map-by ppr:4:node ./myid >

[OMPI users] Openmpi 1.10.4 crashes with 1024 processes

2017-03-22 Thread Götz Waschk
Hi everyone, I'm testing a new machine with 32 nodes of 32 cores each using the IMB benchmark. It is working fine with 512 processes, but it crashes with 1024 processes after running for a minute: [pax11-17:16978] *** Process received signal *** [pax11-17:16978] Signal: Bus error (7) [pax11-17:

Re: [OMPI users] "Warning :: opal_list_remove_item" with openmpi-2.1.0rc4

2017-03-22 Thread Roland Fehrenbacher
> "SJ" == Sylvain Jeaugey writes: SJ> If you installed CUDA libraries and includes in /usr, then it's SJ> not surprising hwloc finds them even without defining CFLAGS. Well, that's the place where distribution packages install to :) I don't think a build system should misbehave, if l

[OMPI users] OpenMPI-2.1.0 problem with executing orted when using SGE

2017-03-22 Thread Heinz-Ado Arnolds
Dear users and developers, first of all, many thanks for all the great work you have done for OpenMPI! Up to OpenMPI-1.10.6 the mechanism for starting orted was to use SGE/qrsh: mpirun -np 8 --map-by ppr:4:node ./myid /opt/sge-8.1.8/bin/lx-amd64/qrsh -inherit -nostdin -V orted --hnp-topo-sig
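Two quick checks that help when diagnosing this kind of launcher issue (a sketch; the grep pattern and verbosity level are illustrative, and the mpirun line is the one quoted above):

    # Was this Open MPI build compiled with SGE (gridengine) support?
    ompi_info | grep -i gridengine

    # Make the launcher print which agent (qrsh vs. ssh) it actually uses
    # inside the SGE job.
    mpirun --mca plm_base_verbose 10 -np 8 --map-by ppr:4:node ./myid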