Re: [OMPI users] GCC 4.9 and MPI_F08?

2014-08-14 Thread Daniels, Marcus G
Hi Jeff,

Works for me!

(With mpi_f08, GCC 4.9.1 absolutely insists on getting the finer details right 
on things like MPI_User_function types for MPI_Op_create.  So I'll assume the 
rest of the type checking is just as good, and be glad I took that minor 
detour..)
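(For context, a minimal C sketch -- not part of the original message -- of the MPI_User_function signature that MPI_Op_create expects; the point above is about the corresponding Fortran mpi_f08 interface, which GCC 4.9 now type-checks just as strictly.)

#include <stdio.h>
#include <mpi.h>

/* MPI_User_function has the fixed signature (invec, inoutvec, len, datatype). */
static void my_sum(void *invec, void *inoutvec, int *len, MPI_Datatype *dtype)
{
    int *in = (int *) invec;
    int *inout = (int *) inoutvec;
    for (int i = 0; i < *len; i++)
        inout[i] += in[i];
    (void) dtype;                  /* this sketch assumes MPI_INT data */
}

int main(int argc, char *argv[])
{
    int rank, sum;
    MPI_Op op;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Op_create(my_sum, 1 /* commutative */, &op);
    MPI_Allreduce(&rank, &sum, 1, MPI_INT, op, MPI_COMM_WORLD);
    if (rank == 0)
        printf("sum of ranks = %d\n", sum);

    MPI_Op_free(&op);
    MPI_Finalize();
    return 0;
}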

Thanks,

Marcus

-Original Message-
From: Jeff Squyres (jsquyres) [mailto:jsquy...@cisco.com] 
Sent: Wednesday, August 13, 2014 10:00 AM
To: Open MPI User's List
Cc: Daniels, Marcus G
Subject: Re: [OMPI users] GCC 4.9 and MPI_F08?

Marcus --

The fix was applied yesterday to the v1.8 branch.  Would you mind testing the 
v1.8 nightly tarball from last night, just to make sure it works for you?

http://www.open-mpi.org/nightly/v1.8/



On Aug 12, 2014, at 2:54 PM, Jeff Squyres (jsquyres)  wrote:

> @#$%@#$%#@$%
> 
> I was very confused by this bug report, because in my mind, a) this 
> functionality is on the SVN trunk, and b) we had moved the gcc functionality 
> to the v1.8 branch long ago.
> 
> I just checked the SVN/Trac records:
> 
> 1. I'm right: this functionality *is* on the trunk.  If you build the OMPI 
> SVN trunk with gcc/gfortran 4.9, you get the ignore TKR "mpi" module and the 
> mpi_f08 module.  I just tried it myself to verify this.
> 
> 2. I'm (sorta) right: we CMR'ed the "add the GCC ignore TKR functionality" in 
> https://svn.open-mpi.org/trac/ompi/ticket/4058 for v1.7.5, but it looks like 
> that CMR was botched somehow and the functionality wasn't applied (!) ...even 
> though the log message says it was.  Sad panda.
> 
> I'll open a ticket to track this.  We'll have to see how the RM feels about 
> putting this in at the last second; it may or may not make the cutoff for 
> 1.8.2 (the freeze has already occurred).
> 
> 
> 
> On Aug 12, 2014, at 12:32 PM, Daniels, Marcus G  wrote:
> 
>> Hi Jeff,
>> 
>> On Tue, 2014-08-12 at 16:18 +, Jeff Squyres (jsquyres) wrote:
>>> Can you send the output from configure, the config.log file, and the 
>>> ompi_config.h file?
>> 
>> Attached.  configure.log comes from
>> 
>> (./configure --prefix=/usr/projects/eap/tools/openmpi/1.8.2rc3  2>&1) 
>> > configure.log
>> 
>> Seems fishy that there is no "checking for Fortran compiler support of !
>> GCC$ ATTRIBUTES NO_ARG_CHECK".
>> 
>> checking for Fortran compiler module include flag... -I
>> checking Fortran compiler ignore TKR syntax... not cached; checking variants
>> checking for Fortran compiler support of TYPE(*), DIMENSION(*)... no
>> checking for Fortran compiler support of !DEC$ ATTRIBUTES NO_ARG_CHECK... no
>> checking for Fortran compiler support of !$PRAGMA IGNORE_TKR... no
>> checking for Fortran compiler support of !DIR$ IGNORE_TKR... no
>> checking for Fortran compiler support of !IBM* IGNORE_TKR... no
>> checking Fortran compiler ignore TKR syntax... 0:real:!
>> checking if Fortran compiler supports ISO_C_BINDING... yes
>> checking if building Fortran 'use mpi' bindings... yes
>> checking if building Fortran 'use mpi_f08' bindings... no
>> 
>> Marcus
>> 
>> 
>> 
> 
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 


--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



[OMPI users] mpi+openshmem hybrid

2014-08-14 Thread Timur Ismagilov

Hello!
I use Open MPI v1.9a132520.

Can I use hybrid mpi+openshmem?
Where can I read about it?

I have some problems with a simple program:
#include <stdio.h>
#include "shmem.h"
#include "mpi.h"

int main(int argc, char* argv[])
{
    int proc, nproc;
    int rank, size, len;
    char version[MPI_MAX_LIBRARY_VERSION_STRING];

    MPI_Init(&argc, &argv);
    start_pes(0);
    MPI_Finalize();

    return 0;
}

I compile with oshcc; with mpicc I get a compile error.

1. When I run this program with mpirun/oshrun I get this output:

[1408002416.687274] [node1-130-01:26354:0] proto.c:64 MXM WARN mxm is destroyed 
but still has pending receive requests
[1408002416.687604] [node1-130-01:26355:0] proto.c:64 MXM WARN mxm is destroyed 
but still has pending receive requests

2. If, in the program, I use this code:
start_pes(0);
MPI_Init(&argc, &argv);
MPI_Finalize();

I get an error:
--
Calling MPI_Init or MPI_Init_thread twice is erroneous.
--
[node1-130-01:26469] *** An error occurred in MPI_Init
[node1-130-01:26469] *** reported by process [2397634561,140733193388033]
[node1-130-01:26469] *** on communicator MPI_COMM_WORLD
[node1-130-01:26469] *** MPI_ERR_OTHER: known error not in list
[node1-130-01:26469] *** MPI_ERRORS_ARE_FATAL (processes in this communicator 
will now abort,
[node1-130-01:26469] *** and potentially your MPI job)
[node1-130-01:26468] [[36585,1],0] ORTE_ERROR_LOG: Not found in file 
routed_radix.c at line 395
[node1-130-01:26469] [[36585,1],1] ORTE_ERROR_LOG: Not found in file 
routed_radix.c at line 395
[compiler-2:02175] 1 more process has sent help message help-mpi-errors.txt / 
mpi_errors_are_fatal
[compiler-2:02175] Set MCA parameter "orte_base_help_aggregate" to 0 to see all 
help / error messages

--
Calling MPI_Init or MPI_Init_thread twice is erroneous.
--
[node1-130-01:26469] *** An error occurred in MPI_Init
[node1-130-01:26469] *** reported by process [2397634561,140733193388033]
[node1-130-01:26469] *** on communicator MPI_COMM_WORLD
[node1-130-01:26469] *** MPI_ERR_OTHER: known error not in list
[node1-130-01:26469] *** MPI_ERRORS_ARE_FATAL (processes in this communicator 
will now abort,
[node1-130-01:26469] ***    and potentially your MPI job)
[node1-130-01:26468] [[36585,1],0] ORTE_ERROR_LOG: Not found in file 
routed_radix.c at line 395
[node1-130-01:26469] [[36585,1],1] ORTE_ERROR_LOG: Not found in file 
routed_radix.c at line 395
[compiler-2:02175] 1 more process has sent help message help-mpi-errors.txt / 
mpi_errors_are_fatal
[compiler-2:02175] Set MCA parameter "orte_base_help_aggregate" to 0 to see all 
help / error messages

Re: [OMPI users] mpi+openshmem hybrid

2014-08-14 Thread Mike Dubman
You can use hybrid mode.
The following code works for me with OMPI 1.8.2:

#include <stdio.h>
#include <stdlib.h>
#include "shmem.h"
#include "mpi.h"

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);
    start_pes(0);

    {
        int version = 0;
        int subversion = 0;
        int num_proc = 0;
        int my_proc = 0;
        int comm_size = 0;
        int comm_rank = 0;

        MPI_Get_version(&version, &subversion);
        fprintf(stdout, "MPI version: %d.%d\n", version, subversion);

        num_proc = _num_pes();
        my_proc = _my_pe();

        fprintf(stdout, "PE#%d of %d\n", my_proc, num_proc);

        MPI_Comm_size(MPI_COMM_WORLD, &comm_size);
        MPI_Comm_rank(MPI_COMM_WORLD, &comm_rank);

        fprintf(stdout, "Comm rank#%d of %d\n", comm_rank, comm_size);
    }

    return 0;
}



On Thu, Aug 14, 2014 at 11:05 AM, Timur Ismagilov 
wrote:

> Hello!
> I use Open MPI v1.9a132520.
>
> Can I use hybrid mpi+openshmem?
> Where can i read about?
>
> I have some problems in simple programm:
>
> #include 
>
> #include "shmem.h"
> #include "mpi.h"
>
> int main(int argc, char* argv[])
> {
> int proc, nproc;
> int rank, size, len;
> char version[MPI_MAX_LIBRARY_VERSION_STRING];
>
> MPI_Init(&argc, &argv);
> start_pes(0);
> MPI_Finalize();
>
> return 0;
> }
>
> I compile with oshcc, with mpicc i got a compile error.
>
> 1. When i run this programm with mpirun/oshrun i got an output
>
> [1408002416.687274] [node1-130-01:26354:0] proto.c:64 MXM WARN mxm is
> destroyed but still has pending receive requests
> [1408002416.687604] [node1-130-01:26355:0] proto.c:64 MXM WARN mxm is
> destroyed but still has pending receive requests
>
> 2. If in programm, i use this code
> start_pes(0);
> MPI_Init(&argc, &argv);
> MPI_Finalize();
>
> i got an error:
>
> --
> Calling MPI_Init or MPI_Init_thread twice is erroneous.
> --
> [node1-130-01:26469] *** An error occurred in MPI_Init
> [node1-130-01:26469] *** reported by process [2397634561,140733193388033]
> [node1-130-01:26469] *** on communicator MPI_COMM_WORLD
> [node1-130-01:26469] *** MPI_ERR_OTHER: known error not in list
> [node1-130-01:26469] *** MPI_ERRORS_ARE_FATAL (processes in this
> communicator will now abort,
> [node1-130-01:26469] *** and potentially your MPI job)
> [node1-130-01:26468] [[36585,1],0] ORTE_ERROR_LOG: Not found in file
> routed_radix.c at line 395
> [node1-130-01:26469] [[36585,1],1] ORTE_ERROR_LOG: Not found in file
> routed_radix.c at line 395
> [compiler-2:02175] 1 more process has sent help message
> help-mpi-errors.txt / mpi_errors_are_fatal
> [compiler-2:02175] Set MCA parameter "orte_base_help_aggregate" to 0 to
> see all help / error messages
>
>
> --
> Calling MPI_Init or MPI_Init_thread twice is erroneous.
> --
> [node1-130-01:26469] *** An error occurred in MPI_Init
> [node1-130-01:26469] *** reported by process [2397634561,140733193388033]
> [node1-130-01:26469] *** on communicator MPI_COMM_WORLD
> [node1-130-01:26469] *** MPI_ERR_OTHER: known error not in list
> [node1-130-01:26469] *** MPI_ERRORS_ARE_FATAL (processes in this
> communicator will now abort,
> [node1-130-01:26469] ***and potentially your MPI job)
> [node1-130-01:26468] [[36585,1],0] ORTE_ERROR_LOG: Not found in file
> routed_radix.c at line 395
> [node1-130-01:26469] [[36585,1],1] ORTE_ERROR_LOG: Not found in file
> routed_radix.c at line 395
> [compiler-2:02175] 1 more process has sent help message
> help-mpi-errors.txt / mpi_errors_are_fatal
> [compiler-2:02175] Set MCA parameter "orte_base_help_aggregate" to 0 to
> see all help / error messages
>
>


[OMPI users] how MPI_Get can work with multiple memory regions attached to a window via MPI_Win_attach

2014-08-14 Thread madhurima madhunapanthula
 Hi,

In http://www.mpich.org/static/docs/v3.1/www3/MPI_Win_attach.html, for MPI-3,
the API MPI_Win_attach is supported:





int MPI_Win_attach(MPI_Win win, void *base, MPI_Aint size)



It allows multiple (but non-overlapping) memory regions to be attached to
the same window after the window is created. Therefore, I should be able
to have multiple such "attach" calls for the same window, with a different
"base" address specified in each attach call.

If that is the case, when I issue MPI_Get, how can I specify which "base
address" in the window I want to fetch the data from?  MPI_Get takes the
window handle as one of its input parameters, but not a "base address";
the base address has to be resolved by the target process from the
registered window handle.

So my question is: how can  MPI_Get  handle the situation where multiple
memory regions are attached to the same window?
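(Not part of the original question -- a minimal C sketch, assuming a window created with MPI_Win_create_dynamic and two ranks, of the usual pattern: the target attaches a region and communicates its absolute address, obtained with MPI_Get_address, and the origin passes that address as the target_disp argument of MPI_Get.)

#include <stdio.h>
#include <mpi.h>

#define N 8

int main(int argc, char *argv[])
{
    int rank;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Dynamic window: created empty, memory regions are attached later. */
    MPI_Win_create_dynamic(MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    if (rank == 1) {
        static int data[N];
        MPI_Aint addr;

        for (int i = 0; i < N; i++)
            data[i] = 100 + i;

        /* Attach this region; further non-overlapping regions could be
           attached to the same window in the same way. */
        MPI_Win_attach(win, data, N * sizeof(int));

        /* Tell the origin the absolute address of the region it should read. */
        MPI_Get_address(data, &addr);
        MPI_Send(&addr, 1, MPI_AINT, 0, 0, MPI_COMM_WORLD);

        MPI_Barrier(MPI_COMM_WORLD);   /* keep the region attached while rank 0 reads */
        MPI_Win_detach(win, data);
    } else if (rank == 0) {
        int buf[N];
        MPI_Aint remote_addr;

        MPI_Recv(&remote_addr, 1, MPI_AINT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        MPI_Win_lock(MPI_LOCK_SHARED, 1, 0, win);
        /* For dynamic windows, target_disp is the absolute address at the target. */
        MPI_Get(buf, N, MPI_INT, 1, remote_addr, N, MPI_INT, win);
        MPI_Win_unlock(1, win);

        printf("buf[0] = %d, buf[%d] = %d\n", buf[0], N - 1, buf[N - 1]);
        MPI_Barrier(MPI_COMM_WORLD);
    } else {
        MPI_Barrier(MPI_COMM_WORLD);
    }

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}

In other words, the window handle alone is not enough: with dynamically attached memory the origin has to learn, for example via a message, the address of the region it wants to read.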

Thank you!


-- 
Lokah samasta sukhinobhavanthu

Thanks,
Madhurima


Re: [OMPI users] GCC 4.9 and MPI_F08?

2014-08-14 Thread Christoph Niethammer
Hello,

I just gave gcc 4.9.0 a try and the mpi_f08 module is there, but it seems to be
missing some functions:

mpifort test.f90
/tmp/ccHCEbXC.o: In function `MAIN__':
test.f90:(.text+0x35a): undefined reference to `mpi_win_lock_all_'
test.f90:(.text+0x373): undefined reference to `mpi_win_lock_all_'
test.f90:(.text+0x964): undefined reference to `mpi_win_sync_'
test.f90:(.text+0x978): undefined reference to `mpi_win_sync_'
test.f90:(.text+0xb20): undefined reference to `mpi_win_sync_'
test.f90:(.text+0xb34): undefined reference to `mpi_win_sync_'
test.f90:(.text+0x1772): undefined reference to `mpi_win_unlock_all_'
test.f90:(.text+0x1786): undefined reference to `mpi_win_unlock_all_'
collect2: error: ld returned 1 exit status


Here is also some of the configure log content:

OMPI_BUILD_FORTRAN_USEMPI_IGNORE_TKR_BINDINGS_FALSE='#'
OMPI_BUILD_FORTRAN_USEMPI_IGNORE_TKR_BINDINGS_TRUE=''
OMPI_BUILD_FORTRAN_USEMPI_TKR_BINDINGS_FALSE=''
OMPI_BUILD_FORTRAN_USEMPI_TKR_BINDINGS_TRUE='#'
OMPI_FORTRAN_IGNORE_TKR_PREDECL='!GCC$ ATTRIBUTES NO_ARG_CHECK ::'
OMPI_FORTRAN_IGNORE_TKR_TYPE='type(*), dimension(*)'
OMPI_FORTRAN_USEMPI_DIR='mpi/fortran/use-mpi-ignore-tkr'
OMPI_FORTRAN_USEMPI_LIB='-lmpi_usempi_ignore_tkr'
libmpi_usempi_ignore_tkr_so_version=''
libmpi_usempi_tkr_so_version='4:2:3'
#define OMPI_FORTRAN_IGNORE_TKR_PREDECL "!GCC$ ATTRIBUTES NO_ARG_CHECK ::"
#define OMPI_FORTRAN_IGNORE_TKR_TYPE 
#define OMPI_FORTRAN_HAVE_IGNORE_TKR 1

configure:10267: result: yes (mpif.h, mpi and mpi_f08 modules)
configure:10417: checking which 'use mpi_f08' implementation to use
configure:58804: checking which mpi_f08 implementation to build
configure:58845: checking if building Fortran 'use mpi_f08' bindings
OMPI_F08_SUFFIX='_f08'


Regards
Christoph Niethammer

--

Christoph Niethammer
High Performance Computing Center Stuttgart (HLRS)
Nobelstrasse 19
70569 Stuttgart

Tel: ++49(0)711-685-87203
email: nietham...@hlrs.de
http://www.hlrs.de/people/niethammer



- Original Message -
From: "Marcus G Daniels" 
To: "Jeff Squyres (jsquyres)" , "Open MPI User's List" 

Sent: Thursday, August 14, 2014 8:13:27 AM
Subject: Re: [OMPI users] GCC 4.9 and MPI_F08?

Hi Jeff,

Works for me!

(With mpi_f08, GCC 4.9.1 absolutely insists on getting the finer details right 
on things like MPI_User_function types for MPI_Op_create.  So I'll assume the 
rest of the type checking is just as good, and be glad I took that minor 
detour..)

Thanks,

Marcus

-Original Message-
From: Jeff Squyres (jsquyres) [mailto:jsquy...@cisco.com] 
Sent: Wednesday, August 13, 2014 10:00 AM
To: Open MPI User's List
Cc: Daniels, Marcus G
Subject: Re: [OMPI users] GCC 4.9 and MPI_F08?

Marcus --

The fix was applied yesterday to the v1.8 branch.  Would you mind testing the 
v1.8 nightly tarball from last night, just to make sure it works for you?

http://www.open-mpi.org/nightly/v1.8/



On Aug 12, 2014, at 2:54 PM, Jeff Squyres (jsquyres)  wrote:

> @#$%@#$%#@$%
> 
> I was very confused by this bug report, because in my mind, a) this 
> functionality is on the SVN trunk, and b) we had moved the gcc functionality 
> to the v1.8 branch long ago.
> 
> I just checked the SVN/Trac records:
> 
> 1. I'm right: this functionality *is* on the trunk.  If you build the OMPI 
> SVN trunk with gcc/gfortran 4.9, you get the ignore TKR "mpi" module and the 
> mpi_f08 module.  I just tried it myself to verify this.
> 
> 2. I'm (sorta) right: we CMR'ed the "add the GCC ignore TKR functionality" in 
> https://svn.open-mpi.org/trac/ompi/ticket/4058 for v1.7.5, but it looks like 
> that CMR was botched somehow and the functionality wasn't applied (!) ...even 
> though the log message says it was.  Sad panda.
> 
> I'll open a ticket to track this.  We'll have to see how the RM feels about 
> putting this in at the last second; it may or may not make the cutoff for 
> 1.8.2 (the freeze has already occurred).
> 
> 
> 
> On Aug 12, 2014, at 12:32 PM, Daniels, Marcus G  wrote:
> 
>> Hi Jeff,
>> 
>> On Tue, 2014-08-12 at 16:18 +, Jeff Squyres (jsquyres) wrote:
>>> Can you send the output from configure, the config.log file, and the 
>>> ompi_config.h file?
>> 
>> Attached.  configure.log comes from
>> 
>> (./configure --prefix=/usr/projects/eap/tools/openmpi/1.8.2rc3  2>&1) 
>> > configure.log
>> 
>> Seems fishy that there is no "checking for Fortran compiler support of !
>> GCC$ ATTRIBUTES NO_ARG_CHECK".
>> 
>> checking for Fortran compiler module include flag... -I
>> checking Fortran compiler ignore TKR syntax... not cached; checking variants
>> checking for Fortran compiler support of TYPE(*), DIMENSION(*)... no
>> checking for Fortran compiler support of !DEC$ ATTRIBUTES NO_ARG_CHECK... no
>> checking for Fortran compiler support of !$PRAGMA IGNORE_TKR... no
>> checking for Fortran compiler support of !DIR$ IGNORE_TKR... no
>> checking for Fortran compiler support of !IBM* IGNORE_TKR... no
>> checking Fortran compiler ignore TKR syntax... 0:real:!
>> checking if Fortran

[OMPI users] Segmentation fault in OpenMPI 1.8.1

2014-08-14 Thread Maxime Boissonneault

Hi,
I compiled Charm++ 6.6.0rc3 using
./build charm++ mpi-linux-x86_64 smp --with-production

When compiling the simple example
mpi-linux-x86_64-smp/tests/charm++/simplearrayhello/

I get a segmentation fault that traces back to OpenMPI :
[mboisson@helios-login1 simplearrayhello]$ ./hello
[helios-login1:01813] *** Process received signal ***
[helios-login1:01813] Signal: Segmentation fault (11)
[helios-login1:01813] Signal code: Address not mapped (1)
[helios-login1:01813] Failing at address: 0x30
[helios-login1:01813] [ 0] /lib64/libpthread.so.0[0x381c00f710]
[helios-login1:01813] [ 1] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0xf78f8)[0x7f2cd1f6b8f8]
[helios-login1:01813] [ 2] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0xf8f64)[0x7f2cd1f6cf64]
[helios-login1:01813] [ 3] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(ompi_btl_openib_connect_base_select_for_local_port+0xcf)[0x7f2cd1f672af]
[helios-login1:01813] [ 4] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0xe1ad7)[0x7f2cd1f55ad7]
[helios-login1:01813] [ 5] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_btl_base_select+0x168)[0x7f2cd1f4bf28]
[helios-login1:01813] [ 6] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_bml_r2_component_init+0x11)[0x7f2cd1f4b851]
[helios-login1:01813] [ 7] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_bml_base_init+0x7f)[0x7f2cd1f4a03f]
[helios-login1:01813] [ 8] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0x1e0d17)[0x7f2cd2054d17]
[helios-login1:01813] [ 9] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_pml_base_select+0x3b6)[0x7f2cd20529d6]
[helios-login1:01813] [10] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(ompi_mpi_init+0x4e4)[0x7f2cd1ef0c14]
[helios-login1:01813] [11] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(MPI_Init_thread+0x15d)[0x7f2cd1f1065d]

[helios-login1:01813] [12] ./hello(LrtsInit+0x72)[0x4fcf02]
[helios-login1:01813] [13] ./hello(ConverseInit+0x70)[0x4ff680]
[helios-login1:01813] [14] ./hello(main+0x27)[0x470767]
[helios-login1:01813] [15] 
/lib64/libc.so.6(__libc_start_main+0xfd)[0x381bc1ed1d]

[helios-login1:01813] [16] ./hello[0x470b71]


Does anyone have a clue how to fix this?

Thanks,

--
-
Maxime Boissonneault
Analyste de calcul - Calcul Québec, Université Laval
Ph. D. en physique



Re: [OMPI users] Segmentation fault in OpenMPI 1.8.1

2014-08-14 Thread Maxime Boissonneault

Note that if I do the same build with OpenMPI 1.6.5, it works flawlessly.

Maxime


On 2014-08-14 08:39, Maxime Boissonneault wrote:

Hi,
I compiled Charm++ 6.6.0rc3 using
./build charm++ mpi-linux-x86_64 smp --with-production

When compiling the simple example
mpi-linux-x86_64-smp/tests/charm++/simplearrayhello/

I get a segmentation fault that traces back to OpenMPI :
[mboisson@helios-login1 simplearrayhello]$ ./hello
[helios-login1:01813] *** Process received signal ***
[helios-login1:01813] Signal: Segmentation fault (11)
[helios-login1:01813] Signal code: Address not mapped (1)
[helios-login1:01813] Failing at address: 0x30
[helios-login1:01813] [ 0] /lib64/libpthread.so.0[0x381c00f710]
[helios-login1:01813] [ 1] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0xf78f8)[0x7f2cd1f6b8f8]
[helios-login1:01813] [ 2] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0xf8f64)[0x7f2cd1f6cf64]
[helios-login1:01813] [ 3] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(ompi_btl_openib_connect_base_select_for_local_port+0xcf)[0x7f2cd1f672af]
[helios-login1:01813] [ 4] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0xe1ad7)[0x7f2cd1f55ad7]
[helios-login1:01813] [ 5] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_btl_base_select+0x168)[0x7f2cd1f4bf28]
[helios-login1:01813] [ 6] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_bml_r2_component_init+0x11)[0x7f2cd1f4b851]
[helios-login1:01813] [ 7] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_bml_base_init+0x7f)[0x7f2cd1f4a03f]
[helios-login1:01813] [ 8] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0x1e0d17)[0x7f2cd2054d17]
[helios-login1:01813] [ 9] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_pml_base_select+0x3b6)[0x7f2cd20529d6]
[helios-login1:01813] [10] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(ompi_mpi_init+0x4e4)[0x7f2cd1ef0c14]
[helios-login1:01813] [11] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(MPI_Init_thread+0x15d)[0x7f2cd1f1065d]

[helios-login1:01813] [12] ./hello(LrtsInit+0x72)[0x4fcf02]
[helios-login1:01813] [13] ./hello(ConverseInit+0x70)[0x4ff680]
[helios-login1:01813] [14] ./hello(main+0x27)[0x470767]
[helios-login1:01813] [15] 
/lib64/libc.so.6(__libc_start_main+0xfd)[0x381bc1ed1d]

[helios-login1:01813] [16] ./hello[0x470b71]


Anyone has a clue how to fix this ?

Thanks,




--
-
Maxime Boissonneault
Analyste de calcul - Calcul Québec, Université Laval
Ph. D. en physique



[OMPI users] Running a hybrid MPI+openMP program

2014-08-14 Thread Oscar Mojica

Hello everybody
I am trying to run a hybrid mpi + openmp program in a cluster.  I created a 
queue with 14 machines, each one with 16 cores. The program divides the work 
among the 14 processors with MPI and within each processor a loop is also 
divided into 8 threads for example, using openmp. The problem is that when I 
submit the job to the queue the MPI processes don't divide the work into 
threads and the program prints the number of threads  that are working within 
each process as one. 
I made a simple test program that uses openmp and  I logged in one machine of 
the fourteen. I compiled it using gfortran -fopenmp program.f -o exe,  set the 
OMP_NUM_THREADS environment variable equal to 8  and when I ran directly in the 
terminal the loop was effectively divided among the cores and for example in 
this case the program printed the number of threads equal to 8
This is my Makefile

# Start of the makefile
# Defining variables
objects = inv_grav3d.o funcpdf.o gr3dprm.o fdjac.o dsvd.o
#f90comp = /opt/openmpi/bin/mpif90
f90comp = /usr/bin/mpif90
#switch = -O3
executable = inverse.exe
# Makefile
all : $(executable)
$(executable) : $(objects)
	$(f90comp) -fopenmp -g -O -o $(executable) $(objects)
	rm $(objects)
%.o: %.f
	$(f90comp) -c $<
# Cleaning everything
clean:
	rm $(executable)
#	rm $(objects)
# End of the makefile

and the script that I am using is

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -pe orte 14
#$ -N job
#$ -q new.q

export OMP_NUM_THREADS=8
/usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v -np $NSLOTS ./inverse.exe

am I forgetting something?
Thanks,
Oscar Fabian Mojica Ladino
Geologist M.S. in  Geophysics
  

Re: [OMPI users] Running a hybrid MPI+openMP program

2014-08-14 Thread Jeff Squyres (jsquyres)
I don't know much about OpenMP, but do you need to disable Open MPI's default 
bind-to-core functionality (I'm assuming you're using Open MPI 1.8.x)?

You can try "mpirun --bind-to none ...", which will have Open MPI not bind MPI 
processes to cores, which might allow OpenMP to think that it can use all the 
cores, and therefore it will spawn num_cores threads...?
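(A minimal sketch to check this -- not the original poster's program; sched_getcpu() is Linux-specific. Each MPI rank reports how many OpenMP threads it gets and where they run, so you can compare runs with and without "--bind-to none".)

#define _GNU_SOURCE
#include <stdio.h>
#include <sched.h>
#include <mpi.h>
#include <omp.h>

int main(int argc, char *argv[])
{
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    #pragma omp parallel
    {
        /* one line per thread: which rank, which thread, how many, which cpu */
        #pragma omp critical
        printf("rank %d: thread %d of %d on cpu %d\n",
               rank, omp_get_thread_num(), omp_get_num_threads(), sched_getcpu());
    }

    MPI_Finalize();
    return 0;
}

Built with something like "mpicc -fopenmp check_threads.c -o check_threads" (file name illustrative), its output should make visible both effects discussed in this thread: all threads pinned to one core (binding), or only one thread per rank (OMP_NUM_THREADS not reaching the remote nodes).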


On Aug 14, 2014, at 9:50 AM, Oscar Mojica  wrote:

> 
> Hello everybody
> 
> I am trying to run a hybrid mpi + openmp program in a cluster.  I created a 
> queue with 14 machines, each one with 16 cores. The program divides the work 
> among the 14 processors with MPI and within each processor a loop is also 
> divided into 8 threads for example, using openmp. The problem is that when I 
> submit the job to the queue the MPI processes don't divide the work into 
> threads and the program prints the number of threads  that are working within 
> each process as one. 
> 
> I made a simple test program that uses openmp and  I logged in one machine of 
> the fourteen. I compiled it using gfortran -fopenmp program.f -o exe,  set 
> the OMP_NUM_THREADS environment variable equal to 8  and when I ran directly 
> in the terminal the loop was effectively divided among the cores and for 
> example in this case the program printed the number of threads equal to 8
> 
> This is my Makefile
>  
> # Start of the makefile
> # Defining variables
> objects = inv_grav3d.o funcpdf.o gr3dprm.o fdjac.o dsvd.o
> #f90comp = /opt/openmpi/bin/mpif90
> f90comp = /usr/bin/mpif90
> #switch = -O3
> executable = inverse.exe
> # Makefile
> all : $(executable)
> $(executable) : $(objects)
>   $(f90comp) -fopenmp -g -O -o $(executable) $(objects)
>   rm $(objects)
> %.o: %.f
>   $(f90comp) -c $<
> # Cleaning everything
> clean:
>   rm $(executable) 
> # rm $(objects)
> # End of the makefile
> 
> and the script that i am using is 
> 
> #!/bin/bash
> #$ -cwd
> #$ -j y
> #$ -S /bin/bash
> #$ -pe orte 14
> #$ -N job
> #$ -q new.q
> 
> export OMP_NUM_THREADS=8
> /usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v -np $NSLOTS ./inverse.exe 
> 
> am I forgetting something?
> 
> Thanks,
> 
> Oscar Fabian Mojica Ladino
> Geologist M.S. in  Geophysics


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI users] Segmentation fault in OpenMPI 1.8.1

2014-08-14 Thread Jeff Squyres (jsquyres)
Can you try the latest 1.8.2 rc tarball?  (just released yesterday)

http://www.open-mpi.org/software/ompi/v1.8/



On Aug 14, 2014, at 8:39 AM, Maxime Boissonneault 
 wrote:

> Hi,
> I compiled Charm++ 6.6.0rc3 using
> ./build charm++ mpi-linux-x86_64 smp --with-production
> 
> When compiling the simple example
> mpi-linux-x86_64-smp/tests/charm++/simplearrayhello/
> 
> I get a segmentation fault that traces back to OpenMPI :
> [mboisson@helios-login1 simplearrayhello]$ ./hello
> [helios-login1:01813] *** Process received signal ***
> [helios-login1:01813] Signal: Segmentation fault (11)
> [helios-login1:01813] Signal code: Address not mapped (1)
> [helios-login1:01813] Failing at address: 0x30
> [helios-login1:01813] [ 0] /lib64/libpthread.so.0[0x381c00f710]
> [helios-login1:01813] [ 1] 
> /software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0xf78f8)[0x7f2cd1f6b8f8]
> [helios-login1:01813] [ 2] 
> /software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0xf8f64)[0x7f2cd1f6cf64]
> [helios-login1:01813] [ 3] 
> /software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(ompi_btl_openib_connect_base_select_for_local_port+0xcf)[0x7f2cd1f672af]
> [helios-login1:01813] [ 4] 
> /software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0xe1ad7)[0x7f2cd1f55ad7]
> [helios-login1:01813] [ 5] 
> /software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_btl_base_select+0x168)[0x7f2cd1f4bf28]
> [helios-login1:01813] [ 6] 
> /software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_bml_r2_component_init+0x11)[0x7f2cd1f4b851]
> [helios-login1:01813] [ 7] 
> /software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_bml_base_init+0x7f)[0x7f2cd1f4a03f]
> [helios-login1:01813] [ 8] 
> /software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0x1e0d17)[0x7f2cd2054d17]
> [helios-login1:01813] [ 9] 
> /software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_pml_base_select+0x3b6)[0x7f2cd20529d6]
> [helios-login1:01813] [10] 
> /software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(ompi_mpi_init+0x4e4)[0x7f2cd1ef0c14]
> [helios-login1:01813] [11] 
> /software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(MPI_Init_thread+0x15d)[0x7f2cd1f1065d]
> [helios-login1:01813] [12] ./hello(LrtsInit+0x72)[0x4fcf02]
> [helios-login1:01813] [13] ./hello(ConverseInit+0x70)[0x4ff680]
> [helios-login1:01813] [14] ./hello(main+0x27)[0x470767]
> [helios-login1:01813] [15] 
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x381bc1ed1d]
> [helios-login1:01813] [16] ./hello[0x470b71]
> 
> 
> Anyone has a clue how to fix this ?
> 
> Thanks,
> 
> -- 
> -
> Maxime Boissonneault
> Analyste de calcul - Calcul Québec, Université Laval
> Ph. D. en physique
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI users] Running a hybrid MPI+openMP program

2014-08-14 Thread Reuti
Hi,

Am 14.08.2014 um 15:50 schrieb Oscar Mojica:

> I am trying to run a hybrid mpi + openmp program in a cluster.  I created a 
> queue with 14 machines, each one with 16 cores. The program divides the work 
> among the 14 processors with MPI and within each processor a loop is also 
> divided into 8 threads for example, using openmp. The problem is that when I 
> submit the job to the queue the MPI processes don't divide the work into 
> threads and the program prints the number of threads  that are working within 
> each process as one. 
> 
> I made a simple test program that uses openmp and  I logged in one machine of 
> the fourteen. I compiled it using gfortran -fopenmp program.f -o exe,  set 
> the OMP_NUM_THREADS environment variable equal to 8  and when I ran directly 
> in the terminal the loop was effectively divided among the cores and for 
> example in this case the program printed the number of threads equal to 8
> 
> This is my Makefile
>  
> # Start of the makefile
> # Defining variables
> objects = inv_grav3d.o funcpdf.o gr3dprm.o fdjac.o dsvd.o
> #f90comp = /opt/openmpi/bin/mpif90
> f90comp = /usr/bin/mpif90
> #switch = -O3
> executable = inverse.exe
> # Makefile
> all : $(executable)
> $(executable) : $(objects)
>   $(f90comp) -fopenmp -g -O -o $(executable) $(objects)
>   rm $(objects)
> %.o: %.f
>   $(f90comp) -c $<
> # Cleaning everything
> clean:
>   rm $(executable) 
> # rm $(objects)
> # End of the makefile
> 
> and the script that i am using is 
> 
> #!/bin/bash
> #$ -cwd
> #$ -j y
> #$ -S /bin/bash
> #$ -pe orte 14

What is the output of `qconf -sp orte`?


> #$ -N job
> #$ -q new.q

Looks like you are using SGE. Was the installed Open MPI compiled with
"--with-sge" to achieve a Tight Integration*, and are the processes distributed
to all machines correctly (disregarding the thread issue here, just as a plain
MPI job)?

Note also that in either case the generated $PE_HOSTFILE needs to be adjusted,
as you have to request 14 times 8 cores in total for your computation to avoid
SGE oversubscribing the machines.

-- Reuti

* This will also forward the environment variables to the slave machines. 
Without the Tight Integration there is the option "-x OMP_NUM_THREADS" to 
`mpirun` in Open MPI.


> export OMP_NUM_THREADS=8
> /usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v -np $NSLOTS ./inverse.exe 
> 
> am I forgetting something?
> 
> Thanks,
> 
> Oscar Fabian Mojica Ladino
> Geologist M.S. in  Geophysics



Re: [OMPI users] Running a hybrid MPI+openMP program

2014-08-14 Thread Maxime Boissonneault

Hi,
You DEFINITELY need to disable OpenMPI's new default binding. Otherwise, 
your N threads will run on a single core. --bind-to socket would be my 
recommendation for hybrid jobs.


Maxime


On 2014-08-14 10:04, Jeff Squyres (jsquyres) wrote:

I don't know much about OpenMP, but do you need to disable Open MPI's default 
bind-to-core functionality (I'm assuming you're using Open MPI 1.8.x)?

You can try "mpirun --bind-to none ...", which will have Open MPI not bind MPI 
processes to cores, which might allow OpenMP to think that it can use all the cores, and 
therefore it will spawn num_cores threads...?


On Aug 14, 2014, at 9:50 AM, Oscar Mojica  wrote:


Hello everybody

I am trying to run a hybrid mpi + openmp program in a cluster.  I created a 
queue with 14 machines, each one with 16 cores. The program divides the work 
among the 14 processors with MPI and within each processor a loop is also 
divided into 8 threads for example, using openmp. The problem is that when I 
submit the job to the queue the MPI processes don't divide the work into 
threads and the program prints the number of threads  that are working within 
each process as one.

I made a simple test program that uses openmp and  I logged in one machine of 
the fourteen. I compiled it using gfortran -fopenmp program.f -o exe,  set the 
OMP_NUM_THREADS environment variable equal to 8  and when I ran directly in the 
terminal the loop was effectively divided among the cores and for example in 
this case the program printed the number of threads equal to 8

This is my Makefile
  
# Start of the makefile

# Defining variables
objects = inv_grav3d.o funcpdf.o gr3dprm.o fdjac.o dsvd.o
#f90comp = /opt/openmpi/bin/mpif90
f90comp = /usr/bin/mpif90
#switch = -O3
executable = inverse.exe
# Makefile
all : $(executable)
$(executable) : $(objects)  
$(f90comp) -fopenmp -g -O -o $(executable) $(objects)
rm $(objects)
%.o: %.f
$(f90comp) -c $<
# Cleaning everything
clean:
rm $(executable)
#   rm $(objects)
# End of the makefile

and the script that i am using is

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -pe orte 14
#$ -N job
#$ -q new.q

export OMP_NUM_THREADS=8
/usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v -np $NSLOTS ./inverse.exe

am I forgetting something?

Thanks,

Oscar Fabian Mojica Ladino
Geologist M.S. in  Geophysics





--
-
Maxime Boissonneault
Analyste de calcul - Calcul Québec, Université Laval
Ph. D. en physique



Re: [OMPI users] Segmentation fault in OpenMPI 1.8.1

2014-08-14 Thread Maxime Boissonneault

Hi,
I just tried with 1.8.2rc4 and it does the same:

[mboisson@helios-login1 simplearrayhello]$ ./hello
[helios-login1:11739] *** Process received signal ***
[helios-login1:11739] Signal: Segmentation fault (11)
[helios-login1:11739] Signal code: Address not mapped (1)
[helios-login1:11739] Failing at address: 0x30
[helios-login1:11739] [ 0] /lib64/libpthread.so.0[0x381c00f710]
[helios-login1:11739] [ 1] 
/software-gpu/mpi/openmpi/1.8.2rc4_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0xfa238)[0x7f7166a04238]
[helios-login1:11739] [ 2] 
/software-gpu/mpi/openmpi/1.8.2rc4_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0xfbad4)[0x7f7166a05ad4]
[helios-login1:11739] [ 3] 
/software-gpu/mpi/openmpi/1.8.2rc4_gcc4.8_cuda6.0.37/lib/libmpi.so.1(ompi_btl_openib_connect_base_select_for_local_port+0xcf)[0x7f71669ffddf]
[helios-login1:11739] [ 4] 
/software-gpu/mpi/openmpi/1.8.2rc4_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0xe4773)[0x7f71669ee773]
[helios-login1:11739] [ 5] 
/software-gpu/mpi/openmpi/1.8.2rc4_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_btl_base_select+0x168)[0x7f71669e46a8]
[helios-login1:11739] [ 6] 
/software-gpu/mpi/openmpi/1.8.2rc4_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_bml_r2_component_init+0x11)[0x7f71669e3fd1]
[helios-login1:11739] [ 7] 
/software-gpu/mpi/openmpi/1.8.2rc4_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_bml_base_init+0x7f)[0x7f71669e275f]
[helios-login1:11739] [ 8] 
/software-gpu/mpi/openmpi/1.8.2rc4_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0x1e602f)[0x7f7166af002f]
[helios-login1:11739] [ 9] 
/software-gpu/mpi/openmpi/1.8.2rc4_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_pml_base_select+0x3b6)[0x7f7166aedc26]
[helios-login1:11739] [10] 
/software-gpu/mpi/openmpi/1.8.2rc4_gcc4.8_cuda6.0.37/lib/libmpi.so.1(ompi_mpi_init+0x4e3)[0x7f7166988863]
[helios-login1:11739] [11] 
/software-gpu/mpi/openmpi/1.8.2rc4_gcc4.8_cuda6.0.37/lib/libmpi.so.1(MPI_Init_thread+0x15d)[0x7f71669a86fd]

[helios-login1:11739] [12] ./hello(LrtsInit+0x72)[0x4fcf02]
[helios-login1:11739] [13] ./hello(ConverseInit+0x70)[0x4ff680]
[helios-login1:11739] [14] ./hello(main+0x27)[0x470767]
[helios-login1:11739] [15] 
/lib64/libc.so.6(__libc_start_main+0xfd)[0x381bc1ed1d]

[helios-login1:11739] [16] ./hello[0x470b71]
[helios-login1:11739] *** End of error message



Maxime

On 2014-08-14 10:04, Jeff Squyres (jsquyres) wrote:

Can you try the latest 1.8.2 rc tarball?  (just released yesterday)

 http://www.open-mpi.org/software/ompi/v1.8/



On Aug 14, 2014, at 8:39 AM, Maxime Boissonneault 
 wrote:


Hi,
I compiled Charm++ 6.6.0rc3 using
./build charm++ mpi-linux-x86_64 smp --with-production

When compiling the simple example
mpi-linux-x86_64-smp/tests/charm++/simplearrayhello/

I get a segmentation fault that traces back to OpenMPI :
[mboisson@helios-login1 simplearrayhello]$ ./hello
[helios-login1:01813] *** Process received signal ***
[helios-login1:01813] Signal: Segmentation fault (11)
[helios-login1:01813] Signal code: Address not mapped (1)
[helios-login1:01813] Failing at address: 0x30
[helios-login1:01813] [ 0] /lib64/libpthread.so.0[0x381c00f710]
[helios-login1:01813] [ 1] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0xf78f8)[0x7f2cd1f6b8f8]
[helios-login1:01813] [ 2] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0xf8f64)[0x7f2cd1f6cf64]
[helios-login1:01813] [ 3] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(ompi_btl_openib_connect_base_select_for_local_port+0xcf)[0x7f2cd1f672af]
[helios-login1:01813] [ 4] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0xe1ad7)[0x7f2cd1f55ad7]
[helios-login1:01813] [ 5] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_btl_base_select+0x168)[0x7f2cd1f4bf28]
[helios-login1:01813] [ 6] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_bml_r2_component_init+0x11)[0x7f2cd1f4b851]
[helios-login1:01813] [ 7] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_bml_base_init+0x7f)[0x7f2cd1f4a03f]
[helios-login1:01813] [ 8] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(+0x1e0d17)[0x7f2cd2054d17]
[helios-login1:01813] [ 9] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(mca_pml_base_select+0x3b6)[0x7f2cd20529d6]
[helios-login1:01813] [10] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(ompi_mpi_init+0x4e4)[0x7f2cd1ef0c14]
[helios-login1:01813] [11] 
/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/libmpi.so.1(MPI_Init_thread+0x15d)[0x7f2cd1f1065d]
[helios-login1:01813] [12] ./hello(LrtsInit+0x72)[0x4fcf02]
[helios-login1:01813] [13] ./hello(ConverseInit+0x70)[0x4ff680]
[helios-login1:01813] [14] ./hello(main+0x27)[0x470767]
[helios-login1:01813] [15] 
/lib64/libc.so.6(__libc_start_main+0xfd)[0x381bc1ed1d]
[helios-login1:01813] [16] ./hello[0x470b71]


Anyone has a clue how to fix this ?

Thanks,

--
-
Maxime Boissonneault
Analyste de calcul - Calcul Québec, Université Laval
Ph. D. en physique


[OMPI users] Intermittent, somewhat architecture-dependent hang with Open MPI 1.8.1

2014-08-14 Thread Matt Thompson
Open MPI Users,

I work on a large climate model called GEOS-5 and we've recently managed to
get it to compile with gfortran 4.9.1 (our usual compilers are Intel and
PGI for performance). In doing so, we asked our admins to install Open MPI
1.8.1 as the MPI stack instead of MVAPICH2 2.0 mainly because we figure the
gfortran port is more geared to a desktop.

So, the model builds just fine but when we run it, it stalls in our
"History" component whose job is to write out netCDF files of output. The
odd thing is, though, this stall seems to happen more on our Sandy Bridge
nodes than on our Westmere nodes, but both hang.

A colleague has made a single-file code that emulates our History component
(the MPI traffic part) that we've used to report bugs to MVAPICH and I
asked him to try it with this issue and it seems to duplicate it.

To wit, a "successful" run of the code is:

(1003) $ mpirun -np 96 ./mpi_reproducer.x 4 24
srun.slurm: cluster configuration lacks support for cpu binding
srun.slurm: cluster configuration lacks support for cpu binding
--
WARNING: Open MPI will create a shared memory backing file in a
directory that appears to be mounted on a network filesystem.
Creating the shared memory backup file on a network file system, such
as NFS or Lustre is not recommended -- it may cause excessive network
traffic to your file servers and/or cause shared memory traffic in
Open MPI to be much slower than expected.

You may want to check what the typical temporary directory is on your
node.  Possible sources of the location of this temporary directory
include the $TEMPDIR, $TEMP, and $TMP environment variables.

Note, too, that system administrators can set a list of filesystems
where Open MPI is disallowed from creating temporary files by setting
the MCA parameter "orte_no_session_dir".

  Local host: borg01s026
  Fileame:
 
/gpfsm/dnb31/tdirs/pbs/slurm.2202701.mathomp4/openmpi-sessions-mathomp4@borg01s026_0
/60464/1/shared_mem_pool.borg01s026

You can set the MCA paramter shmem_mmap_enable_nfs_warning to 0 to
disable this message.
--
 nx:4
 ny:   24
 comm size is   96
 local array sizes are  12  12
 filling local arrays
 creating requests
 igather
 before collective wait
 after collective wait
 result is1   1.   1.
 result is2   1.41421354   1.41421354
 result is3   1.73205078   1.73205078
 result is4   2.   2.
 result is5   2.23606801   2.23606801
 result is6   2.44948983   2.44948983
 result is7   2.64575124   2.64575124
 result is8   2.82842708   2.82842708
...snip...
 result is  939   30.6431065   30.6431065
 result is  940   30.6594200   30.6594200
 result is  941   30.6757240   30.6757240
 result is  942   30.6920185   30.6920185
 result is  943   30.7083054   30.7083054
 result is  944   30.7245827   30.7245827
 result is  945   30.7408524   30.7408524

Where the second and third columns of numbers are just the square root of
the first.

But, often, the runs do this (note I'm removing the
shmem_mmap_enable_nfs_warning message for sanity's sake from these copy and
pastes):

(1196) $ mpirun -np 96 ./mpi_reproducer.x 4 24
srun.slurm: cluster configuration lacks support for cpu binding
srun.slurm: cluster configuration lacks support for cpu binding
 nx:4
 ny:   24
 comm size is   96
 local array sizes are  12  12
 filling local arrays
 creating requests
 igather
 before collective wait
 after collective wait
 result is1   1.   1.
 result is2   1.41421354   1.41421354
[borg01w021:09264] 89 more processes have sent help message
help-opal-shmem-mmap.txt / mmap on nfs
[borg01w021:09264] Set MCA parameter "orte_base_help_aggregate" to 0 to see
all help / error messages

where it prints out a few results.

The worst case is most often seen on Sandy Bridge and is the most frequent
failure:

(1197) $ mpirun -np 96 ./mpi_reproducer.x 4 24
srun.slurm: cluster configuration lacks support for cpu binding
srun.slurm: cluster configuration lacks support for cpu binding
 nx:4
 ny:   24
 comm size is   96
 local array sizes are  12  12
 filling local arrays
 creating requests
 igather
 before collective wait
[borg01w021:09367] 89 more processes have sent help message
help-opal-shmem-mmap.txt / mmap on nfs
[borg01w021:09367] Set MCA parameter "orte_base_help_aggregate" to 0 to see
all help / error messages

This hang compares best to our full model code: it halts at much the same
"place", around a collective wait.

Finally, if I setenv OMPI_MCA_orte_base_help_aggreg

Re: [OMPI users] Segmentation fault in OpenMPI 1.8.1

2014-08-14 Thread Joshua Ladd
Hi, Maxime

Just curious, are you able to run a vanilla MPI program? Can you try one
of the example programs in the "examples" subdirectory. Looks like a
threading issue to me.

Thanks,

Josh


Re: [OMPI users] Segmentation fault in OpenMPI 1.8.1

2014-08-14 Thread Maxime Boissonneault

Hi,
I ran gromacs successfully with OpenMPI 1.8.1 and Cuda 6.0.37 on a 
single node, with 8 ranks and multiple OpenMP threads.


Maxime


On 2014-08-14 14:15, Joshua Ladd wrote:

Hi, Maxime

Just curious, are you able to run a vanilla MPI program? Can you try 
one one of the example programs in the "examples" subdirectory. Looks 
like a threading issue to me.


Thanks,

Josh








Re: [OMPI users] Segmentation fault in OpenMPI 1.8.1

2014-08-14 Thread Joshua Ladd
What about between nodes? Since this is coming from the OpenIB BTL, it would
be good to check this.

Do you know what the MPI thread level is set to when used with the Charm++
runtime? Is it MPI_THREAD_MULTIPLE? The OpenIB BTL is not thread safe.
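(A quick way to check, as a hedged C sketch rather than anything from the Charm++ build: request MPI_THREAD_MULTIPLE and print what the library actually grants.)

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int provided;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    /* provided will be lower than MPI_THREAD_MULTIPLE if the build
       does not support full threading */
    printf("provided thread level: %d (MPI_THREAD_MULTIPLE = %d)\n",
           provided, MPI_THREAD_MULTIPLE);
    MPI_Finalize();
    return 0;
}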

Josh


On Thu, Aug 14, 2014 at 2:17 PM, Maxime Boissonneault <
maxime.boissonnea...@calculquebec.ca> wrote:

>  Hi,
> I ran gromacs successfully with OpenMPI 1.8.1 and Cuda 6.0.37 on a single
> node, with 8 ranks and multiple OpenMP threads.
>
> Maxime
>
>
> On 2014-08-14 14:15, Joshua Ladd wrote:
>
>  Hi, Maxime
>
>  Just curious, are you able to run a vanilla MPI program? Can you try one
> one of the example programs in the "examples" subdirectory. Looks like a
> threading issue to me.
>
>  Thanks,
>
>  Josh
>
>
>
>


Re: [OMPI users] Segmentation fault in OpenMPI 1.8.1

2014-08-14 Thread Maxime Boissonneault
I just tried Gromacs with two nodes. It crashes, but with a different 
error. I get

[gpu-k20-13:142156] *** Process received signal ***
[gpu-k20-13:142156] Signal: Segmentation fault (11)
[gpu-k20-13:142156] Signal code: Address not mapped (1)
[gpu-k20-13:142156] Failing at address: 0x8
[gpu-k20-13:142156] [ 0] /lib64/libpthread.so.0(+0xf710)[0x2ac5d070c710]
[gpu-k20-13:142156] [ 1] 
/usr/lib64/nvidia/libcuda.so.1(+0x263acf)[0x2ac5ddfbcacf]
[gpu-k20-13:142156] [ 2] 
/usr/lib64/nvidia/libcuda.so.1(+0x229a83)[0x2ac5ddf82a83]
[gpu-k20-13:142156] [ 3] 
/usr/lib64/nvidia/libcuda.so.1(+0x15b2da)[0x2ac5ddeb42da]
[gpu-k20-13:142156] [ 4] 
/usr/lib64/nvidia/libcuda.so.1(cuInit+0x43)[0x2ac5ddea0933]
[gpu-k20-13:142156] [ 5] 
/software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15965)[0x2ac5d0930965]
[gpu-k20-13:142156] [ 6] 
/software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15a0a)[0x2ac5d0930a0a]
[gpu-k20-13:142156] [ 7] 
/software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15a3b)[0x2ac5d0930a3b]
[gpu-k20-13:142156] [ 8] 
/software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(cudaDriverGetVersion+0x4a)[0x2ac5d094602a]
[gpu-k20-13:142156] [ 9] 
/software-gpu/apps/gromacs/4.6.5_gcc/lib/libgmxmpi.so.8(gmx_print_version_info_gpu+0x55)[0x2ac5cf9a90b5]
[gpu-k20-13:142156] [10] 
/software-gpu/apps/gromacs/4.6.5_gcc/lib/libgmxmpi.so.8(gmx_log_open+0x17e)[0x2ac5cf54b9be]

[gpu-k20-13:142156] [11] mdrunmpi(cmain+0x1cdb)[0x43b4bb]
[gpu-k20-13:142156] [12] 
/lib64/libc.so.6(__libc_start_main+0xfd)[0x2ac5d1534d1d]

[gpu-k20-13:142156] [13] mdrunmpi[0x407be1]
[gpu-k20-13:142156] *** End of error message ***
--
mpiexec noticed that process rank 0 with PID 142156 on node gpu-k20-13 
exited on signal 11 (Segmentation fault).

--



We do not have MPI_THREAD_MULTIPLE enabled in our build, so Charm++ 
cannot be using this level of threading. The configure line for OpenMPI was

./configure --prefix=$PREFIX \
  --with-threads --with-verbs=yes --enable-shared --enable-static \
  --with-io-romio-flags="--with-file-system=nfs+lustre" \
   --without-loadleveler --without-slurm --with-tm \
   --with-cuda=$(dirname $(dirname $(which nvcc)))

Maxime


On 2014-08-14 14:20, Joshua Ladd wrote:
What about between nodes? Since this is coming from the OpenIB BTL, 
would be good to check this.


Do you know what the MPI thread level is set to when used with the 
Charm++ runtime? Is it MPI_THREAD_MULTIPLE? The OpenIB BTL is not 
thread safe.


Josh


On Thu, Aug 14, 2014 at 2:17 PM, Maxime Boissonneault 
> wrote:


Hi,
I ran gromacs successfully with OpenMPI 1.8.1 and Cuda 6.0.37 on a
single node, with 8 ranks and multiple OpenMP threads.

Maxime


On 2014-08-14 14:15, Joshua Ladd wrote:

Hi, Maxime

Just curious, are you able to run a vanilla MPI program? Can you
try one one of the example programs in the "examples"
subdirectory. Looks like a threading issue to me.

Thanks,

Josh














--
-
Maxime Boissonneault
Analyste de calcul - Calcul Québec, Université Laval
Ph. D. en physique



Re: [OMPI users] Segmentation fault in OpenMPI 1.8.1

2014-08-14 Thread Joshua Ladd
Hmmm...weird. Seems like maybe a mismatch between libraries. Did you build
OMPI with the same compiler as you did GROMACS/Charm++?

I'm stealing this suggestion from an old Gromacs forum with essentially the
same symptom:

"Did you compile Open MPI and Gromacs with the same compiler (i.e. both gcc
and the same version)? You write you tried different OpenMPI versions and
different GCC versions but it is unclear whether those match. Can you
provide more detail how you compiled (including all options you specified)?
Have you tested any other MPI program linked against those Open MPI
versions? Please make sure (e.g. with ldd) that the MPI and pthread library
you compiled against is also used for execution. If you compiled and run on
different hosts, check whether the error still occurs when executing on the
build host."

http://redmine.gromacs.org/issues/1025

Josh




On Thu, Aug 14, 2014 at 2:40 PM, Maxime Boissonneault <
maxime.boissonnea...@calculquebec.ca> wrote:

>  I just tried Gromacs with two nodes. It crashes, but with a different
> error. I get
> [gpu-k20-13:142156] *** Process received signal ***
> [gpu-k20-13:142156] Signal: Segmentation fault (11)
> [gpu-k20-13:142156] Signal code: Address not mapped (1)
> [gpu-k20-13:142156] Failing at address: 0x8
> [gpu-k20-13:142156] [ 0] /lib64/libpthread.so.0(+0xf710)[0x2ac5d070c710]
> [gpu-k20-13:142156] [ 1]
> /usr/lib64/nvidia/libcuda.so.1(+0x263acf)[0x2ac5ddfbcacf]
> [gpu-k20-13:142156] [ 2]
> /usr/lib64/nvidia/libcuda.so.1(+0x229a83)[0x2ac5ddf82a83]
> [gpu-k20-13:142156] [ 3]
> /usr/lib64/nvidia/libcuda.so.1(+0x15b2da)[0x2ac5ddeb42da]
> [gpu-k20-13:142156] [ 4]
> /usr/lib64/nvidia/libcuda.so.1(cuInit+0x43)[0x2ac5ddea0933]
> [gpu-k20-13:142156] [ 5]
> /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15965)[0x2ac5d0930965]
> [gpu-k20-13:142156] [ 6]
> /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15a0a)[0x2ac5d0930a0a]
> [gpu-k20-13:142156] [ 7]
> /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15a3b)[0x2ac5d0930a3b]
> [gpu-k20-13:142156] [ 8]
> /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(cudaDriverGetVersion+0x4a)[0x2ac5d094602a]
> [gpu-k20-13:142156] [ 9]
> /software-gpu/apps/gromacs/4.6.5_gcc/lib/libgmxmpi.so.8(gmx_print_version_info_gpu+0x55)[0x2ac5cf9a90b5]
> [gpu-k20-13:142156] [10]
> /software-gpu/apps/gromacs/4.6.5_gcc/lib/libgmxmpi.so.8(gmx_log_open+0x17e)[0x2ac5cf54b9be]
> [gpu-k20-13:142156] [11] mdrunmpi(cmain+0x1cdb)[0x43b4bb]
> [gpu-k20-13:142156] [12]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2ac5d1534d1d]
> [gpu-k20-13:142156] [13] mdrunmpi[0x407be1]
> [gpu-k20-13:142156] *** End of error message ***
> --
> mpiexec noticed that process rank 0 with PID 142156 on node gpu-k20-13
> exited on signal 11 (Segmentation fault).
> --
>
>
>
> We do not have MPI_THREAD_MULTIPLE enabled in our build, so Charm++ cannot
> be using this level of threading. The configure line for OpenMPI was
> ./configure --prefix=$PREFIX \
>   --with-threads --with-verbs=yes --enable-shared --enable-static \
>   --with-io-romio-flags="--with-file-system=nfs+lustre" \
>--without-loadleveler --without-slurm --with-tm \
>--with-cuda=$(dirname $(dirname $(which nvcc)))
>
> Maxime
>
>
> On 2014-08-14 14:20, Joshua Ladd wrote:
>
>  What about between nodes? Since this is coming from the OpenIB BTL,
> would be good to check this.
>
> Do you know what the MPI thread level is set to when used with the Charm++
> runtime? Is it MPI_THREAD_MULTIPLE? The OpenIB BTL is not thread safe.
>
>  Josh
>
>
> On Thu, Aug 14, 2014 at 2:17 PM, Maxime Boissonneault <
> maxime.boissonnea...@calculquebec.ca> wrote:
>
>>  Hi,
>> I ran gromacs successfully with OpenMPI 1.8.1 and Cuda 6.0.37 on a single
>> node, with 8 ranks and multiple OpenMP threads.
>>
>> Maxime
>>
>>
>> On 2014-08-14 14:15, Joshua Ladd wrote:
>>
>>   Hi, Maxime
>>
>>  Just curious, are you able to run a vanilla MPI program? Can you try one
>> one of the example programs in the "examples" subdirectory. Looks like a
>> threading issue to me.
>>
>>  Thanks,
>>
>>  Josh
>>
>>
>>
>>
>>
>>
>>
>>
>
>
>
>
>

Re: [OMPI users] Segmentation fault in OpenMPI 1.8.1

2014-08-14 Thread Maxime Boissonneault

Yes,
Everything has been built with GCC 4.8.x, although x might have changed 
between the OpenMPI 1.8.1 build and the gromacs build. For OpenMPI 
1.8.2rc4 however, it was the exact same compiler for everything.


Maxime

On 2014-08-14 14:57, Joshua Ladd wrote:
Hmmm...weird. Seems like maybe a mismatch between libraries. Did you 
build OMPI with the same compiler as you did GROMACS/Charm++?


I'm stealing this suggestion from an old Gromacs forum with 
essentially the same symptom:


"Did you compile Open MPI and Gromacs with the same compiler (i.e. 
both gcc and the same version)? You write you tried different OpenMPI 
versions and different GCC versions but it is unclear whether those 
match. Can you provide more detail how you compiled (including all 
options you specified)? Have you tested any other MPI program linked 
against those Open MPI versions? Please make sure (e.g. with ldd) that 
the MPI and pthread library you compiled against is also used for 
execution. If you compiled and run on different hosts, check whether 
the error still occurs when executing on the build host."


http://redmine.gromacs.org/issues/1025

Josh




On Thu, Aug 14, 2014 at 2:40 PM, Maxime Boissonneault 
> wrote:


I just tried Gromacs with two nodes. It crashes, but with a
different error. I get
[gpu-k20-13:142156] *** Process received signal ***
[gpu-k20-13:142156] Signal: Segmentation fault (11)
[gpu-k20-13:142156] Signal code: Address not mapped (1)
[gpu-k20-13:142156] Failing at address: 0x8
[gpu-k20-13:142156] [ 0]
/lib64/libpthread.so.0(+0xf710)[0x2ac5d070c710]
[gpu-k20-13:142156] [ 1]
/usr/lib64/nvidia/libcuda.so.1(+0x263acf)[0x2ac5ddfbcacf]
[gpu-k20-13:142156] [ 2]
/usr/lib64/nvidia/libcuda.so.1(+0x229a83)[0x2ac5ddf82a83]
[gpu-k20-13:142156] [ 3]
/usr/lib64/nvidia/libcuda.so.1(+0x15b2da)[0x2ac5ddeb42da]
[gpu-k20-13:142156] [ 4]
/usr/lib64/nvidia/libcuda.so.1(cuInit+0x43)[0x2ac5ddea0933]
[gpu-k20-13:142156] [ 5]
/software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15965)[0x2ac5d0930965]
[gpu-k20-13:142156] [ 6]
/software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15a0a)[0x2ac5d0930a0a]
[gpu-k20-13:142156] [ 7]
/software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15a3b)[0x2ac5d0930a3b]
[gpu-k20-13:142156] [ 8]

/software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(cudaDriverGetVersion+0x4a)[0x2ac5d094602a]
[gpu-k20-13:142156] [ 9]

/software-gpu/apps/gromacs/4.6.5_gcc/lib/libgmxmpi.so.8(gmx_print_version_info_gpu+0x55)[0x2ac5cf9a90b5]
[gpu-k20-13:142156] [10]

/software-gpu/apps/gromacs/4.6.5_gcc/lib/libgmxmpi.so.8(gmx_log_open+0x17e)[0x2ac5cf54b9be]
[gpu-k20-13:142156] [11] mdrunmpi(cmain+0x1cdb)[0x43b4bb]
[gpu-k20-13:142156] [12]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x2ac5d1534d1d]
[gpu-k20-13:142156] [13] mdrunmpi[0x407be1]
[gpu-k20-13:142156] *** End of error message ***
--
mpiexec noticed that process rank 0 with PID 142156 on node
gpu-k20-13 exited on signal 11 (Segmentation fault).
--



We do not have MPI_THREAD_MULTIPLE enabled in our build, so
Charm++ cannot be using this level of threading. The configure
line for OpenMPI was
./configure --prefix=$PREFIX \
  --with-threads --with-verbs=yes --enable-shared
--enable-static \
--with-io-romio-flags="--with-file-system=nfs+lustre" \
   --without-loadleveler --without-slurm --with-tm \
   --with-cuda=$(dirname $(dirname $(which nvcc)))

Maxime


On 2014-08-14 14:20, Joshua Ladd wrote:

What about between nodes? Since this is coming from the OpenIB
BTL, would be good to check this.

Do you know what the MPI thread level is set to when used with
the Charm++ runtime? Is it MPI_THREAD_MULTIPLE? The OpenIB BTL is
not thread safe.

Josh


On Thu, Aug 14, 2014 at 2:17 PM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca> wrote:

Hi,
I ran gromacs successfully with OpenMPI 1.8.1 and Cuda 6.0.37
on a single node, with 8 ranks and multiple OpenMP threads.

Maxime


On 2014-08-14 14:15, Joshua Ladd wrote:

Hi, Maxime

Just curious, are you able to run a vanilla MPI program? Can
you try one of the example programs in the "examples"
subdirectory. Looks like a threading issue to me.

Thanks,

Josh



___ users
mailing list us...@open-mpi.org 
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users

Link to this 
post:http://www.open-mpi.org/community/lists/users/2014/08/25023.php




_

Re: [OMPI users] Segmentation fault in OpenMPI 1.8.1

2014-08-14 Thread Joshua Ladd
Can you try to run the example code "ring_c" across nodes?

Josh
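
If the examples directory is not at hand, a rough stand-in for ring_c along
these lines (a sketch, not the shipped example) exercises the same
point-to-point path between nodes:

/* ring.c -- pass a token once around all ranks; a rough stand-in for
 * examples/ring_c.c.  Run across nodes with something like
 * "mpirun -np 2 --host nodeA,nodeB ./ring" (hostnames are placeholders). */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, token;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2) {
        if (rank == 0)
            printf("run with at least 2 ranks\n");
        MPI_Finalize();
        return 0;
    }

    if (rank == 0) {
        token = 42;                                   /* start the token */
        MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        MPI_Recv(&token, 1, MPI_INT, size - 1, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("token made it around %d ranks\n", size);
    } else {
        MPI_Recv(&token, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        MPI_Send(&token, 1, MPI_INT, (rank + 1) % size, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}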


On Thu, Aug 14, 2014 at 3:14 PM, Maxime Boissonneault <
maxime.boissonnea...@calculquebec.ca> wrote:

>  Yes,
> Everything has been built with GCC 4.8.x, although x might have changed
> between the OpenMPI 1.8.1 build and the gromacs build. For OpenMPI 1.8.2rc4
> however, it was the exact same compiler for everything.
>
> Maxime
>
> On 2014-08-14 14:57, Joshua Ladd wrote:
>
>  Hmmm...weird. Seems like maybe a mismatch between libraries. Did you
> build OMPI with the same compiler as you did GROMACS/Charm++?
>
> I'm stealing this suggestion from an old Gromacs forum with essentially
> the same symptom:
>
> "Did you compile Open MPI and Gromacs with the same compiler (i.e. both
> gcc and the same version)? You write you tried different OpenMPI versions
> and different GCC versions but it is unclear whether those match. Can you
> provide more detail how you compiled (including all options you specified)?
> Have you tested any other MPI program linked against those Open MPI
> versions? Please make sure (e.g. with ldd) that the MPI and pthread library
> you compiled against is also used for execution. If you compiled and run on
> different hosts, check whether the error still occurs when executing on the
> build host."
>
> http://redmine.gromacs.org/issues/1025
>
>  Josh
>
>
>
>
> On Thu, Aug 14, 2014 at 2:40 PM, Maxime Boissonneault <
> maxime.boissonnea...@calculquebec.ca> wrote:
>
>>  I just tried Gromacs with two nodes. It crashes, but with a different
>> error. I get
>> [gpu-k20-13:142156] *** Process received signal ***
>> [gpu-k20-13:142156] Signal: Segmentation fault (11)
>> [gpu-k20-13:142156] Signal code: Address not mapped (1)
>> [gpu-k20-13:142156] Failing at address: 0x8
>> [gpu-k20-13:142156] [ 0] /lib64/libpthread.so.0(+0xf710)[0x2ac5d070c710]
>> [gpu-k20-13:142156] [ 1]
>> /usr/lib64/nvidia/libcuda.so.1(+0x263acf)[0x2ac5ddfbcacf]
>> [gpu-k20-13:142156] [ 2]
>> /usr/lib64/nvidia/libcuda.so.1(+0x229a83)[0x2ac5ddf82a83]
>> [gpu-k20-13:142156] [ 3]
>> /usr/lib64/nvidia/libcuda.so.1(+0x15b2da)[0x2ac5ddeb42da]
>> [gpu-k20-13:142156] [ 4]
>> /usr/lib64/nvidia/libcuda.so.1(cuInit+0x43)[0x2ac5ddea0933]
>> [gpu-k20-13:142156] [ 5]
>> /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15965)[0x2ac5d0930965]
>> [gpu-k20-13:142156] [ 6]
>> /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15a0a)[0x2ac5d0930a0a]
>> [gpu-k20-13:142156] [ 7]
>> /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15a3b)[0x2ac5d0930a3b]
>> [gpu-k20-13:142156] [ 8]
>> /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(cudaDriverGetVersion+0x4a)[0x2ac5d094602a]
>> [gpu-k20-13:142156] [ 9]
>> /software-gpu/apps/gromacs/4.6.5_gcc/lib/libgmxmpi.so.8(gmx_print_version_info_gpu+0x55)[0x2ac5cf9a90b5]
>> [gpu-k20-13:142156] [10]
>> /software-gpu/apps/gromacs/4.6.5_gcc/lib/libgmxmpi.so.8(gmx_log_open+0x17e)[0x2ac5cf54b9be]
>> [gpu-k20-13:142156] [11] mdrunmpi(cmain+0x1cdb)[0x43b4bb]
>> [gpu-k20-13:142156] [12]
>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2ac5d1534d1d]
>> [gpu-k20-13:142156] [13] mdrunmpi[0x407be1]
>> [gpu-k20-13:142156] *** End of error message ***
>> --
>> mpiexec noticed that process rank 0 with PID 142156 on node gpu-k20-13
>> exited on signal 11 (Segmentation fault).
>> --
>>
>>
>>
>> We do not have MPI_THREAD_MULTIPLE enabled in our build, so Charm++
>> cannot be using this level of threading. The configure line for OpenMPI was
>> ./configure --prefix=$PREFIX \
>>   --with-threads --with-verbs=yes --enable-shared --enable-static \
>>   --with-io-romio-flags="--with-file-system=nfs+lustre" \
>>--without-loadleveler --without-slurm --with-tm \
>>--with-cuda=$(dirname $(dirname $(which nvcc)))
>>
>> Maxime
>>
>>
>> On 2014-08-14 14:20, Joshua Ladd wrote:
>>
>>   What about between nodes? Since this is coming from the OpenIB BTL,
>> would be good to check this.
>>
>> Do you know what the MPI thread level is set to when used with the
>> Charm++ runtime? Is it MPI_THREAD_MULTIPLE? The OpenIB BTL is not thread
>> safe.
>>
>>  Josh
>>
>>
>> On Thu, Aug 14, 2014 at 2:17 PM, Maxime Boissonneault <
>> maxime.boissonnea...@calculquebec.ca> wrote:
>>
>>>  Hi,
>>> I ran gromacs successfully with OpenMPI 1.8.1 and Cuda 6.0.37 on a
>>> single node, with 8 ranks and multiple OpenMP threads.
>>>
>>> Maxime
>>>
>>>
>>> On 2014-08-14 14:15, Joshua Ladd wrote:
>>>
>>>   Hi, Maxime
>>>
>>>  Just curious, are you able to run a vanilla MPI program? Can you try
>>> one of the example programs in the "examples" subdirectory. Looks like
>>> a threading issue to me.
>>>
>>>  Thanks,
>>>
>>>  Josh
>>>
>>>
>>>
>>>  ___
>>> users mailing list us...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>> Link to this post: 
>>> 

Re: [OMPI users] Segmentation fault in OpenMPI 1.8.1

2014-08-14 Thread Joshua Ladd
And maybe include your LD_LIBRARY_PATH

Josh


On Thu, Aug 14, 2014 at 3:16 PM, Joshua Ladd  wrote:

> Can you try to run the example code "ring_c" across nodes?
>
> Josh
>
>
> On Thu, Aug 14, 2014 at 3:14 PM, Maxime Boissonneault <
> maxime.boissonnea...@calculquebec.ca> wrote:
>
>>  Yes,
>> Everything has been built with GCC 4.8.x, although x might have changed
>> between the OpenMPI 1.8.1 build and the gromacs build. For OpenMPI 1.8.2rc4
>> however, it was the exact same compiler for everything.
>>
>> Maxime
>>
>> On 2014-08-14 14:57, Joshua Ladd wrote:
>>
>>  Hmmm...weird. Seems like maybe a mismatch between libraries. Did you
>> build OMPI with the same compiler as you did GROMACS/Charm++?
>>
>> I'm stealing this suggestion from an old Gromacs forum with essentially
>> the same symptom:
>>
>> "Did you compile Open MPI and Gromacs with the same compiler (i.e. both
>> gcc and the same version)? You write you tried different OpenMPI versions
>> and different GCC versions but it is unclear whether those match. Can you
>> provide more detail how you compiled (including all options you specified)?
>> Have you tested any other MPI program linked against those Open MPI
>> versions? Please make sure (e.g. with ldd) that the MPI and pthread library
>> you compiled against is also used for execution. If you compiled and run on
>> different hosts, check whether the error still occurs when executing on the
>> build host."
>>
>> http://redmine.gromacs.org/issues/1025
>>
>>  Josh
>>
>>
>>
>>
>> On Thu, Aug 14, 2014 at 2:40 PM, Maxime Boissonneault <
>> maxime.boissonnea...@calculquebec.ca> wrote:
>>
>>>  I just tried Gromacs with two nodes. It crashes, but with a different
>>> error. I get
>>> [gpu-k20-13:142156] *** Process received signal ***
>>> [gpu-k20-13:142156] Signal: Segmentation fault (11)
>>> [gpu-k20-13:142156] Signal code: Address not mapped (1)
>>> [gpu-k20-13:142156] Failing at address: 0x8
>>> [gpu-k20-13:142156] [ 0] /lib64/libpthread.so.0(+0xf710)[0x2ac5d070c710]
>>> [gpu-k20-13:142156] [ 1]
>>> /usr/lib64/nvidia/libcuda.so.1(+0x263acf)[0x2ac5ddfbcacf]
>>> [gpu-k20-13:142156] [ 2]
>>> /usr/lib64/nvidia/libcuda.so.1(+0x229a83)[0x2ac5ddf82a83]
>>> [gpu-k20-13:142156] [ 3]
>>> /usr/lib64/nvidia/libcuda.so.1(+0x15b2da)[0x2ac5ddeb42da]
>>> [gpu-k20-13:142156] [ 4]
>>> /usr/lib64/nvidia/libcuda.so.1(cuInit+0x43)[0x2ac5ddea0933]
>>> [gpu-k20-13:142156] [ 5]
>>> /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15965)[0x2ac5d0930965]
>>> [gpu-k20-13:142156] [ 6]
>>> /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15a0a)[0x2ac5d0930a0a]
>>> [gpu-k20-13:142156] [ 7]
>>> /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15a3b)[0x2ac5d0930a3b]
>>> [gpu-k20-13:142156] [ 8]
>>> /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(cudaDriverGetVersion+0x4a)[0x2ac5d094602a]
>>> [gpu-k20-13:142156] [ 9]
>>> /software-gpu/apps/gromacs/4.6.5_gcc/lib/libgmxmpi.so.8(gmx_print_version_info_gpu+0x55)[0x2ac5cf9a90b5]
>>> [gpu-k20-13:142156] [10]
>>> /software-gpu/apps/gromacs/4.6.5_gcc/lib/libgmxmpi.so.8(gmx_log_open+0x17e)[0x2ac5cf54b9be]
>>> [gpu-k20-13:142156] [11] mdrunmpi(cmain+0x1cdb)[0x43b4bb]
>>> [gpu-k20-13:142156] [12]
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2ac5d1534d1d]
>>> [gpu-k20-13:142156] [13] mdrunmpi[0x407be1]
>>> [gpu-k20-13:142156] *** End of error message ***
>>>
>>> --
>>> mpiexec noticed that process rank 0 with PID 142156 on node gpu-k20-13
>>> exited on signal 11 (Segmentation fault).
>>>
>>> --
>>>
>>>
>>>
>>> We do not have MPI_THREAD_MULTIPLE enabled in our build, so Charm++
>>> cannot be using this level of threading. The configure line for OpenMPI was
>>> ./configure --prefix=$PREFIX \
>>>   --with-threads --with-verbs=yes --enable-shared --enable-static \
>>>   --with-io-romio-flags="--with-file-system=nfs+lustre" \
>>>--without-loadleveler --without-slurm --with-tm \
>>>--with-cuda=$(dirname $(dirname $(which nvcc)))
>>>
>>> Maxime
>>>
>>>
>>> On 2014-08-14 14:20, Joshua Ladd wrote:
>>>
>>>   What about between nodes? Since this is coming from the OpenIB BTL,
>>> would be good to check this.
>>>
>>> Do you know what the MPI thread level is set to when used with the
>>> Charm++ runtime? Is it MPI_THREAD_MULTIPLE? The OpenIB BTL is not thread
>>> safe.
>>>
>>>  Josh
>>>
>>>
>>> On Thu, Aug 14, 2014 at 2:17 PM, Maxime Boissonneault <
>>> maxime.boissonnea...@calculquebec.ca> wrote:
>>>
  Hi,
 I ran gromacs successfully with OpenMPI 1.8.1 and Cuda 6.0.37 on a
 single node, with 8 ranks and multiple OpenMP threads.

 Maxime


 On 2014-08-14 14:15, Joshua Ladd wrote:

   Hi, Maxime

  Just curious, are you able to run a vanilla MPI program? Can you try
 one of the example programs in the "examples" subdirectory. Looks like
 a threading issue to me.
>>>

Re: [OMPI users] Segmentation fault in OpenMPI 1.8.1

2014-08-14 Thread Joshua Ladd
One more, Maxime, can you please make sure you've covered everything here:

http://www.open-mpi.org/community/help/

Josh


On Thu, Aug 14, 2014 at 3:18 PM, Joshua Ladd  wrote:

> And maybe include your LD_LIBRARY_PATH
>
> Josh
>
>
> On Thu, Aug 14, 2014 at 3:16 PM, Joshua Ladd  wrote:
>
>> Can you try to run the example code "ring_c" across nodes?
>>
>> Josh
>>
>>
>> On Thu, Aug 14, 2014 at 3:14 PM, Maxime Boissonneault <
>> maxime.boissonnea...@calculquebec.ca> wrote:
>>
>>>  Yes,
>>> Everything has been built with GCC 4.8.x, although x might have changed
>>> between the OpenMPI 1.8.1 build and the gromacs build. For OpenMPI 1.8.2rc4
>>> however, it was the exact same compiler for everything.
>>>
>>> Maxime
>>>
>>> On 2014-08-14 14:57, Joshua Ladd wrote:
>>>
>>>  Hmmm...weird. Seems like maybe a mismatch between libraries. Did you
>>> build OMPI with the same compiler as you did GROMACS/Charm++?
>>>
>>> I'm stealing this suggestion from an old Gromacs forum with essentially
>>> the same symptom:
>>>
>>> "Did you compile Open MPI and Gromacs with the same compiler (i.e. both
>>> gcc and the same version)? You write you tried different OpenMPI versions
>>> and different GCC versions but it is unclear whether those match. Can you
>>> provide more detail how you compiled (including all options you specified)?
>>> Have you tested any other MPI program linked against those Open MPI
>>> versions? Please make sure (e.g. with ldd) that the MPI and pthread library
>>> you compiled against is also used for execution. If you compiled and run on
>>> different hosts, check whether the error still occurs when executing on the
>>> build host."
>>>
>>> http://redmine.gromacs.org/issues/1025
>>>
>>>  Josh
>>>
>>>
>>>
>>>
>>> On Thu, Aug 14, 2014 at 2:40 PM, Maxime Boissonneault <
>>> maxime.boissonnea...@calculquebec.ca> wrote:
>>>
  I just tried Gromacs with two nodes. It crashes, but with a different
 error. I get
 [gpu-k20-13:142156] *** Process received signal ***
 [gpu-k20-13:142156] Signal: Segmentation fault (11)
 [gpu-k20-13:142156] Signal code: Address not mapped (1)
 [gpu-k20-13:142156] Failing at address: 0x8
 [gpu-k20-13:142156] [ 0] /lib64/libpthread.so.0(+0xf710)[0x2ac5d070c710]
 [gpu-k20-13:142156] [ 1]
 /usr/lib64/nvidia/libcuda.so.1(+0x263acf)[0x2ac5ddfbcacf]
 [gpu-k20-13:142156] [ 2]
 /usr/lib64/nvidia/libcuda.so.1(+0x229a83)[0x2ac5ddf82a83]
 [gpu-k20-13:142156] [ 3]
 /usr/lib64/nvidia/libcuda.so.1(+0x15b2da)[0x2ac5ddeb42da]
 [gpu-k20-13:142156] [ 4]
 /usr/lib64/nvidia/libcuda.so.1(cuInit+0x43)[0x2ac5ddea0933]
 [gpu-k20-13:142156] [ 5]
 /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15965)[0x2ac5d0930965]
 [gpu-k20-13:142156] [ 6]
 /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15a0a)[0x2ac5d0930a0a]
 [gpu-k20-13:142156] [ 7]
 /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(+0x15a3b)[0x2ac5d0930a3b]
 [gpu-k20-13:142156] [ 8]
 /software-gpu/cuda/6.0.37/lib64/libcudart.so.6.0(cudaDriverGetVersion+0x4a)[0x2ac5d094602a]
 [gpu-k20-13:142156] [ 9]
 /software-gpu/apps/gromacs/4.6.5_gcc/lib/libgmxmpi.so.8(gmx_print_version_info_gpu+0x55)[0x2ac5cf9a90b5]
 [gpu-k20-13:142156] [10]
 /software-gpu/apps/gromacs/4.6.5_gcc/lib/libgmxmpi.so.8(gmx_log_open+0x17e)[0x2ac5cf54b9be]
 [gpu-k20-13:142156] [11] mdrunmpi(cmain+0x1cdb)[0x43b4bb]
 [gpu-k20-13:142156] [12]
 /lib64/libc.so.6(__libc_start_main+0xfd)[0x2ac5d1534d1d]
 [gpu-k20-13:142156] [13] mdrunmpi[0x407be1]
 [gpu-k20-13:142156] *** End of error message ***

 --
 mpiexec noticed that process rank 0 with PID 142156 on node gpu-k20-13
 exited on signal 11 (Segmentation fault).

 --



 We do not have MPI_THREAD_MULTIPLE enabled in our build, so Charm++
 cannot be using this level of threading. The configure line for OpenMPI was
 ./configure --prefix=$PREFIX \
   --with-threads --with-verbs=yes --enable-shared --enable-static \
   --with-io-romio-flags="--with-file-system=nfs+lustre" \
--without-loadleveler --without-slurm --with-tm \
--with-cuda=$(dirname $(dirname $(which nvcc)))

 Maxime


 On 2014-08-14 14:20, Joshua Ladd wrote:

   What about between nodes? Since this is coming from the OpenIB BTL,
 would be good to check this.

 Do you know what the MPI thread level is set to when used with the
 Charm++ runtime? Is it MPI_THREAD_MULTIPLE? The OpenIB BTL is not thread
 safe.

  Josh


 On Thu, Aug 14, 2014 at 2:17 PM, Maxime Boissonneault <
 maxime.boissonnea...@calculquebec.ca> wrote:

>  Hi,
> I ran gromacs successfully with OpenMPI 1.8.1 and Cuda 6.0.37 on a
> single node, with 8 ranks and multiple OpenMP

Re: [OMPI users] Running a hybrid MPI+openMP program

2014-08-14 Thread Oscar Mojica
Guys

I changed the line that runs the program in the script to try both options:
/usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v --bind-to-none -np $NSLOTS 
./inverse.exe
/usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v --bind-to-socket -np $NSLOTS 
./inverse.exe

but I got the same results. When I look at 'man mpirun', it shows:

   -bind-to-none, --bind-to-none
  Do not bind processes.  (Default.)

and the output of 'qconf -sp orte' is

pe_name            orte
slots              
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary TRUE

I don't know if the installed Open MPI was compiled with '--with-sge'. How can
I find that out?
Before thinking about a hybrid application, I was using only MPI, and the
program used only a few processors (14). The cluster has 28 machines, 15 with
16 cores and 13 with 8 cores, for a total of 344 processing units. When I
submitted the MPI-only job, the MPI processes were spread across the cores
directly, so I created a new queue with 14 machines to try to gain more time.
The results were the same in both cases. In the last case I could verify that
the processes were distributed to all machines correctly.

What must I do?
Thanks 

Oscar Fabian Mojica Ladino
Geologist M.S. in  Geophysics
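
As an aside, the symptom is easy to reproduce with a tiny hybrid test; the C
sketch below (an analogue of the simple Fortran/OpenMP test described earlier
in this thread, not the actual code) prints how many OpenMP threads each rank
starts. With GNU OpenMP and no OMP_NUM_THREADS set, the default is the number
of cores the process may run on, so a rank bound to a single core typically
reports one thread.

/* hybrid_check.c -- print the OpenMP thread count seen by each MPI rank.
 * Compile with "mpicc -fopenmp hybrid_check.c -o hybrid_check" and compare
 * e.g. "mpirun --bind-to none -np 2 ./hybrid_check" with the default
 * binding. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    #pragma omp parallel
    {
        #pragma omp single
        printf("rank %d is running %d OpenMP thread(s)\n",
               rank, omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}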


> Date: Thu, 14 Aug 2014 10:10:17 -0400
> From: maxime.boissonnea...@calculquebec.ca
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] Running a hybrid MPI+openMP program
> 
> Hi,
> You DEFINITELY need to disable OpenMPI's new default binding. Otherwise, 
> your N threads will run on a single core. --bind-to socket would be my 
> recommendation for hybrid jobs.
> 
> Maxime
> 
> 
> > On 2014-08-14 10:04, Jeff Squyres (jsquyres) wrote:
> > I don't know much about OpenMP, but do you need to disable Open MPI's 
> > default bind-to-core functionality (I'm assuming you're using Open MPI 
> > 1.8.x)?
> >
> > You can try "mpirun --bind-to none ...", which will have Open MPI not bind 
> > MPI processes to cores, which might allow OpenMP to think that it can use 
> > all the cores, and therefore it will spawn num_cores threads...?
> >
> >
> > On Aug 14, 2014, at 9:50 AM, Oscar Mojica  wrote:
> >
> >> Hello everybody
> >>
> >> I am trying to run a hybrid mpi + openmp program in a cluster.  I created 
> >> a queue with 14 machines, each one with 16 cores. The program divides the 
> >> work among the 14 processors with MPI and within each processor a loop is 
> >> also divided into 8 threads for example, using openmp. The problem is that 
> >> when I submit the job to the queue the MPI processes don't divide the work 
> >> into threads and the program prints the number of threads  that are 
> >> working within each process as one.
> >>
> >> I made a simple test program that uses openmp and  I logged in one machine 
> >> of the fourteen. I compiled it using gfortran -fopenmp program.f -o exe,  
> >> set the OMP_NUM_THREADS environment variable equal to 8  and when I ran 
> >> directly in the terminal the loop was effectively divided among the cores 
> >> and for example in this case the program printed the number of threads 
> >> equal to 8
> >>
> >> This is my Makefile
> >>   
> >> # Start of the makefile
> >> # Defining variables
> >> objects = inv_grav3d.o funcpdf.o gr3dprm.o fdjac.o dsvd.o
> >> #f90comp = /opt/openmpi/bin/mpif90
> >> f90comp = /usr/bin/mpif90
> >> #switch = -O3
> >> executable = inverse.exe
> >> # Makefile
> >> all : $(executable)
> >> $(executable) : $(objects) 
> >>$(f90comp) -fopenmp -g -O -o $(executable) $(objects)
> >>rm $(objects)
> >> %.o: %.f
> >>$(f90comp) -c $<
> >> # Cleaning everything
> >> clean:
> >>rm $(executable)
> >> #  rm $(objects)
> >> # End of the makefile
> >>
> >> and the script that i am using is
> >>
> >> #!/bin/bash
> >> #$ -cwd
> >> #$ -j y
> >> #$ -S /bin/bash
> >> #$ -pe orte 14
> >> #$ -N job
> >> #$ -q new.q
> >>
> >> export OMP_NUM_THREADS=8
> >> /usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v -np $NSLOTS ./inverse.exe
> >>
> >> am I forgetting something?
> >>
> >> Thanks,
> >>
> >> Oscar Fabian Mojica Ladino
> >> Geologist M.S. in  Geophysics
> >> ___
> >> users mailing list
> >> us...@open-mpi.org
> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> >> Link to this post: 
> >> http://www.open-mpi.org/community/lists/users/2014/08/25016.php
> >
> 
> 
> -- 
> -
> Maxime Boissonneault
> Analyste de calcul - Calcul Québec, Université Laval
> Ph. D. en physique
> 
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/08/25020.php
  

Re: [OMPI users] GCC 4.9 and MPI_F08?

2014-08-14 Thread Jeff Squyres (jsquyres)
On Aug 14, 2014, at 5:52 AM, Christoph Niethammer  wrote:

> I just gave gcc 4.9.0 a try and the mpi_f09 module

Wow -- that must be 1 better than the mpi_f08 module!

:-p

> is there but it seems to miss some functions:
> 
> mpifort test.f90
> /tmp/ccHCEbXC.o: In function `MAIN__':
> test.f90:(.text+0x35a): undefined reference to `mpi_win_lock_all_'

Turns out that this is not a problem with the mpi_f08 module, per se -- we 
didn't have Fortran bindings (at all) for MPI_WIN_LOCK_ALL, MPI_WIN_UNLOCK_ALL, 
and MPI_WIN_SYNC.  :-(

I just added them to the trunk, and will be adding tests to the test suite 
shortly...
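
For reference, the C bindings for these calls have been there all along; a
minimal sketch of the passive-target pattern the missing Fortran bindings
correspond to (not the reporter's test.f90) looks like this:

/* win_lock_all.c -- minimal passive-target RMA epoch using the calls whose
 * Fortran bindings were missing: MPI_Win_lock_all, MPI_Win_sync,
 * MPI_Win_unlock_all. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, *buf;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Expose one integer per process as an RMA window. */
    MPI_Win_allocate(sizeof(int), sizeof(int), MPI_INFO_NULL,
                     MPI_COMM_WORLD, &buf, &win);

    MPI_Win_lock_all(0, win);     /* shared lock on all targets        */
    *buf = rank;                  /* local store into the window       */
    MPI_Win_sync(win);            /* synchronize private/public copies */
    MPI_Win_unlock_all(win);      /* end the access epoch              */

    if (rank == 0)
        printf("lock_all / sync / unlock_all completed\n");

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}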

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI users] Running a hybrid MPI+openMP program

2014-08-14 Thread Reuti
Hi,

I think this is a broader issue that arises whenever an MPI library is used in
conjunction with threads while running inside a queuing system. First, you can
check whether your actual installation of Open MPI is SGE-aware with:

$ ompi_info | grep grid
 MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.6.5)

Then we can look at the definition of your PE: "allocation_rule $fill_up".
This means that SGE will grant you 14 slots in total in any combination on the
available machines, i.e. an 8+4+2 slot allocation is allowed, as is 4+4+3+3,
and so on. Depending on the SGE-awareness the question is: will your
application just start processes on all nodes and completely disregard the
granted allocation, or, at the other extreme, does it stay on one and the same
machine for all started processes? On the master node of the parallel job you
can issue:

$ ps -e f

(f without a leading -) to check whether `ssh` or `qrsh -inherit ...` is used to
reach the other machines and start the requested number of processes there.


Now to the common problem in such a set up:

AFAICS, for now there is no way in the Open MPI + SGE combination to specify
the number of MPI processes and the intended number of threads in a form that
Open MPI reads automatically while staying inside the granted slot count and
allocation. So it seems necessary for Open MPI to honor the intended number of
threads too.

Hence specifying e.g. "allocation_rule 8" in such a setup while requesting 32
slots would, for now, already start 32 MPI processes, as Open MPI reads the
$PE_HOSTFILE and acts accordingly.

Open MPI would have to read the generated machine file in a slightly different 
way regarding threads: a) read the $PE_HOSTFILE, b) divide the granted slots 
per machine by OMP_NUM_THREADS, c) throw an error in case it's not divisible by 
OMP_NUM_THREADS. Then start one process per quotient.
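
A rough sketch of that division (assuming the usual one-host-per-line
$PE_HOSTFILE format of "hostname slots queue processor-range") could look like
this:

/* pe_hostfile_div.c -- sketch of steps a) to c): read $PE_HOSTFILE, divide
 * the slots granted per host by OMP_NUM_THREADS, and complain if it does
 * not divide evenly.  Prints the number of MPI processes to start per host. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const char *hf  = getenv("PE_HOSTFILE");
    const char *omp = getenv("OMP_NUM_THREADS");
    int nthreads = omp ? atoi(omp) : 1;
    char line[1024], host[256];
    int slots;
    FILE *fp;

    if (nthreads < 1)
        nthreads = 1;
    if (!hf || !(fp = fopen(hf, "r"))) {
        fprintf(stderr, "cannot open $PE_HOSTFILE\n");
        return 1;
    }

    while (fgets(line, sizeof line, fp)) {
        if (sscanf(line, "%255s %d", host, &slots) != 2)
            continue;                              /* skip malformed lines */
        if (slots % nthreads != 0) {
            fprintf(stderr, "%s: %d slots not divisible by %d threads\n",
                    host, slots, nthreads);
            fclose(fp);
            return 1;
        }
        printf("%s: start %d MPI process(es)\n", host, slots / nthreads);
    }

    fclose(fp);
    return 0;
}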

Would this work for you?

-- Reuti

PS: This would also mean having a couple of PEs in SGE with a fixed
"allocation_rule". While this works right now, an extension in SGE could be
"$fill_up_omp"/"$round_robin_omp", using OMP_NUM_THREADS there too; it would
then have to be given not as an `export` in the job script but either on the
command line or inside the job script in #$ lines as a job request. This would
mean collecting slots in bunches of OMP_NUM_THREADS on each machine to reach
the overall specified slot count. Whether OMP_NUM_THREADS or n times
OMP_NUM_THREADS is allowed per machine needs to be discussed.
 
PS2: As Univa SGE can also supply a list of granted cores in the $PE_HOSTFILE, 
it would be an extension to feed this to Open MPI to allow any UGE aware 
binding.


Am 14.08.2014 um 21:52 schrieb Oscar Mojica:

> Guys
> 
> I changed the line that runs the program in the script to try both options:
> /usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v --bind-to-none -np $NSLOTS 
> ./inverse.exe
> /usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v --bind-to-socket -np $NSLOTS 
> ./inverse.exe
> 
> but I got the same results. When I look at 'man mpirun', it shows:
> 
>-bind-to-none, --bind-to-none
>   Do not bind processes.  (Default.)
> 
> and the output of 'qconf -sp orte' is
> 
> pe_name            orte
> slots              
> user_lists         NONE
> xuser_lists        NONE
> start_proc_args    /bin/true
> stop_proc_args     /bin/true
> allocation_rule    $fill_up
> control_slaves     TRUE
> job_is_first_task  FALSE
> urgency_slots      min
> accounting_summary TRUE
> 
> I don't know if the installed Open MPI was compiled with '--with-sge'. How
> can I find that out?
> Before thinking about a hybrid application, I was using only MPI, and the
> program used only a few processors (14). The cluster has 28 machines, 15 with
> 16 cores and 13 with 8 cores, for a total of 344 processing units. When I
> submitted the MPI-only job, the MPI processes were spread across the cores
> directly, so I created a new queue with 14 machines to try to gain more time.
> The results were the same in both cases. In the last case I could verify that
> the processes were distributed to all machines correctly.
> 
> What must I do?
> Thanks 
> 
> Oscar Fabian Mojica Ladino
> Geologist M.S. in  Geophysics
> 
> 
> > Date: Thu, 14 Aug 2014 10:10:17 -0400
> > From: maxime.boissonnea...@calculquebec.ca
> > To: us...@open-mpi.org
> > Subject: Re: [OMPI users] Running a hybrid MPI+openMP program
> > 
> > Hi,
> > You DEFINITELY need to disable OpenMPI's new default binding. Otherwise, 
> > your N threads will run on a single core. --bind-to socket would be my 
> > recommendation for hybrid jobs.
> > 
> > Maxime
> > 
> > 
> > > On 2014-08-14 10:04, Jeff Squyres (jsquyres) wrote:
> > > I don't know much about OpenMP, but do you need to disable Open MPI's 
> > > default bind-to-core functionality (I'm assuming you're using Open MPI 
> > > 1.8.x)?
> > >
> > > You can try "mpirun --bind-to none ...", which will have Open MPI not 
> > > bi