Re: [OMPI users] Questions regarding xpmem

2015-03-17 Thread Tobias Kloeffel

Hello Nathan,

I am using:
IMB 4.0 Update 2
gcc version 4.8.1
Intel compilers 15.0.1 20141023
xpmem from your github

I also tested pwscf (Quantum ESPRESSO), where I observe the same 
behavior: the entire calculation runs without problems, but a few MPI 
processes just stay alive and refuse to die, even with signal 9.
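
A process that survives SIGKILL is almost always stuck in uninterruptible 
sleep inside the kernel. A quick way to inspect a stuck rank (a sketch 
using standard procfs tools; <PID> is a placeholder):

  # 'D' in the STAT column means uninterruptible sleep in the kernel:
  ps -o pid,stat,wchan:30,cmd -p <PID>

  # Dump the task's kernel stack, if the kernel exposes it (needs root):
  cat /proc/<PID>/stack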

Open MPI and pw.x were built with the Intel compilers, xpmem with gcc.
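
For anyone reproducing the comparison described below, vader's single-copy 
mechanism can be forced per run. A sketch (assuming the 1.8-series 
parameter name btl_vader_single_copy_mechanism; the rank count and 
benchmark binary are illustrative):

  # Force xpmem (or cma / knem) for shared-memory transfers:
  mpirun -np 16 --mca btl self,vader \
      --mca btl_vader_single_copy_mechanism xpmem ./IMB-MPI1 PingPong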


Kind regards,
Tobias

On 03/16/2015 05:56 PM, Nathan Hjelm wrote:

What program are you using for the benchmark? Are you using the xpmem
branch in my github? For my testing I used a stock ubuntu 3.13 kernel
but I have not fully stress-tested my xpmem branch.

I will see if I can reproduce and fix the hang.

-Nathan

On Mon, Mar 16, 2015 at 05:32:26PM +0100, Tobias Kloeffel wrote:

Hello everyone,

currently I am benchmarking the different single copy mechanisms
knem/cma/xpmem on a Xeon E5 V3 machine.
I am using openmpi 1.8.4 with the CMA patch for vader.

While it turns out that xpmem is the clear winner (reproducing Nathan
Hjelm's results), I always run into a problem at the MPI finalization
step: at least one process hangs and can't be killed anymore. To get
rid of the hanging process, the server has to be rebooted.

The applications finish successfully.

Unfortunately, I can't find any further development of the xpmem module. Is
this bug known to anyone? What kernel versions do you use?

Any help would be appreciated.

Tested kernel versions:
3.11.25-desktop (openSUSE)
3.18.9 (vanilla)
3.19.1 (vanilla)

--
M.Sc. Tobias Klöffel
===
Interdisciplinary Center for Molecular Materials (ICMM)
and Computer-Chemistry-Center (CCC)
Department Chemie und Pharmazie
Friedrich-Alexander-Universität Erlangen-Nürnberg
Nägelsbachstr. 25
D-91052 Erlangen, Germany

Room: 2.307
Phone: +49 (0) 9131 / 85 - 20421
Fax: +49 (0) 9131 / 85 - 26565

===
Department of Materials Science and Engineering
Institute I: General Materials Properties
Friedrich-Alexander-Universität Erlangen-Nürnberg
Martensstr. 5, D-91058 Erlangen, Germany
Office 3.40
Phone: (+49) 9131 85 27 -486
http://www.gmp.ww.uni-erlangen.de

E-mail: tobias.kloef...@fau.de

___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/03/26479.php


___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/03/26480.php


--
M.Sc. Tobias Klöffel
===
Interdisciplinary Center for Molecular Materials (ICMM)
and Computer-Chemistry-Center (CCC)
Department Chemie und Pharmazie
Friedrich-Alexander-Universität Erlangen-Nürnberg
Nägelsbachstr. 25
D-91052 Erlangen, Germany

Room: 2.307
Phone: +49 (0) 9131 / 85 - 20421
Fax: +49 (0) 9131 / 85 - 26565

===
Department of Materials Science and Engineering
Institute I: General Materials Properties
Friedrich-Alexander-Universität Erlangen-Nürnberg

Martensstr. 5, D-91058 Erlangen, Germany
Office 3.40
Phone: (+49) 9131 85 27 -486
http://www.gmp.ww.uni-erlangen.de

E-mail: tobias.kloef...@fau.de



[OMPI users] monitoring the status of processors

2015-03-17 Thread etcamargo

Hi, All

I would like to know if there is an (MPI) tool for monitoring the status 
of a processor (and its cores) at runtime, i.e., while I am running an 
MPI application.


Let's suppose that some physical processors become overloaded while an 
MPI application is running. I am looking for a way to know which 
processors are "busy" or "slow".


Thanks in advance!

Edson


Re: [OMPI users] monitoring the status of processors

2015-03-17 Thread Ralph Castain
Not at the moment - at least, not integrated into OMPI at this time. We used to 
have sensors for such purposes in the OMPI code itself, but they weren’t used 
and so we removed them.

The resource manager generally does keep track of such things - see for example 
ORCM:

https://github.com/open-mpi/orcm/wiki 

Some of us are working on an extended version of PMI (called PMIx) that will 
include support for requesting such info from the resource manager in its 
upcoming version 2.0 release (sometime this summer). So that might help, and 
would be portable across environments.

https://github.com/open-mpi/pmix/wiki 


> On Mar 17, 2015, at 7:38 AM, etcamargo  wrote:
> 
> Hi, All
> 
> I would like to know if there is an (MPI) tool for monitoring the status of a 
> processor (and its cores) at runtime, i.e., while I am running an MPI 
> application.
> 
> Let's suppose that some physical processors become overloaded while an MPI 
> application is running. I am looking for a way to know which processors are 
> "busy" or "slow".
> 
> Thanks in advance!
> 
> Edson
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/03/26484.php



Re: [OMPI users] monitoring the status of processors

2015-03-17 Thread Damien

Ganglia might help:

http://ganglia.sourceforge.net/

Could be too high-level though.
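
For something lighter-weight, per-core load can also be sampled with stock 
tools while the job runs. A sketch (assumes the sysstat package on each 
node; the -np value is illustrative):

  # Per-core utilization on the local node, sampled every 2 seconds:
  mpstat -P ALL 2

  # Spot-check the load average on every node of the allocation,
  # one process per node, using mpirun itself:
  mpirun --map-by node -np 4 uptime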

Damien

On 2015-03-17 8:59 AM, Ralph Castain wrote:
Not at the moment - at least, not integrated into OMPI at this time. 
We used to have sensors for such purposes in the OMPI code itself, but 
they weren’t used and so we removed them.


The resource manager generally does keep track of such things - see 
for example ORCM:


https://github.com/open-mpi/orcm/wiki

Some of us are working on an extended version of PMI (called PMIx) 
that will include support for requesting such info from the resource 
manager in its upcoming version 2.0 release (sometime this summer). So 
that might help, and would be portable across environments.


https://github.com/open-mpi/pmix/wiki


On Mar 17, 2015, at 7:38 AM, etcamargo wrote:


Hi, All

I would like to know if there is an (MPI) tool for monitoring the 
status of a processor (and its cores) at runtime, i.e., while I am 
running an MPI application.


Let's suppose that some physical processors become overloaded while an 
MPI application is running. I am looking for a way to know which 
processors are "busy" or "slow".


Thanks in advance!

Edson




___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/03/26485.php




Re: [OMPI users] Questions regarding xpmem

2015-03-17 Thread Nathan Hjelm

I was able to reproduce the issue on ubuntu with a 3.13 kernel. I think
I know what is going wrong and I am working on a fix.

-Nathan

On Tue, Mar 17, 2015 at 12:02:43PM +0100, Tobias Kloeffel wrote:
>Hello Nathan,
> 
>I am using:
>IMB 4.0 Update 2
>gcc version 4.8.1
>Intel compilers 15.0.1 20141023
>xpmem from your github
> 
>I also tested pwscf (Quantum ESPRESSO), where I observe the same
>behavior: the entire calculation runs without problems, but a few MPI
>processes just stay alive and refuse to die, even with signal 9.
>Open MPI and pw.x were built with the Intel compilers, xpmem with gcc.
> 
>Kind regards,
>Tobias
> 
>On 03/16/2015 05:56 PM, Nathan Hjelm wrote:
> 
>  What program are you using for the benchmark? Are you using the xpmem
>  branch in my github? For my testing I used a stock ubuntu 3.13 kernel
>  but I have not fully stress-tested my xpmem branch.
> 
>  I will see if I can reproduce and fix the hang.
> 
>  -Nathan
> 
>  On Mon, Mar 16, 2015 at 05:32:26PM +0100, Tobias Kloeffel wrote:
> 
>  Hello everyone,
> 
>  currently I am benchmarking the different single copy mechanisms
>  knem/cma/xpmem on a Xeon E5 V3 machine.
>  I am using openmpi 1.8.4 with the CMA patch for vader.
> 
>  While it turns out that xpmem is the clear winner (reproducing Nathan
>  Hjelm's results), I always run into a problem at the MPI finalization
>  step: at least one process hangs and can't be killed anymore. To get
>  rid of the hanging process, the server has to be rebooted.
> 
>  The applications finish successfully.
> 
>  Unfortunately, I can't find any further development of the xpmem module. Is
>  this bug known to anyone? What kernel versions do you use?
> 
>  Any help would be appreciated.
> 
>  Tested kernel versions:
>  3.11.25-desktop (openSUSE)
>  3.18.9 (vanilla)
>  3.19.1 (vanilla)
> 

> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/03/26483.php





[OMPI users] openmpi 1.8 error

2015-03-17 Thread Ahmed Salama
When I configure openmpi-1.8.2, e.g.

  $ ./configure --enable-mpi-java --with-jdk-bindir=/usr/jdk6/bin \
        --with-jdk-headers=/usr/jdk6/include --prefix=/usr/openmpi8

the configure step does not complete as expected; at the end of configure I see:

  config.status: executing depfiles commands
  config.status: executing opal/mca/event/libevent2021/libevent/include/event2/event-config.h commands
  opal/mca/event/libevent2021/libevent/include/event2/event-config.h is unchanged
  config.status: executing libtool commands

Then, when I run make all install, I get the following error at the end:
      return getData(buffer, index);
             ^
  ../../../../ompi/mpi/java/java/Struct.java:722: type parameters of T cannot be determined; no unique maximal instance exists for type variable T with upper bounds D,mpi.Struct.Data
      return s.newData(buffer, offset + field);
             ^
  ../../../../ompi/mpi/java/java/Struct.java:737: type parameters of T cannot be determined; no unique maximal instance exists for type variable T with upper bounds D,mpi.Struct.Data
      return s.newData(buffer, offset + field + index * s.extent);
             ^
  6 errors
  make[3]: *** [mpi/MPI.class] Error 1
  make[3]: Leaving directory `/usr/openmpi-1.7.5/ompi/mpi/java/java'
  make[2]: *** [all-recursive] Error 1
  make[2]: Leaving directory `/usr/openmpi-1.7.5/ompi/mpi/java'
  make[1]: *** [all-recursive] Error 1
  make[1]: Leaving directory `/usr/openmpi-1.7.5/ompi'
  make: *** [all-recursive] Error 1
How can I solve this problem?
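
The two javac messages above are a generics-inference failure that the 
JDK 6 compiler is known to hit on code newer compilers accept. A first 
check (a sketch; the JDK 7 paths below are illustrative, not a known 
location on this system):

  # Confirm which compiler the build is actually using:
  /usr/jdk6/bin/javac -version

  # If a newer JDK is available, point configure at it instead:
  ./configure --enable-mpi-java \
      --with-jdk-bindir=/usr/lib/jvm/java-7-openjdk/bin \
      --with-jdk-headers=/usr/lib/jvm/java-7-openjdk/include \
      --prefix=/usr/openmpi8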


Re: [OMPI users] how to compile without ib support

2015-03-17 Thread Ahmed Salama

From: Jeff Squyres (jsquyres)
To: "t...@riseup.net"; Open MPI User's List
Sent: Monday, 9 March 2015, 18:53:17
Subject: Re: [OMPI users] how to compile without ib support
On Mar 9, 2015, at 12:19 PM, Tus  wrote:
> 
> I configured and installed 1.8.4 on my system. I was getting OpenFabrics
> errors and started to specify -mca btl ^openib, which works but is very
> slow.
> 
> I would like to compile again, excluding OpenFabrics/IB support. I do
> have a fast 10GbE network in addition to the 1G net. What flags are needed
> to ignore IB support, and how can I verify/force Open MPI to use my 10GbE net?

You have several options:

1. If you did a default Open MPI install, you can simply rm the 
"mca_btl_openib.so" plugin that was installed under $prefix/lib/openmpi.  Open 
MPI then won't know that that plugin exists, and it won't try to use 
OpenFabrics-based devices.

2. You can set a system-wide MCA parameter to ignore the openib BTL.  That way, 
you don't have to type it on the mpirun command line every time (see the sketch 
after this list).

3. You can rebuild Open MPI with the --without-verbs configure command line 
switch.  This will ultimately have the same effect as #1 (i.e., the openib 
plugin won't be in the installation tree).
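
A minimal sketch of option 2 ($prefix is wherever Open MPI was installed; 
the file follows the standard $prefix/etc convention):

  # Append to the system-wide MCA parameter file:
  echo 'btl = ^openib' >> $prefix/etc/openmpi-mca-params.conf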

As for using your 10G, I assume you mean over TCP sockets, right?

If so, you can use --mca btl_tcp_if_include <device> or 
<CIDR subnet>.  E.g.:

  mpirun --mca btl_tcp_if_include eth1 ...
  mpirun --mca btl_tcp_if_include 10.20.30.0/24 ...
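
To confirm the result of any of the options above (a sketch; ompi_info 
ships with Open MPI):

  # No output means the openib BTL is no longer installed/visible:
  ompi_info | grep openib

  # Show the TCP BTL's parameters, including btl_tcp_if_include:
  ompi_info --param btl tcp --level 9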

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/03/26445.php




Re: [OMPI users] Questions regarding xpmem

2015-03-17 Thread Nathan Hjelm

It appears Cray solved the issue a while ago. I re-imported from the
latest version I have from Cray and re-applied my patches. The new
version has been pushed up to github. It appears to be stable enough
for testing, but there may be outstanding bugs. I will spend some time
over the next couple of weeks testing the updated code.
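
A minimal way to pick up the update and re-test (a sketch; the exact 
module path depends on how your xpmem tree is laid out and built):

  cd xpmem && git pull && make
  sudo rmmod xpmem             # unload the old module, if it unloads cleanly
  sudo insmod kernel/xpmem.ko  # path to the rebuilt module may differ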

-Nathan

On Tue, Mar 17, 2015 at 12:02:43PM +0100, Tobias Kloeffel wrote:
>Hello Nathan,
> 
>I am using:
>IMB 4.0 Update 2
>gcc version 4.8.1
>Intel compilers 15.0.1 20141023
>xpmem from your github
> 
>I also tested pwscf (Quantum ESPRESSO), where I observe the same
>behavior: the entire calculation runs without problems, but a few MPI
>processes just stay alive and refuse to die, even with signal 9.
>Open MPI and pw.x were built with the Intel compilers, xpmem with gcc.
> 
>Kind regards,
>Tobias
> 
>On 03/16/2015 05:56 PM, Nathan Hjelm wrote:
> 
>  What program are you using for the benchmark? Are you using the xpmem
>  branch in my github? For my testing I used a stock ubuntu 3.13 kernel
>  but I have not fully stress-tested my xpmem branch.
> 
>  I will see if I can reproduce and fix the hang.
> 
>  -Nathan
> 
>  On Mon, Mar 16, 2015 at 05:32:26PM +0100, Tobias Kloeffel wrote:
> 
>  Hello everyone,
> 
>  currently I am benchmarking the different single copy mechanisms
>  knem/cma/xpmem on a Xeon E5 V3 machine.
>  I am using openmpi 1.8.4 with the CMA patch for vader.
> 
>  While it turns out that xpmem is the clear winner (reproducing Nathan
>  Hjelm's results), I always run into a problem at the MPI finalization
>  step: at least one process hangs and can't be killed anymore. To get
>  rid of the hanging process, the server has to be rebooted.
> 
>  The applications finish successfully.
> 
>  Unfortunately, I can't find any further development of the xpmem module. Is
>  this bug known to anyone? What kernel versions do you use?
> 
>  Any help would be appreciated.
> 
>  Tested kernel versions:
>  3.11.25-desktop (openSUSE)
>  3.18.9 (vanilla)
>  3.19.1 (vanilla)
> 





[OMPI users] Configuration error with external hwloc

2015-03-17 Thread Peter Gottesman
Hey all,
I am trying to compile Open MPI on a 32-bit laptop running Debian Wheezy
7.8.0. When I run

> ../ompi-master/configure --prefix=$HOME/ompi-master/build
> --with-hwloc=$HOME/openmpi/hwloc/build
> --with-hwloc-libdir=$HOME/openmpi/hwloc/build/lib

I get the following error:

> checking whether we are cross compiling... configure: error: in
> `/home/peter/openmpi/build/opal/mca/event/libevent2022/libevent':
> configure: error: cannot run C compiled programs.
> If you meant to cross compile, use `--host'.
> See `config.log' for more details
> configure: /bin/bash
> '../../../../../../ompi-master/opal/mca/event/libevent2022/libevent/configure'
> *failed* for opal/mca/event/libevent2022/libevent
> configure: WARNING: Event library failed to configure
> configure: error: Cannot continue

I have looked at a previous message in this mailing list, and I have a
working compiler, so I do not believe that is the problem here. Any
help is appreciated.
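
One way to see exactly why the libevent sub-configure's test binary won't 
run (a sketch; the directory comes from the error above, and conftest.c is 
a throwaway file):

  cd /home/peter/openmpi/build/opal/mca/event/libevent2022/libevent
  echo 'int main(void){return 0;}' > conftest.c
  gcc conftest.c -o conftest && ./conftest && echo OK

  # If ./conftest fails to run, config.log in this directory usually names
  # the missing shared library; an external hwloc libdir sometimes needs:
  export LD_LIBRARY_PATH=$HOME/openmpi/hwloc/build/lib:$LD_LIBRARY_PATH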
Thanks,
Peter