Re: [OMPI users] MPI hangs when application compiled with -O3, runs fine with -O0
Sorry for the massive delay in replying; I'm going through my inbox this morning and finding old mails that I initially missed. :-\ More below. On Jan 17, 2014, at 8:45 AM, Julien Bodart wrote: > version: 1.6.5 (compiled with Intel compilers) > > command used: > mpirun --machinefile mfile --debug-daemons -np 16 myapp > > Description of the problem: > When myapp is compiled without optimizations everything runs fine > if compiled with -O3, then the application hangs. I cannot reproduce the > problem with a hello world test. -O3 is notorious for both: - exposing compiler bugs - exposing application bugs I just ran across a -O3 compiler bug in gcc 4.8.1 yesterday. But more often, it reveals bugs in the application -- something you thought was safe/correct, but actually isn't, and when a compiler makes an aggressive optimization around it, the Badness is revealed. > when using --debug-daemons I see the following behavior (PATHTOAPPLICATION= > my path to the application) I'm not quite sure from the context of your mail: are you saying that the difference is whether you compile *your application* with -O3 vs. -O0, or whether you compile *Open MPI* with -O3 vs. -O0? I'd also suggest ensuring that you have the latest release of the Intel compiler for your series. The Intel compiler is like every other piece of software: it has bugs. If you have Intel compiler 12, for example, ensure you have the latest version of the Intel compiler 12 so that you get all the latest bug fixes, etc. We've seen this kind of thing many times from the different compiler suites. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
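To make that failure mode concrete, here is a contrived sketch (not the poster's code, and not necessarily what is happening here) of an application bug that often shows up exactly as "works at -O0, hangs at -O3": spinning on a shared flag that is not volatile/atomic. At -O3 the compiler may hoist the load out of the loop and spin forever.

    /* race.c -- hypothetical example: a data race that -O3 can turn into a hang.
     * Build with:  gcc -O3 -pthread race.c   (at -O0 it usually "works") */
    #include <pthread.h>
    #include <stdio.h>

    static int done = 0;   /* should be volatile sig_atomic_t or a C11 atomic */

    static void *worker(void *arg)
    {
        (void)arg;
        done = 1;          /* store that the main loop may never observe */
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, worker, NULL);
        while (!done)      /* at -O3 the compiler may turn this into while (1) */
            ;              /* no atomic access, no compiler barrier */
        pthread_join(t, NULL);
        printf("finished\n");
        return 0;
    }

The same reasoning applies to strict-aliasing violations, out-of-bounds reads, and other undefined behavior: the code is wrong at every optimization level, but only the aggressive one makes it visible.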
Re: [OMPI users] Implementation of TCP v/s OpenIB (Eager and Rendezvous) protocols
On Jan 31, 2014, at 2:49 AM, Siddhartha Jana wrote: > Sorry for the typo: > ** I was hoping to understand the impact of OpenMPI's implementation of > these protocols using traditional TCP. > > This is the paper I was referring to: > Woodall, et al., "High Performance RDMA Protocols in HPC". > > > On 31 January 2014 00:43, Siddhartha Jana wrote: > Good evening > Is there any documentation describing the difference in MPI-level > implementation of the eager and rendezvous protocols in OpenIB BTL versus TCP > BTL ? Unfortunately, there is not, sorry. Just the code. :-\ > I am only aware of the following paper. While this presents an excellent > overview of how RDMA capabilities of modern interconnects can be leveraged > for implementing these protocols, I was hoping to understand how OpenMPI > implications of handling these protocols using traditional TCP. The easiest way to think about it is that the TCP BTL could well be implemented with just the "send" method (and no "get" or "put" methods). That being said, the TCP BTL does emulate the "put" method (meaning: there's obviously no hardware support for a direct data placement using a general socket in TCP like there is with OpenFabrics-style RDMA) simply because it allows us to be slightly more efficient on the receiver (IIRC; it's been a long time since I've looked at that code). -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
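As a side note for readers trying to get a feel for the two protocols at the application level: the sketch below (illustrative only, not from this thread) usually completes for small messages, which fit under the BTL's eager limit and are buffered at the receiver, but deadlocks for large ones, which require a rendezvous handshake. The crossover size is governed by per-BTL MCA parameters (btl_tcp_eager_limit for the TCP BTL, if I remember the name correctly).

    /* eager_vs_rendezvous.c -- illustrative only; run with exactly 2 ranks.
     * Both ranks send before receiving: this only "works" if MPI buffers
     * the message, i.e. if it goes out via the eager protocol. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Try e.g. 1024 bytes (eager) vs. 4194304 bytes (rendezvous). */
        int n = (argc > 1) ? atoi(argv[1]) : 1024;
        char *buf = malloc(n);
        memset(buf, 0, n);
        int peer = 1 - rank;   /* assumes exactly 2 ranks */

        /* Under the eager limit this typically returns right away; above it,
         * both sends wait for the receiver to post a matching receive, and
         * the program hangs -- same code, different protocol. */
        MPI_Send(buf, n, MPI_CHAR, peer, 0, MPI_COMM_WORLD);
        MPI_Recv(buf, n, MPI_CHAR, peer, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        if (rank == 0) printf("completed with n = %d bytes\n", n);
        free(buf);
        MPI_Finalize();
        return 0;
    }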
Re: [OMPI users] Running on two nodes slower than running on one node
Thank you all for your help. --bind-to-core increased the cluster performance by approximately 10%, so in addition to the improvements through the implementation of Open-MX, the performance now scales within expectations - not linear, but much better than with the original setup. On 30 January 2014 20:43, Tim Prince wrote: > > On 1/29/2014 11:30 PM, Ralph Castain wrote: > > > On Jan 29, 2014, at 7:56 PM, Victor wrote: > > Thanks for the insights Tim. I was aware that the CPUs will choke beyond > a certain point. From memory on my machine this happens with 5 concurrent > MPI jobs with that benchmark that I am using. > > My primary question was about scaling between the nodes. I was not > getting close to double the performance when running MPI jobs acros two 4 > core nodes. It may be better now since I have Open-MX in place, but I have > not repeated the benchmarks yet since I need to get one simulation job done > asap. > > > Some of that may be due to expected loss of performance when you switch > from shared memory to inter-node transports. While it is true about > saturation of the memory path, what you reported could be more consistent > with that transition - i.e., it isn't unusual to see applications perform > better when run on a single node, depending upon how they are written, up > to a certain size of problem (which your code may not be hitting). > > > Regarding your mention of setting affinities and MPI ranks do you have a > specific (as in syntactically specific since I am a novice and easily > confused...) examples how I may want to set affinities to get the Westmere > node performing better? > > > mpirun --bind-to-core -cpus-per-rank 2 ... > > will bind each MPI rank to 2 cores. Note that this will definitely *not* > be a good idea if you are running more than two threads in your process - > if you are, then set --cpus-per-rank to the number of threads, keeping in > mind that you want things to break evenly across the sockets. In other > words, if you have two 6 core/socket Westmere's on the node, then you > either want to run 6 process at cpus-per-rank=2 if each process runs 2 > threads, or 4 processes with cpus-per-rank=3 if each process runs 3 > threads, or 2 processes with no cpus-per-rank but --bind-to-socket instead > of --bind-to-core for any other thread number > 3. > > You would not want to run any other number of processes on the node or > else the binding pattern will cause a single process to split its threads > across the sockets - which will definitely hurt performance. > > > -cpus-per-rank 2 is an effective choice for this platform. As Ralph > said, it should work automatically for 2 threads per rank. > Ralph's point about not splitting a process across sockets is an important > one. Even splitting a process across internal busses, which would happen > with 3 threads per process, seems problematical. > > -- > Tim Prince > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
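To verify that a given --bind-to-core / -cpus-per-rank combination actually produces the intended layout, mpirun's --report-bindings option prints the binding chosen for each rank (it exists in these Open MPI versions, as far as I recall). Alternatively, a tiny hybrid test like the sketch below (hypothetical, Linux-only because of sched_getcpu) shows where each rank's threads actually land.

    /* bindcheck.c -- hypothetical helper: print the core each thread runs on.
     * Build with:  mpicc -fopenmp bindcheck.c
     * Run e.g.:    OMP_NUM_THREADS=2 mpirun --bind-to-core -cpus-per-rank 2 -np 6 ./a.out */
    #define _GNU_SOURCE
    #include <mpi.h>
    #include <omp.h>
    #include <sched.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided, rank;
        /* The OpenMP threads never call MPI here, so FUNNELED is sufficient. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        #pragma omp parallel
        {
            printf("rank %d, thread %d -> core %d\n",
                   rank, omp_get_thread_num(), sched_getcpu());
        }

        MPI_Finalize();
        return 0;
    }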
Re: [OMPI users] Compiling OpenMPI with PGI pgc++
Hi Reuti, sorry, but I don't know the details of the issue. Although the error is reported by OpenMPI configure as pgc++ not being link compatible with pgcc, the error in the config.log is a compiler error. So I don't think that this is a linking issue. > When I get it right, it should be a feature of `pgc++` to be link compatible > with `gcc`, while `pgcpp` links with `pgcc` objects. To my understanding, C compilers like gcc and pgcc do not have the linking issues that C++ compilers have; they simply follow the ABI of the OS and do not have name mangling issues. So most C compilers are link compatible. And you can link C code from C++; the C++ compiler just needs to know that something is C and not C++ (extern "C" is used for this). Because of function overloading, templates, and maybe other things I am not aware of, C++ needs name mangling to encode this information. And here pgcpp and pgc++ implement different ABIs and name mangling schemes: pgc++ implements the same ABI as g++ and thus is link compatible with g++, while pgcpp implements its own ABI and is only compatible with itself. Jiri Sent from my Nexus 7, I apologize for spelling errors and auto correction typos. -Original Message- Date: Fri, 31 Jan 2014 22:50:40 +0100 From: Reuti To: Open MPI Users Subject: Re: [OMPI users] Compiling OpenMPI with PGI pgc++ Message-ID: Content-Type: text/plain; charset=us-ascii Hi, Am 31.01.2014 um 18:59 schrieb Jiri Kraus: > Thanks for taking a look. I just learned from PGI that this is a known bug > that will be fixed in the 14.2 release (February 2014). Will `pgc++` then link to `gcc` or `pgcc`? When I get it right, it should be a feature of `pgc++` to be link compatible with `gcc`, while `pgcpp` links with `pgcc` objects. -- Reuti > Thanks > > Jiri > >> -Original Message- >> Date: Wed, 29 Jan 2014 18:12:46 + >> From: "Jeff Squyres (jsquyres)" >> To: Open MPI Users >> Subject: Re: [OMPI users] Compiling OpenMPI with PGI pgc++ >> Message-ID: <556094df-cd27-4908-aec1-a6ad9efb6...@cisco.com> >> Content-Type: text/plain; charset="us-ascii" >> >> On Jan 29, 2014, at 12:35 PM, Reuti wrote: >> I don't know the difference between pgc++ and pgcpp, unfortunately. >>> >>> It's a matter of the ABI: >>> >>> http://www.pgroup.com/lit/articles/insider/v4n1a2.htm >>> >>> pgc++ uses the new ABI. >> >> >> Must be more than that -- this is a compile issue, not a link issue. >> >> -- >> Jeff Squyres >> jsquy...@cisco.com >> For corporate legal information go to: >> http://www.cisco.com/web/about/doing_business/legal/cri/ > NVIDIA GmbH, Wuerselen, Germany, Amtsgericht Aachen, HRB 8361 > Managing Director: Karen Theresa Burns > > --- > This email message is for the sole use of the intended recipient(s) and may > contain > confidential information. Any unauthorized review, use, disclosure or > distribution > is prohibited. If you are not the intended recipient, please contact the > sender by > reply email and destroy all copies of the original message. > Sent from my Nexus 7, I apologize for spelling errors and auto correction typos.
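To illustrate the name-mangling point with a concrete (hypothetical) example: a C function is emitted under its plain symbol name, so any compiler that follows the platform C ABI can link to it, as long as C++ callers declare it with C linkage. C++ overloads, by contrast, must encode parameter types into the symbol name, and that encoding is exactly where pgcpp and pgc++/g++ diverge. The name mylib_add is made up for the example.

    /* mylib.c -- hypothetical example of C vs. C++ linkage */
    #include <stdio.h>

    /* No mangling in C: the object file simply exports "mylib_add". */
    int mylib_add(int a, int b) { return a + b; }

    /* A header shared with C++ callers only needs C-linkage guards:
     *
     *   #ifdef __cplusplus
     *   extern "C" {
     *   #endif
     *   int mylib_add(int a, int b);
     *   #ifdef __cplusplus
     *   }
     *   #endif
     *
     * A C++ overload such as add(int, int) is instead emitted under a mangled
     * name (e.g. _Z3addii in the Itanium C++ ABI used by g++ and pgc++);
     * pgcpp's older mangling scheme produces different symbols, so its object
     * files cannot resolve against g++-style ones at link time. */

    int main(void)
    {
        printf("%d\n", mylib_add(20, 22));
        return 0;
    }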
Re: [OMPI users] openmpi 1.7.4rc1 and f08 interface
Thanks! I noted your comment on the ticket so that it doesn't get lost. I haven't had a chance to look into this yet because we've been focusing on getting 1.7.4 out the door, and this has been identified as a 1.7.5 fix.

On Jan 31, 2014, at 3:31 PM, Åke Sandgren wrote:

> On 01/28/2014 08:26 PM, Jeff Squyres (jsquyres) wrote:
>> Ok, will do.
>>
>> Yesterday, I put in a temporary behavioral test in configure that will exclude ekopath 5.0 in 1.7.4. We'll remove this behavioral test once OMPI fixes the bug correctly (for 1.7.5).
>
> I'm not 100% sure yet (my F2k3 spec is at work and I'm not) but the ompi_funloc.tar.gz code in https://svn.open-mpi.org/trac/ompi/ticket/4157 seems to be non-conformant.
>
> abstract interface
>   !! This is the prototype for ONE of the MPI callback routines
>   !
>   function callback_variant1(val)
>     integer :: val, callback_variant1
>   end function
> end interface
>
> interface
>   !! This is the OMPI conversion routine for ONE of the MPI callback routines
>   !
>   function ompi_funloc_variant1(fn)
>     use, intrinsic :: iso_c_binding, only: c_funptr
>     procedure(callback_variant1) :: fn
>     type(c_funptr) :: ompi_funloc_variant1
>   end function ompi_funloc_variant1
> end interface
>
> I think that ompi_funloc_variant1 needs to do IMPORT to have access to the callback_variant1 definition before using it to define "FN", i.e.:
>
>   function ompi_funloc_variant1(fn)
>     use, intrinsic :: iso_c_binding, only: c_funptr
>     import
>     procedure(callback_variant1) :: fn
>
> --
> Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
> Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126
> Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] Implementation of TCP v/s OpenIB (Eager and Rendezvous) protocols
Thanks for the reply Jeff. This is directional. On 01-Feb-2014 7:51 am, "Jeff Squyres (jsquyres)" wrote: > On Jan 31, 2014, at 2:49 AM, Siddhartha Jana > wrote: > > > Sorry for the typo: > > ** I was hoping to understand the impact of OpenMPI's implementation of > these protocols using traditional TCP. > > > > This is the paper I was referring to: > > Woodall, et al., "High Performance RDMA Protocols in HPC". > > > > > > On 31 January 2014 00:43, Siddhartha Jana > wrote: > > Good evening > > Is there any documentation describing the difference in MPI-level > implementation of the eager and rendezvous protocols in OpenIB BTL versus > TCP BTL ? > > Unfortunately, there is not, sorry. Just the code. :-\ > > > I am only aware of the following paper. While this presents an excellent > overview of how RDMA capabilities of modern interconnects can be leveraged > for implementing these protocols, I was hoping to understand how OpenMPI > implications of handling these protocols using traditional TCP. > > The easiest way to think about it is that the TCP BTL could well be > implemented with just the "send" method (and no "get" or "put" methods). > > That being said, the TCP BTL does emulate the "put" method (meaning: > there's obviously no hardware support for a direct data placement using a > general socket in TCP like there is with OpenFabrics-style RDMA) simply > because it allows us to be slightly more efficient on the receiver (IIRC; > it's been a lng time since I've looked at that code). > > -- > Jeff Squyres > jsquy...@cisco.com > For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/ > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
[OMPI users] Use of __float128 with openmpi
Hi all, I have a question on datatypes in openmpi: Is there an (easy?) way to use __float128 variables with openmpi? Specifically, functions like MPI_Allreduce seem to give weird results with __float128. Essentially all I found was http://beige.ucs.indiana.edu/I590/node100.html where they state: MPI_LONG_DOUBLE This is a quadruple precision, 128-bit long floating point number. But as far as I have seen, MPI_LONG_DOUBLE is only used for long doubles. The Open MPI Version is 1.6.3 and gcc is 4.7.3 on an x86_64 machine. Any help or comment is much appreciated! Best regards, Patrick
Re: [OMPI users] Use of __float128 with openmpi
On 02/01/2014 12:42 PM, Patrick Boehl wrote: Hi all, I have a question on datatypes in openmpi: Is there an (easy?) way to use __float128 variables with openmpi? Specifically, functions like MPI_Allreduce seem to give weird results with __float128. Essentially all I found was http://beige.ucs.indiana.edu/I590/node100.html where they state MPI_LONG_DOUBLE This is a quadruple precision, 128-bit long floating point number. But as far as I have seen, MPI_LONG_DOUBLE is only used for long doubles. The Open MPI Version is 1.6.3 and gcc is 4.7.3 on a x86_64 machine. It seems unlikely that 10 year old course notes on an unspecified MPI implementation (hinted to be IBM power3) would deal with specific details of openmpi on a different architecture. Where openmpi refers to "portable C types" I would take long double to be the 80-bit hardware format you would have in a standard build of gcc for x86_64. You should be able to gain some insight by examining your openmpi build logs to see if it builds for both __float80 and __float128 (or neither). gfortran has a 128-bit data type (software floating point real(16), corresponding to __float128); you should be able to see in the build logs whether that data type was used.
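A quick way to check what the C types actually are on a given x86_64/gcc installation (illustrative snippet, not from the thread):

    /* typecheck.c -- what do long double and __float128 look like here?
     * __float128 is a gcc extension, so build with gcc. */
    #include <float.h>
    #include <stdio.h>

    int main(void)
    {
        /* On x86_64 with gcc, long double is the 80-bit x87 extended format
         * (64-bit mantissa) padded out to 16 bytes of storage, while
         * __float128 is true IEEE quad precision (113-bit mantissa). */
        printf("long double: %zu bytes, %d mantissa bits\n",
               sizeof(long double), LDBL_MANT_DIG);
        printf("__float128 : %zu bytes\n", sizeof(__float128));
        return 0;
    }

If __float128 buffers are passed to MPI as MPI_LONG_DOUBLE, the reduction interprets the bytes as the 80-bit format, which alone could produce the "weird results" described.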
Re: [OMPI users] Use of __float128 with openmpi
See Section 5.9.5 of MPI-3 or the section named "User-Defined Reduction Operations" but presumably numbered differently in older copies of the MPI standard. An older but still relevant online reference is http://www.mpi-forum.org/docs/mpi-2.2/mpi22-report/node107.htm There is a proposal to support __float128 in the MPI standard in the future but it has not been formally considered by the MPI Forum yet [https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/318]. Best, Jeff On Sat, Feb 1, 2014 at 2:28 PM, Tim Prince wrote: > > On 02/01/2014 12:42 PM, Patrick Boehl wrote: >> >> Hi all, >> >> I have a question on datatypes in openmpi: >> >> Is there an (easy?) way to use __float128 variables with openmpi? >> >> Specifically, functions like >> >> MPI_Allreduce >> >> seem to give weird results with __float128. >> >> Essentially all I found was >> >> http://beige.ucs.indiana.edu/I590/node100.html >> >> where they state >> >> MPI_LONG_DOUBLE >>This is a quadruple precision, 128-bit long floating point number. >> >> >> But as far as I have seen, MPI_LONG_DOUBLE is only used for long doubles. >> >> The Open MPI Version is 1.6.3 and gcc is 4.7.3 on a x86_64 machine. >> > It seems unlikely that 10 year old course notes on an unspecified MPI > implementation (hinted to be IBM power3) would deal with specific details of > openmpi on a different architecture. > Where openmpi refers to "portable C types" I would take long double to be > the 80-bit hardware format you would have in a standard build of gcc for > x86_64. You should be able to gain some insight by examining your openmpi > build logs to see if it builds for both __float80 and __float128 (or > neither). gfortran has a 128-bit data type (software floating point > real(16), corresponding to __float128); you should be able to see in the > build logs whether that data type was used. > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users -- Jeff Hammond jeff.scie...@gmail.com
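A minimal sketch of the user-defined-reduction route Jeff points to, assuming gcc's __float128 and treating it as an opaque 16-byte type (the names qsum and mpi_float128 are made up for the example). Because the datatype is just bytes to MPI, predefined operations such as MPI_SUM cannot be used with it, and no representation conversion happens on heterogeneous systems.

    /* qreduce.c -- hypothetical example: MPI_Allreduce on __float128 via a
     * user-defined datatype and reduction operation. */
    #include <mpi.h>
    #include <stdio.h>

    static void qsum(void *invec, void *inoutvec, int *len, MPI_Datatype *dtype)
    {
        __float128 *in = (__float128 *)invec;
        __float128 *inout = (__float128 *)inoutvec;
        for (int i = 0; i < *len; i++)
            inout[i] += in[i];
        (void)dtype;   /* unused: the type is fixed by construction */
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Describe __float128 to MPI as an opaque blob of 16 bytes. */
        MPI_Datatype mpi_float128;
        MPI_Type_contiguous(sizeof(__float128), MPI_BYTE, &mpi_float128);
        MPI_Type_commit(&mpi_float128);

        /* commute = 1: addition is commutative. */
        MPI_Op op_qsum;
        MPI_Op_create(qsum, 1, &op_qsum);

        __float128 x = rank + 1, sum = 0;
        MPI_Allreduce(&x, &sum, 1, mpi_float128, op_qsum, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum = %g\n", (double)sum);  /* cast only for printing */

        MPI_Op_free(&op_qsum);
        MPI_Type_free(&mpi_float128);
        MPI_Finalize();
        return 0;
    }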
Re: [OMPI users] Compiling OpenMPI with PGI pgc++
Hi, Am 01.02.2014 um 15:10 schrieb Jiri Kraus: > sorry but I don't know the details of the issue. But although the error is > reported as pgc++ not being link compatible to pgcc by OpenMPI configure the > error in the config.log is a complier error. So I don't think that this is a > linking issue. > > > When I get it right, it should be a feature of `pgc++` to be link > > compatible with `gcc`, while `pgcpp` links with `pgcc` objects. > > To my understanding C compliers like gcc and pgcc do not have the linking > issues that C++ compliers have they simply follow the ABI of the OS and do > not have name mangling issues. So most C compliers are link compatible. And > you can link C code from C++. The C++ compiler just needs to know that > something C and not C++ is used (extern "C" is used for this). > Because of function overloading, Templates and maybe other things, I am not > aware of, C++ needs name mangling to encode these information. And here pgcpp > and pgc++ implement different ABI's and name mangling schemes. pgc++ > implements the same ABI as g++ and thus is link compatible with g++. On the > other hand pgcpp implements its own ABI and is compatible with itself. thx for this clarification. -- Reuti > Jiri > > Sent from my Nexus 7, I apologize for spelling errors and auto correction > typos. > > -Original Message- > Date: Fri, 31 Jan 2014 22:50:40 +0100 > From: Reuti > To: Open MPI Users > Subject: Re: [OMPI users] Compiling OpenMPI with PGI pgc++ > Message-ID: > > Content-Type: text/plain; charset=us-ascii > > > > Hi, > > > > Am 31.01.2014 um 18:59 schrieb Jiri Kraus: > > > > > Thanks for taking a look. I just learned from PGI that this is a known bug > > that will be fixed in the 14.2 release (Februrary 2014). > > > > Will `pgc++` then link to `gcc` or `pgcc`? When I get it right, it should be > a feature of `pgc++` to be link compatible with `gcc`, while `pgcpp` links > with `pgcc` objects. > > > > -- Reuti > > > > > > > Thanks > > > > > > Jiri > > > > > >> -Original Message- > > >> Date: Wed, 29 Jan 2014 18:12:46 + > > >> From: "Jeff Squyres (jsquyres)" > > >> To: Open MPI Users > > >> Subject: Re: [OMPI users] Compiling OpenMPI with PGI pgc++ > > >> Message-ID: <556094df-cd27-4908-aec1-a6ad9efb6...@cisco.com> > > >> Content-Type: text/plain; charset="us-ascii" > > >> > > >> On Jan 29, 2014, at 12:35 PM, Reuti wrote: > > >> > > I don't know the difference between pgc++ and pgcpp, unfortunately. > > >>> > > >>> It's a matter of the ABI: > > >>> > > >>> http://www.pgroup.com/lit/articles/insider/v4n1a2.htm > > >>> > > >>> pgc++ uses the new ABI. > > >> > > >> > > >> Must be more than that -- this is a compile issue, not a link issue. > > >> > > >> -- > > >> Jeff Squyres > > >> jsquy...@cisco.com > > >> For corporate legal information go to: > > >> http://www.cisco.com/web/about/doing_business/legal/cri/ > > > NVIDIA GmbH, Wuerselen, Germany, Amtsgericht Aachen, HRB 8361 > > > Managing Director: Karen Theresa Burns > > > > > > --- > > > This email message is for the sole use of the intended recipient(s) and may > > contain > > > confidential information. Any unauthorized review, use, disclosure or > > distribution > > > is prohibited. If you are not the intended recipient, please contact the > > sender by > > > reply email and destroy all copies of the original message. > > > > > > Sent from my Nexus 7, I apologize for spelling errors and auto correction > typos. > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users