Re: [OMPI users] [Open MPI Announce] Open MPI v1.3.3 released

2009-07-14 Thread jimkress_58
Does use of 1.3.3 require recompilation of applications that were compiled
using 1.3.2?

Jim

-Original Message-
From: announce-boun...@open-mpi.org [mailto:announce-boun...@open-mpi.org]
On Behalf Of Ralph Castain
Sent: Tuesday, July 14, 2009 2:11 PM
To: OpenMPI Announce
Subject: [Open MPI Announce] Open MPI v1.3.3 released

The Open MPI Team, representing a consortium of research, academic,
and industry partners, is pleased to announce the release of Open MPI
version 1.3.3. This release is mainly a bug fix release over the v1.3.2
release, but there are a few new features, including support for Microsoft
Windows.  We strongly recommend that all users upgrade to version 1.3.3
if possible.

Version 1.3.3 can be downloaded from the main Open MPI web site or
any of its mirrors (mirrors will be updating shortly).

Here is a list of changes in v1.3.3 as compared to v1.3.2:

- Fix a number of issues with the openib BTL (OpenFabrics) RDMA CM,
   including a memory corruption bug, a shutdown deadlock, and a route
   timeout.  Thanks to David McMillen and Hal Rosenstock for help in
   tracking down the issues.
- Change the behavior of the EXTRA_STATE parameter that is passed to
   Fortran attribute callback functions: this value is now stored
   internally in MPI -- it no longer references the original value
   passed by MPI_*_CREATE_KEYVAL.
- Allow the overriding of RFC1918 and RFC3330 for the specification of
   "private" networks, thereby influencing Open MPI's TCP
   "reachability" computations.
- Address flow control issues in the sm BTL, both by tweaking the
   shared memory progression rules and by enabling the "sync" collective
   to barrier every 1,000th collective operation.
- Various fixes for the IBM XL C/C++ v10.1 compiler.
- Allow explicit disabling of ptmalloc2 hooks at runtime (e.g., enable
   support for Debian's builtroot system).  Thanks to Manuel Prinz and
   the rest of the Debian crew for helping identify and fix this issue.
- Various minor fixes for the I/O forwarding subsystem.
- Big endian iWARP fixes in the Open Fabrics RDMA CM support.
- Update support for various OpenFabrics devices in the openib BTL's
   .ini file.
- Fixed undefined symbol issue with Open MPI's parallel debugger
   message queue support so it can be compiled by Sun Studio compilers.
- Update MPI_SUBVERSION to 1 in the Fortran bindings.
- Fix MPI_GRAPH_CREATE Fortran 90 binding.
- Fix MPI_GROUP_COMPARE behavior with regards to MPI_IDENT.  Thanks to
   Geoffrey Irving for identifying the problem and supplying the fix.
- Silence gcc 4.1 compiler warnings about type punning.  Thanks to
   Number Cruncher for the fix.
- Added more Valgrind and other memory-cleanup fixes.  Thanks to
   various Open MPI users for help with these issues.
- Miscellaneous VampirTrace fixes.
- More fixes for openib credits in heavy-congestion scenarios.
- Slightly decrease the latency in the openib BTL in some conditions
   (add "send immediate" support to the openib BTL).
- Ensure that MPI_REQUEST_GET_STATUS accepts an
   MPI_STATUS_IGNORE parameter.  Thanks to Shaun Jackman for the bug
   report.
- Added Microsoft Windows support.  See README.WINDOWS file for
   details.




Re: [OMPI users] Open MPI: Problem with 64-bit openMPI and intel compiler

2009-07-24 Thread jimkress_58
You can avoid the "library confusion problem" by building 64-bit and 32-bit
versions of openMPI in two different directories and then using mpi-selector
(on your head and compute nodes) to switch between the two.
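
For illustration (a rough sketch, not from the original post; the registered
names below are placeholders), switching with mpi-selector might look like:

$ mpi-selector --list                      # show the registered MPI builds
openmpi-1.3.3-32bit
openmpi-1.3.3-64bit
$ mpi-selector --set openmpi-1.3.3-64bit   # make the 64-bit build the default
$ # open a new login shell so PATH and LD_LIBRARY_PATH pick up the change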

Just my $0.02

Jim

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Jeff Squyres
Sent: Friday, July 24, 2009 7:22 AM
To: Open MPI Users
Subject: Re: [OMPI users] Open MPI: Problem with 64-bit openMPI and intel
compiler

On Jul 23, 2009, at 11:14 PM, Ralph Castain wrote:

> 3. get a multi-node allocation and run "pbsdsh echo $LD_LIBRARY_PATH"
> and see what libs you are defaulting to on the other nodes.
>


Be careful with this one; you want to ensure that your local shell  
doesn't expand $LD_LIBRARY_PATH and simply display the same value on  
all nodes.  It might be easiest to write a 2 line script and run that:

$ cat myscript
#!/bin/sh
echo LD_LIB_PATH on `hostname` is: $LD_LIBRARY_PATH
$ chmod +x myscript
$ pbsdsh ./myscript

-- 
Jeff Squyres
jsquy...@cisco.com




Re: [OMPI users] very bad parallel scaling of vasp using openmpi

2009-08-18 Thread jimkress_58
Gbit Ethernet is well known to perform poorly for fine grained code like
VASP.  The latencies for Gbit Ethernet are much too high.

If you want good scaling in a cluster for VASP, you'll need to run
InfiniBand or some other high-speed, low-latency interconnect.

Jim

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Jeff Squyres
Sent: Monday, August 17, 2009 9:24 PM
To: Open MPI Users
Cc: David Hibbitts
Subject: Re: [OMPI users] very bad parallel scaling of vasp using openmpi

You might want to run some performance testing of your TCP stacks and  
the switch -- use a non-MPI application such as NetPIPE (or others --  
google around) and see what kind of throughput you get.  Try it  
between individual server peers and then try running it simultaneously  
between a bunch of peers and see if the results are different, etc.
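
For example (a minimal sketch, not from the original mail; NPtcp is NetPIPE's
TCP driver and node01/node02 are placeholder hostnames):

# on node01: start the receiver
$ NPtcp

# on node02: run the transmitter against node01, log results to a file
$ NPtcp -h node01 -o np.node01-node02.out

Launching several such pairs between different nodes at the same time will
show whether the switch holds up under concurrent traffic.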

On Aug 17, 2009, at 5:51 PM, Craig Plaisance wrote:

> Hi - I have compiled vasp 4.6.34 using the Intel fortran compiler 11.1
> with openmpi 1.3.3 on a cluster of 104 nodes running Rocks 5.2 with  
> two
> quad core opterons connected by a Gbit ethernet.  Running in  
> parallel on
> one node (8 cores) runs very well, faster than any other cluster I  
> have
> run it on.  However, running on 2 nodes in parallel only improves the
> performance by 10% over the one node case while running on 4 and 8  
> nodes
> yields no improvement over the two node case.  Furthermore, when  
> running
> multiple (3-4) jobs simultaneously, the performance decreases by  
> around
> 50% compared to running only a single job on the entire cluster.  The
> nodes are connected by a Dell Powerconnect 6248 managed switch.  I get
> the same performance with mpich2, so I don't think it is a problem
> specific to openmpi.  Other vasp users have reported very good scaling
> up to 4 nodes on a similar cluster, so I don't think the problem is  
> vasp
> either.  Could something be wrong with the way mpi is configured to  
> work
> with the switch?  Or the operating system is not configured to work  
> with
> the switch properly?  Or the switch itself needs to be configured?   
> Thanks!


-- 
Jeff Squyres
jsquy...@cisco.com




Re: [OMPI users] very bad parallel scaling of vasp using openmpi

2009-08-24 Thread jimkress_58
Gus,

You hit the nail on the head.  CPMD and VASP are both fine-grained parallel
quantum mechanics molecular dynamics codes.  I believe CPMD has implemented
the domain decomposition methodology found in gromacs (a classical
fine-grained molecular dynamics code), which significantly diminishes the
scaling problem.  I do not believe VASP has done the same.

Jim

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Gus Correa
Sent: Tuesday, August 18, 2009 6:43 PM
To: Open MPI Users
Subject: Re: [OMPI users] very bad parallel scaling of vasp using openmpi

Hi Craig, list

Independent of any issues with your GigE switch,
which you may need to address,
you may want to take a look at the performance of the default
OpenMPI MPI_Alltoall algorithm, which you say is a cornerstone of VASP.
You can perhaps try alternative algorithms for different message
sizes, using OpenMPI tuned collectives.
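
For instance (a sketch, not from Gus's message; the parameter names are the
ones usually suggested for the "tuned" component and should be double-checked
with ompi_info, and ./vasp is a placeholder for your executable):

$ ompi_info --param coll tuned | grep alltoall   # list the alltoall-related knobs
$ mpirun -np 16 \
    --mca coll_tuned_use_dynamic_rules 1 \
    --mca coll_tuned_alltoall_algorithm 2 \
    ./vasp

The algorithm number (2 here) is only an example; compare timings across the
values that ompi_info reports.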

Please, see this long thread from last May,
where it was reported that the CPMD code (seems to be another
molecular dynamics code, like VASP, right?),
which also uses MPI_Alltoall,
didn't perform well for not-so-large messages,
and the scaling was poor.
I suppose your messages also get smaller
when you increase the number of processors,
assuming the problem size is kept constant, right?
The thread suggests diagnostics and solutions,
and I found it quite helpful:

http://www.open-mpi.org/community/lists/users/2009/05/9355.php
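
A rough back-of-the-envelope, for illustration only: in an alltoall over a
fixed total of D elements, each rank holds D/P of them and sends a block of
about D/P^2 elements to every peer, so the per-message size falls
quadratically as ranks are added.  With a 1,000,000-element double complex
array (the sizes Craig quotes below):

D=1000000      # elements in one array (illustrative)
BYTES=16       # bytes per double complex element
for P in 8 16 32 64; do
    echo "P=$P  per-peer block = $(( D * BYTES / (P * P) )) bytes"
done

That runs from about 250 KB per message at 8 ranks down to under 4 KB at 64
ranks, i.e. into the small-message regime where GigE latency dominates.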

Sorry, we're not computational chemists here,
but our programs also use MPI collectives.

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-


Craig Plaisance wrote:
> I ran a test of tcp using NetPIPE and got throughput of 850 Mb/s at 
> message sizes of 128 Kb.  The latency was 50 us.  At message sizes above 
> 1000 Kb, the throughput oscillated wildly between 850 Mb/s and values as 
> low as 200 Mb/s.  This test was done with no other network traffic.  I 
> then ran four tests simultaneously between different pairs of compute 
> nodes and saw a drastic decrease in performance.  The highest stable 
> (non-oscillating) throughput was about 500 Mb/s at a message size of 16 
> Kb.  The throughput then oscillated wildly, with the maximum value 
> climbing to 850 Mb/s at a message size greater than 128 Kb and dropping 
> to values as low as 100 Mb/s.  The code I am using (VASP) has 100 to 
> 1000 double complex (16 byte) arrays containing 100,000 to 1,000,000 
> elements each.  Typically, the arrays are distributed among the nodes.  
> The most communication intensive part involves executing an MPI_alltoall 
> to redistribute the arrays so that node i contains the ith block of each 
> array.  The default message size is 1000 elements (128 Kb), so according 
> to the NetPIPE test, I should be getting very good throughput when there 
> is no other network traffic.  I will run a NetPIPE test with openmpi and 
> mpich2 now and post the results.  So, does anyone know what causes the 
> wild oscillations in the throughput at larger message sizes and higher 
> network traffic?  Thanks!
