Re: [OMPI users] Parallel I/O with MPI-1

2008-07-24 Thread Robert Latham
On Wed, Jul 23, 2008 at 02:24:03PM +0200, Gabriele Fatigati wrote:
> > You could always effect your own parallel I/O (e.g., use MPI sends and
> > receives to coordinate parallel reads and writes), but why?  It's
> > already done in the MPI-IO implementation.
> 
> Just a moment: you're saying that I can do fwrite without any lock?
> OpenMPI does this?

You use MPI to describe your I/O regions.  In fact, these I/O regions
can even overlap (something that you can't do efficiently with
lock-based approaches).  Even better, if you do your I/O
"collectively" the MPI library will optimize the heck out of your
accesses.
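
To make that concrete, here is a minimal sketch in C++ (compile with
mpicxx; the file name, per-rank block size, and 1-D layout are made up
for illustration, and error checking is omitted):

// Each rank writes one contiguous block of doubles, described with a
// file view, using a collective write.
#include <mpi.h>
#include <vector>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int N = 1024;  // doubles per rank (made up)
    std::vector<double> buf(N, static_cast<double>(rank));

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "out.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    // Each rank's view of the file starts at its own byte offset.
    MPI_Offset disp = (MPI_Offset)rank * N * sizeof(double);
    MPI_File_set_view(fh, disp, MPI_DOUBLE, MPI_DOUBLE, "native",
                      MPI_INFO_NULL);

    // Collective write: the library can merge and reorder accesses.
    MPI_File_write_all(fh, buf.data(), N, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}

Swap in a noncontiguous filetype (e.g. one built with
MPI_Type_create_subarray) and the same MPI_File_write_all call handles
tiled access; that is where the collective optimizations really pay off.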

When I was learning all this way back when, it took me a long time to
get all the details straight (memory types, file views, tiling,
independent vs. collective), but a few readings of the I/O chapter of
"Using MPI-2" set me straight. 

==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B


Re: [OMPI users] Parallel I/O with MPI-1

2008-07-24 Thread Robert Latham
On Wed, Jul 23, 2008 at 01:28:53PM +0100, Neil Storer wrote:
> Unless you have a parallel filesystem, such as GPFS, which is
> well-defined and does support file-locking, I would suggest writing to
> different files, or doing I/O via a single MPI task, or via MPI-IO.

I concur: NFS should be avoided at all costs for parallel I/O
applications.  MPI-IO implementations do the best they can to function
correctly in the face of NFS's weak consistency semantics, but they do
so at a terrible cost to performance.

There are several good parallel file systems available, both free and
commercial.  No one in MPI-IO land should be stuck with NFS v3.

(There are many reasons to be hopeful that pNFS changes this story.)

==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B


Re: [OMPI users] Parallel I/O with MPI-1

2008-07-24 Thread Robert Latham
On Wed, Jul 23, 2008 at 09:47:56AM -0400, Robert Kubrick wrote:
> HDF5 supports parallel I/O through MPI-I/O.  I've never used it, but I
> think the API is easier than direct MPI-I/O, maybe even easier than raw
> read/writes given its support for hierarchical objects and metadata.

In addition to an API, parallel HDF5 and Parallel-NetCDF give you a
self-describing, portable file format, which is pretty nice when
collaborating with others.  Plus, there are a host of viewers for
these formats, so that's one more thing you don't have to worry about.
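
For what it's worth, the "self-describing" part looks roughly like
this with Parallel-NetCDF (a sketch only; the dimension and variable
names are made up, and error checking is omitted):

// Each rank writes its slice of a named, typed netCDF variable.
#include <mpi.h>
#include <pnetcdf.h>
#include <vector>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const MPI_Offset local_n = 100;   // values per rank (made up)
    std::vector<double> data(local_n, static_cast<double>(rank));

    int ncid, dimid, varid;
    ncmpi_create(MPI_COMM_WORLD, "temperature.nc", NC_CLOBBER,
                 MPI_INFO_NULL, &ncid);
    ncmpi_def_dim(ncid, "x", local_n * nprocs, &dimid);
    ncmpi_def_var(ncid, "temperature", NC_DOUBLE, 1, &dimid, &varid);
    ncmpi_enddef(ncid);               // leave define mode

    MPI_Offset start = rank * local_n, count = local_n;
    ncmpi_put_vara_double_all(ncid, varid, &start, &count, data.data());

    ncmpi_close(ncid);
    MPI_Finalize();
    return 0;
}

Because the dimension and variable names travel with the data, a plain
ncdump on the resulting file shows your collaborators what is in it
without any side documentation.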

> HDF5 supports multiple storage models and it supports MPI-IO.
> HDF5 has an open interface to access raw storage. This enables HDF5  
> files to be written to a variety of media, including sequential files, 
> families of files, memory, Unix sockets (i.e., a network).
> New "Virtual File" drivers can be added to support new storage access  
> mechanisms.
> HDF5 also supports MPI-IO with Parallel HDF5. When building HDF5,  
> parallel support is included by configuring with the --enable-parallel 
> option. A tutorial for Parallel HDF5 is included with the HDF5 Tutorial 
> at:
>   /HDF5/Tutor/

It's a very good tutorial. Do read the parallel I/O chapter closely,
especially the parts about enabling collective I/O via property lists
and transfer templates.  For many HDF5 workloads today, collective I/O
is the key to getting good performance (this was not always the case
back in the bad old days of MPICH1 and LAM, but has been since at
least the HDF5-1.6 series).
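
As a rough illustration, the property-list calls in question look like
this (a sketch with made-up file and dataset names, using the
HDF5 1.8-style H5Dcreate; the tutorial has the authoritative version):

// Open an HDF5 file with the MPI-IO driver and request collective
// transfers when writing a dataset.
#include <hdf5.h>
#include <mpi.h>

void write_field(MPI_Comm comm, const double *data,
                 hsize_t local_n, hsize_t offset, hsize_t global_n)
{
    // File access property list: use the MPI-IO virtual file driver.
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, comm, MPI_INFO_NULL);
    hid_t file = H5Fcreate("fields.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    hid_t filespace = H5Screate_simple(1, &global_n, NULL);
    hid_t dset = H5Dcreate(file, "field", H5T_NATIVE_DOUBLE, filespace,
                           H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    // Each rank selects its own hyperslab of the file dataspace.
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &offset, NULL,
                        &local_n, NULL);
    hid_t memspace = H5Screate_simple(1, &local_n, NULL);

    // Dataset transfer property list: this is the collective I/O switch.
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, data);

    H5Pclose(dxpl);  H5Sclose(memspace);  H5Sclose(filespace);
    H5Dclose(dset);  H5Fclose(file);      H5Pclose(fapl);
}

Without the H5Pset_dxpl_mpio(..., H5FD_MPIO_COLLECTIVE) call every
rank falls back to independent I/O, which is usually the slow path.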

==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B


[OMPI users] MPI_Init segfault on Ubuntu 8.04 version 1.2.7~rc2

2008-07-24 Thread Adam C Powell IV
Greetings,

I'm seeing a segfault in a code on Ubuntu 8.04 with gcc 4.2.  I
recompiled the Debian lenny openmpi 1.2.7~rc2 package on Ubuntu, and
compiled the Debian lenny petsc and libmesh packages against that.

Everything works just fine in Debian lenny (gcc 4.3), but in Ubuntu
hardy it fails during MPI_Init:

[Thread debugging using libthread_db enabled]
[New Thread 0x7faceea6f6f0 (LWP 5376)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7faceea6f6f0 (LWP 5376)]
0x7faceb265b8b in _int_malloc () from /usr/lib/libopen-pal.so.0
(gdb) backtrace
#0  0x7faceb265b8b in _int_malloc () from /usr/lib/libopen-pal.so.0
#1  0x7faceb266e58 in malloc () from /usr/lib/libopen-pal.so.0
#2  0x7faceb248bfb in opal_class_initialize ()
   from /usr/lib/libopen-pal.so.0
#3  0x7faceb25ce2b in opal_malloc_init () from /usr/lib/libopen-pal.so.0
#4  0x7faceb249d97 in opal_init_util () from /usr/lib/libopen-pal.so.0
#5  0x7faceb249e76 in opal_init () from /usr/lib/libopen-pal.so.0
#6  0x7faced05a723 in ompi_mpi_init () from /usr/lib/libmpi.so.0
#7  0x7faced07c106 in PMPI_Init () from /usr/lib/libmpi.so.0
#8  0x7facee144d92 in libMesh::init () from /usr/lib/libmesh.so.0.6.2
#9  0x00411f61 in main ()

libMesh::init() just has an assertion and command line check before
MPI_Init, so I think it's safe to conclude this is an OpenMPI problem.

How can I help to test and fix this?

This might be related to Vincent Rotival's problem in
http://www.open-mpi.org/community/lists/users/2008/04/5427.php or maybe
http://www.open-mpi.org/community/lists/users/2008/05/5668.php .  On the
latter, I'm building the Debian package, which should have the
LDFLAGS="" fix.  Hmm, nope, no LDFLAGS anywhere in the .diff.gz...  The
OpenMPI top-level Makefile has
"LDFLAGS = -export-dynamic -Wl,-Bsymbolic-functions"

-Adam
-- 
GPG fingerprint: D54D 1AEE B11C CE9B A02B  C5DD 526F 01E8 564E E4B6

Engineering consulting with open source tools
http://www.opennovation.com/




Re: [OMPI users] Object Send Doubt

2008-07-24 Thread Tim Mattox
Hello Carlos,
Sorry for the long delay in replying.

You may want to take a look at the Boost.MPI project:
http://www.boost.org/doc/html/mpi.html

It has a higher-level interface to MPI that is much more C++ friendly.
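
For example, something along these lines (a minimal sketch with a
made-up Field layout; compile with mpicxx and link against boost_mpi
and boost_serialization):

#include <boost/mpi.hpp>
#include <boost/serialization/access.hpp>
#include <boost/serialization/vector.hpp>
#include <vector>

namespace mpi = boost::mpi;

// Hypothetical stand-in for your Field class.
class Field {
    friend class boost::serialization::access;
    template <class Archive>
    void serialize(Archive &ar, const unsigned int /*version*/) {
        ar & nx & ny & values;          // describe what to pack/unpack
    }
public:
    int nx, ny;
    std::vector<double> values;
};

int main(int argc, char *argv[])
{
    mpi::environment env(argc, argv);
    mpi::communicator world;

    if (world.rank() == 0) {
        Field f;
        f.nx = 4;  f.ny = 4;
        f.values.assign(f.nx * f.ny, 1.0);
        world.send(1, 0, f);            // to rank 1 (tag 0)
        world.send(2, 0, f);            // and rank 2, but not rank 3
    } else if (world.rank() == 1 || world.rank() == 2) {
        Field f;
        world.recv(0, 0, f);            // arrives as a reconstructed object
    }
    return 0;
}

Boost.MPI serializes the object into a buffer and ships it with
ordinary MPI calls underneath, so it works fine on top of Open MPI.
For a plain fixed-layout struct you could instead build an
MPI_Datatype by hand with the C API, but anything holding a
std::vector or std::string is much less painful this way.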

On Sat, Jul 12, 2008 at 3:30 PM, Carlos Henrique da Silva Santos wrote:
> Dear all,
>
>  I am looking for a solution.
>  I am developing C++ code and I have an object called Field. I need
> to send/recv this object among some of the processes in my cluster. For
> example, I have to send Field from process 1 to processes 2 and 3, but
> not to process 4.
>  My questions are:
>  Is it possible to send an object with Open MPI?
>  If so, could you send me a source code example or a reference?
>
> Thank you.
>
> Carlos

-- 
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
 tmat...@gmail.com || timat...@open-mpi.org
 I'm a bright... http://www.the-brights.net/


Re: [OMPI users] Problem with WRF and pgi-7.2

2008-07-24 Thread Brock Palen

Thanks!  I will post a bug with PGI.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On Jul 23, 2008, at 4:50 PM, Brian Dobbins wrote:


Hi Brock,

  Just to add my two cents: I finally got around to building WRF with
PGI 7.2 as well.  I noticed that the configure script doesn't have an
option specifically for PGI (Fortran) + PGI (C), and when I try that
combination I do get the same error you have.  I'm doing this on
RHEL 5.2, with PGI 7.2-2.  There is a 7.2-3 out that I haven't tried
yet, but they don't mention anything about this particular error in
the fixes section of their documentation, so I'm guessing they haven't
come across it yet.


  In the meantime, you can successfully build WRF with a PGI (Fortran)
+ GCC (C) OpenMPI install.  I just did that, and tested it with one
case, using OpenMPI 1.2.6, PGI 7.2-2, and GCC 4.1.2 on the same
RHEL 5.2 system.


  In a nutshell, I'm pretty sure it's a PGI problem.  If you want to
post it in their forums, they're generally pretty responsive.  (And if
you don't, I will, since it'd be nice to see it work without a hybrid
MPI installation!)


  Cheers,
  - Brian


Brian Dobbins
Yale Engineering HPC


On Wed, Jul 23, 2008 at 12:09 PM, Brock Palen wrote:

Not yet; if you have no ideas, I will open a ticket.


Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On Jul 23, 2008, at 12:05 PM, Jeff Squyres wrote:

Hmm; I haven't seen this kind of problem before.  Have you contacted PGI?



On Jul 21, 2008, at 2:08 PM, Brock Palen wrote:

Hi, when compiling WRF with PGI 7.2-1 and openmpi-1.2.6, the file
buf_for_proc.c fails.  Nothing special about this file sticks out to
me, but older versions of PGI like it just fine.  The errors PGI
complains about have to do with mpi.h, though:


[brockp@nyx-login1 RSL_LITE]$ mpicc -DFSEEKO64_OK -w -O3 -DDM_PARALLEL -c buf_for_proc.c
PGC-S-0036-Syntax error: Recovery attempted by inserting identifier .Z before '(' (/home/software/rhel4/openmpi-1.2.6/pgi-7.0/include/mpi.h: 823)
PGC-S-0082-Function returning array not allowed (/home/software/rhel4/openmpi-1.2.6/pgi-7.0/include/mpi.h: 823)
PGC-S-0043-Redefinition of symbol, MPI_Comm (/home/software/rhel4/openmpi-1.2.6/pgi-7.0/include/mpi.h: 837)

PGC/x86-64 Linux 7.2-1: compilation completed with severe errors

Has anyone else seen that kind of problem with mpi.h and PGI?  Do I
need to use -c89?  I know PGI changed the default for this a while
back, but it does not appear to help.


Thanks!


Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985





--
Jeff Squyres
Cisco Systems





___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users