[OMPI users] Q: OpenMPI's use of /tmp and hanging apps via FS problems?

2008-08-16 Thread Brian Dobbins
Hi guys, I was hoping someone here could shed some light on OpenMPI's use of /tmp (or, I guess, TMPDIR) and save me from diving into the source.. ;) The background is that I'm trying to run some applications on a system with a flaky parallel file system to which TMPDIR is mapped - so, on

Re: [OMPI users] problem with alltoall with ppn=8

2008-08-16 Thread Ashley Pittman
On Sat, 2008-08-16 at 08:03 -0400, Jeff Squyres wrote: > - large all-to-all operations are very stressful on the network, even > if you have very low latency / high bandwidth networking such as DDR IB > > - if you only have 1 IB HCA in a machine with 8 cores, the problem > becomes even more di

Re: [OMPI users] problem with alltoall with ppn=8

2008-08-16 Thread Kozin, I (Igor)
> - per the "sm" thread, you might want to try with just IB (and not > shared memory), just to see if that helps (I don't expect that it > will, but every situation is different). Try running "mpirun --mca > btl openib ..." (vs. "--mca btl ^tcp"). Unfortunately you were right - it did not help. Sm

Re: [OMPI users] SM btl slows down bandwidth?

2008-08-16 Thread Terry Dontje
Quoting Jeff Squyres (Sat, 16 Aug 2008 08:18:47 -0400), Re: [OMPI users] SM btl slows down bandwidth?: On Aug 15, 2008, at 3:32 PM,

Re: [OMPI users] SM btl slows down bandwidth?

2008-08-16 Thread Jeff Squyres
On Aug 15, 2008, at 3:32 PM, Gus Correa wrote: Just like Daniel and many others, I have seen some disappointing performance of MPI code on multicore machines, in code that scales fine in networked environments and single core CPUs, particularly in memory-intensive programs. The bad performan
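For context, a minimal sketch (not from this thread) of the kind of measurement behind such reports: a ping-pong between two ranks on the same node, whose reported bandwidth depends on whether the sm BTL is in use and on memory-bus contention. The message size and iteration count are illustrative.

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Run with two ranks on one node, e.g. "mpirun -np 2 ./pingpong",
       then again excluding shared memory ("--mca btl ^sm") to compare. */
    int main(int argc, char **argv)
    {
        const int nbytes = 1 << 20;   /* 1 MiB messages (illustrative) */
        const int iters  = 100;
        int rank, i;
        char *buf;
        double t0, t1;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        buf = malloc(nbytes);

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(buf, nbytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, nbytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, nbytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, nbytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
            }
        }
        t1 = MPI_Wtime();

        /* Ping-pong bandwidth: bytes moved one way divided by one-way time. */
        if (rank == 0)
            printf("%.1f MB/s\n", 2.0 * iters * nbytes / (t1 - t0) / 1.0e6);

        free(buf);
        MPI_Finalize();
        return 0;
    }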

Re: [OMPI users] bug in MPI_File_get_position_shared ?

2008-08-16 Thread Jeff Squyres
On Aug 13, 2008, at 7:06 PM, Yvan Fournier wrote: I seem to have encountered a bug in MPI-IO, in which MPI_File_get_position_shared hangs when called by multiple processes in a communicator. It can be illustrated by the following simple test case, in which a file is simply created with C IO
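The poster's own test case is truncated above; as an independent, hedged sketch of the call pattern described (every rank in the communicator querying the shared file pointer of a freshly created file; the file name is illustrative):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_File fh;
        MPI_Offset pos;
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Collectively create/open a scratch file. */
        MPI_File_open(MPI_COMM_WORLD, "testfile",
                      MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &fh);

        /* The report is that this call hangs when more than one
           process in the communicator makes it. */
        MPI_File_get_position_shared(fh, &pos);
        printf("rank %d: shared file pointer at %lld\n", rank, (long long)pos);

        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }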

Re: [OMPI users] problem with alltoall with ppn=8

2008-08-16 Thread Jeff Squyres
There are likely many issues going on here: - large all-to-all operations are very stressful on the network, even if you have very low latency / high bandwidth networking such as DDR IB - if you only have 1 IB HCA in a machine with 8 cores, the problem becomes even more difficult because al
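As a point of reference, a minimal sketch (not from the thread) of the kind of exchange being discussed; with 8 processes per node, all eight local ranks drive this traffic through the single HCA at once. The per-pair count is illustrative.

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        const int count = 131072;   /* doubles sent to EACH peer (illustrative) */
        int rank, size, i;
        double *sendbuf, *recvbuf, t0, t1;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        sendbuf = malloc((size_t)size * count * sizeof(double));
        recvbuf = malloc((size_t)size * count * sizeof(double));
        for (i = 0; i < size * count; i++)
            sendbuf[i] = rank;

        /* Each rank sends a distinct 'count'-double block to every other rank,
           so total traffic grows with the square of the job size. */
        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        MPI_Alltoall(sendbuf, count, MPI_DOUBLE,
                     recvbuf, count, MPI_DOUBLE, MPI_COMM_WORLD);
        t1 = MPI_Wtime();

        if (rank == 0)
            printf("alltoall of %d doubles per pair: %f s\n", count, t1 - t0);

        free(sendbuf);
        free(recvbuf);
        MPI_Finalize();
        return 0;
    }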

Re: [OMPI users] Segmentation fault (11) Address not mapped (1)

2008-08-16 Thread Jeff Squyres
It's not entirely clear that this means that it is a bug in Open MPI -- there's not really enough information here to say where the problem is. All that is clear is that a seg fault is happening somewhere in LAPACK. FWIW, I don't see MPI in the call stack of the segv at all. This doesn'

Re: [OMPI users] Newbie: API usage

2008-08-16 Thread Jeff Squyres
This is allowable by the MPI API. You're specifically telling MPI "I don't care to know when that send has completed." See the section for MPI_REQUEST_FREE here: http://www.mpi-forum.org/docs/mpi-11-html/node47.html#Node47 It's debatable whether that's good programming practice or not
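A minimal sketch (not from the thread) of the pattern being described, assuming a simple two-rank job; the buffer size is illustrative:

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            double *buf = malloc(1000 * sizeof(double));
            MPI_Request req;

            MPI_Isend(buf, 1000, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &req);
            /* Legal per MPI-1.1: hand the request back to MPI immediately.
               The send still completes, but this process can no longer ask
               when, so buf must not be reused or freed until something else
               (e.g. a reply from rank 1) implies the data has been received.
               buf is intentionally not freed here, per that caveat. */
            MPI_Request_free(&req);
        } else if (rank == 1) {
            double buf[1000];
            MPI_Recv(buf, 1000, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }

        MPI_Finalize();
        return 0;
    }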

Re: [OMPI users] problem with alltoall with ppn=8

2008-08-16 Thread Daniël Mantione
On Fri, 15 Aug 2008, Kozin, I (Igor) wrote: > Hello, I would really appreciate any advice on troubleshooting/tuning > Open MPI over ConnectX. More details about our setup can be found here > http://www.cse.scitech.ac.uk/disco/database/search-machine.php?MID=52 > Single process per node (ppn