Re: [OMPI users] Occasional mpirun hang on completion

2008-01-18 Thread Barry Rountree
On Fri, Jan 18, 2008 at 08:33:10PM -0500, Jeff Squyres wrote: > Barry -- > > Could you check what apps are still running when it hangs? I.e., I > assume that all the uptime's are dead; are all the orted's dead on the > remote nodes? (orted = our helper process that is launched on the > rem

Re: [OMPI users] multi-compiler builds of OpenMPI (RPM)

2008-01-18 Thread Jeff Squyres
On Jan 3, 2008, at 11:38 AM, Jim Kusznir wrote: > error: line 300: Dependency tokens must begin with alpha-numeric, '_' > or '/': Requires: %{_name}-runtime Huh, this is strange. Here's the chunk from my spec file and rpm version. I've now built 3 sets of multi-rpm openmpi, each with a diff

Re: [OMPI users] compiler warnings in openmpi-1.2.5rc2

2008-01-18 Thread Jeff Squyres
You're right. This fix is already on the trunk; it looks like we missed bringing it to the v1.2 branch. I'll file a CMR against the v1.2 branch in case we ever release 1.2.6. Thanks! On Dec 27, 2007, at 10:34 PM, Doug Reeder wrote: Hello, The attachment contains a short explanation of a

Re: [OMPI users] Occasional mpirun hang on completion

2008-01-18 Thread Jeff Squyres
Barry -- Could you check what apps are still running when it hangs? I.e., I assume that all the uptime's are dead; are all the orted's dead on the remote nodes? (orted = our helper process that is launched on the remote nodes to exert process control, funnel I/O back and forth to mpirun

Re: [OMPI users] odd network behavior

2008-01-18 Thread Jeff Squyres
Are all three machines running the same OS and version, perchance? If the machines are heterogeneous in terms of OS, glibc version, etc., weird things like these hangs can occur. Additionally, are you running a firewall on any of these machines? Ensure that iptables isn't running. It doe

Re: [OMPI users] ADIOI_Set_lock failure

2008-01-18 Thread Jeff Squyres
FWIW, you might want to ask the ROMIO maintainers if this is a known problem. I unfortunately have no idea. :-\ On Jan 16, 2008, at 12:12 PM, Brock Palen wrote: Hello, Using OpenMPI-1.2.3+pgi-7.0+hdf5 parallel + lustre is giving me the error: File locking failed in ADIOI_Set_lock. If the

[OMPI users] behi is out of the office.

2008-01-18 Thread Berit Hinnemann
I will be out of the office starting 19-01-2008 and will not return until 04-02-2008. I will respond to your message when I return.

Re: [OMPI users] Communications Problems when application distributed over different nodes

2008-01-18 Thread Jeff Squyres
Do you have the Linux firewall running on either of your machines, perchance? This can either block random socket connections between nodes (which Open MPI's TCP communication will use) or eat the connection requests in a black-hole fashion such that the connections will time out. On Ja

Re: [OMPI users] Using open-mpi app on a normal network

2008-01-18 Thread Aurélien Bouteiller
You can do it; it might or might not make sense, depending on your application. Load imbalance in regular MPI applications kills performance. Therefore, if your cluster is very heterogeneous, you might prefer a different programming paradigm that takes care of this by nature (let's say RPC).
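To put a rough number on the load-imbalance point, here is a minimal sketch. It assumes a static, equal-sized split of the work and ignores communication entirely. The node speeds, the work total, and the helper names (`equal_split_time`, `ideal_time`) are made up for illustration and do not come from the original message.

```python
# Hypothetical illustration: static, equal-sized work split across nodes of
# different speeds. All numbers and function names are invented for this sketch.

def equal_split_time(total_work, speeds):
    """Wall time when every node gets the same share of the work.

    All ranks wait for the slowest one, so the wall time is the
    largest per-node compute time.
    """
    share = total_work / len(speeds)
    return max(share / s for s in speeds)


def ideal_time(total_work, speeds):
    """Wall time if the work were split in proportion to node speed."""
    return total_work / sum(speeds)


speeds = [1.0, 1.0, 1.0, 0.25]  # relative speeds; one node is 4x slower
total_work = 100.0

t_equal = equal_split_time(total_work, speeds)
t_ideal = ideal_time(total_work, speeds)
efficiency = t_ideal / t_equal

print(f"equal split: {t_equal:.1f}  ideal: {t_ideal:.1f}  efficiency: {efficiency:.2f}")
```

With these made-up speeds, the single slow node sets the pace for the whole job, which is why a paradigm that hands out work on demand can suit a very heterogeneous set of machines better.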

[OMPI users] Using open-mpi app on a normal network

2008-01-18 Thread Antoine Monmayrant
Hi everyone, I am new to open-mpi and parallel computing, so I hope I won't bore/offend you with obvious/off-topic questions. We are running scientific simulations (using Meep from MIT) on small dual-processor PCs, and to fully use both processors on each machine, we had to compile an MPI version