Re: [OMPI users] Processes stuck in MPI_BARRIER

2007-06-21 Thread Chris Reeves
On Tue, Jun 19, 2007 at 11:28:33AM -0700, George Bosilca wrote: > > The deadlock happens with or without your patch? If it's with your > patch, the problem might come from the fact that you start 2 > processes on each node and you will share the port range (because of > your patch). If pro

Re: [OMPI users] Processes stuck in MPI_BARRIER

2007-06-21 Thread Chris Reeves
On Tue, Jun 19, 2007 at 03:40:36PM -0400, Jeff Squyres wrote: > On Jun 19, 2007, at 2:24 PM, George Bosilca wrote: > > > 1. I don't believe the OS to release the binding when we close the > > socket. As an example on Linux the kernel sockets are released at a > > later moment. That means the so

Re: [OMPI users] Processes stuck in MPI_BARRIER

2007-06-21 Thread Chris Reeves
Thanks for all your replies and sorry for the delay in getting back to you. On Tue, Jun 19, 2007 at 01:40:21PM -0400, Jeff Squyres wrote: > On Jun 19, 2007, at 9:18 AM, Chris Reeves wrote: > > > Also attached is a small patch that I wrote to work around some firewall > > limitations on the node

Re: [OMPI users] Processes stuck in MPI_BARRIER

2007-06-20 Thread Marcin Skoczylas
I had almost the same situation when I upgraded OpenMPI from a very old version to 1.2.2. All processes seemed to be stuck in MPI_Barrier. As a workaround, I just commented out all MPI_Barrier occurrences in my program and it started to work perfectly. greets, Marcin Chris Reeves wrote: (This tim

Re: [OMPI users] Processes stuck in MPI_BARRIER

2007-06-20 Thread Gleb Natapov
On Tue, Jun 19, 2007 at 11:24:24AM -0700, George Bosilca wrote: > 1. I don't believe the OS to release the binding when we close the > socket. As an example on Linux the kernel sockets are released at a > later moment. That means the socket might be still in use for the > next run. > This is n

Re: [OMPI users] Processes stuck in MPI_BARRIER

2007-06-19 Thread Jeff Squyres
On Jun 19, 2007, at 2:24 PM, George Bosilca wrote: While limiting the ports used by Open MPI might be a good idea, I'm skeptical about it. For at least 2 reasons: 1. I don't believe the OS to release the binding when we close the socket. As an example on Linux the kernel sockets are released

Re: [OMPI users] Processes stuck in MPI_BARRIER

2007-06-19 Thread Jelena Pjesivac-Grbovic
Hi, You should definitely try everything the people before me mentioned. Also, try running a single process per node and see if it still happens. I do not have any great insight into this issue, but I did have a similar problem in March. Unfortunately it went away (don't remember how - either by me qu

Re: [OMPI users] Processes stuck in MPI_BARRIER

2007-06-19 Thread George Bosilca
The deadlock happens with or without your patch? If it's with your patch, the problem might come from the fact that you start 2 processes on each node and you will share the port range (because of your patch). Please re-run either with 2 processes per node but without your patch or with o
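
The port-range concern in this message can be illustrated with a small sketch. This is not Open MPI's code and not the patch under discussion; the helper name `bind_in_range`, the loopback host, and the retry-on-error strategy are all assumptions made for illustration. It only shows that when two processes on one node draw from the same restricted range, a port already held by one process cannot be bound by the other, so each process has to walk the range until it finds a free port.

```python
import socket


def bind_in_range(low, high, host="127.0.0.1"):
    """Bind a listening TCP socket to the first free port in [low, high].

    Hypothetical helper for illustration only: returns (socket, port),
    or raises OSError if every port in the range is already taken.
    """
    for port in range(low, high + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            # bind() fails with OSError if another socket holds this port.
            sock.bind((host, port))
            sock.listen(1)
            return sock, port
        except OSError:
            # Port is in use (for example by the other process on this node);
            # release the socket and try the next port in the range.
            sock.close()
    raise OSError(f"no free port in {low}-{high}")
```

If the range is narrower than the number of listening sockets needed on a node, the last caller gets an error instead of a port, which is one way a restricted range shared by several processes per node could cause trouble.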

Re: [OMPI users] Processes stuck in MPI_BARRIER

2007-06-19 Thread George Bosilca
On Jun 19, 2007, at 10:40 AM, Jeff Squyres wrote: From the looks of the patch, it looks like you just want Open MPI to restrict itself to a specific range of ports, right? If that's the case, we'd probably do this slightly differently (with MCA parameters -- we certainly wouldn't want to forc

Re: [OMPI users] Processes stuck in MPI_BARRIER

2007-06-19 Thread Jeff Squyres
On Jun 19, 2007, at 9:18 AM, Chris Reeves wrote: I've had a look through the FAQ and searched the list archives and can't find any similar problems to this one. I'm running OpenMPI 1.2.2 on 10 Intel iMacs (Intel Core2 Duo CPU). I am specifying two slots per machine and starting my job wit

[OMPI users] Processes stuck in MPI_BARRIER

2007-06-19 Thread Chris Reeves
(This time with attachments...) Hi there, I've had a look through the FAQ and searched the list archives and can't find any similar problems to this one. I'm running OpenMPI 1.2.2 on 10 Intel iMacs (Intel Core2 Duo CPU). I am specifying two slots per machine and starting my job with: /Network/

[OMPI users] Processes stuck in MPI_BARRIER

2007-06-19 Thread Chris Reeves
Hi there, I've had a look through the FAQ and searched the list archives and can't find any similar problems to this one. I'm running OpenMPI 1.2.2 on 10 Intel iMacs (Intel Core2 Duo CPU). I am specifying two slots per machine and starting my job with: /Network/Guanine/csr201/local-i386/opt/ope