Re: [OMPI users] OpenMPI out of band TCP retry exceeded

2011-04-28 Thread Jeff Squyres
omputing, TechApps > Pratt & Whitney, UTC > (860)-565-8486 > > -Original Message- > From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On > Behalf Of Ralph Castain > Sent: Thursday, April 28, 2011 9:03 AM > To: Open MPI Users > Subject: Re: [OMPI

Re: [OMPI users] OpenMPI out of band TCP retry exceeded

2011-04-28 Thread Sindhi, Waris PW
Sent: Thursday, April 28, 2011 9:03 AM To: Open MPI Users Subject: Re: [OMPI users] OpenMPI out of band TCP retry exceeded On Apr 28, 2011, at 6:56 AM, Sindhi, Waris PW wrote: > Yes the procgroup file has more than 128 applications in it. > > % wc -l procgroup > 239 procgroup >

Re: [OMPI users] OpenMPI out of band TCP retry exceeded

2011-04-28 Thread Ralph Castain
pi.org [mailto:users-boun...@open-mpi.org] On > Behalf Of Ralph Castain > Sent: Thursday, April 28, 2011 9:02 AM > To: Open MPI Users > Subject: Re: [OMPI users] OpenMPI out of band TCP retry exceeded > > > On Apr 28, 2011, at 6:49 AM, Jeff Squyres wrote: > >> On Apr 2

Re: [OMPI users] OpenMPI out of band TCP retry exceeded

2011-04-28 Thread Sindhi, Waris PW
Sent: Thursday, April 28, 2011 9:02 AM To: Open MPI Users Subject: Re: [OMPI users] OpenMPI out of band TCP retry exceeded On Apr 28, 2011, at 6:49 AM, Jeff Squyres wrote: > On Apr 28, 2011, at 8:45 AM, Ralph Castain wrote: > >> What led you to conclude 1.2.8? >> >>>

Re: [OMPI users] OpenMPI out of band TCP retry exceeded

2011-04-28 Thread Ralph Castain
half Of Ralph Castain > Sent: Wednesday, April 27, 2011 8:09 PM > To: Open MPI Users > Subject: Re: [OMPI users] OpenMPI out of band TCP retry exceeded > > > On Apr 27, 2011, at 1:31 PM, Sindhi, Waris PW wrote: > >> No we do not have a firewall turned on. I can run smal

Re: [OMPI users] OpenMPI out of band TCP retry exceeded

2011-04-28 Thread Ralph Castain
On Apr 28, 2011, at 6:49 AM, Jeff Squyres wrote: > On Apr 28, 2011, at 8:45 AM, Ralph Castain wrote: > >> What led you to conclude 1.2.8? >> >> /opt/openmpi/i386/bin/mpirun -mca btl_openib_verbose 1 --mca btl ^tcp >> --mca pls_ssh_agent ssh -mca oob_tcp_peer_retries 1000 --prefix >

Re: [OMPI users] OpenMPI out of band TCP retry exceeded

2011-04-28 Thread Sindhi, Waris PW
sers-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain Sent: Wednesday, April 27, 2011 8:09 PM To: Open MPI Users Subject: Re: [OMPI users] OpenMPI out of band TCP retry exceeded On Apr 27, 2011, at 1:31 PM, Sindhi, Waris PW wrote: > No we do not have a f

Re: [OMPI users] OpenMPI out of band TCP retry exceeded

2011-04-28 Thread Jeff Squyres
On Apr 28, 2011, at 8:45 AM, Ralph Castain wrote: > What led you to conclude 1.2.8? > > /opt/openmpi/i386/bin/mpirun -mca btl_openib_verbose 1 --mca btl ^tcp > --mca pls_ssh_agent ssh -mca oob_tcp_peer_retries 1000 --prefix > /usr/lib/openmpi/1.2.8-gcc/bin -np 239 --app procgroup Hi

Re: [OMPI users] OpenMPI out of band TCP retry exceeded

2011-04-28 Thread Ralph Castain
128 applications in it. >> >>> >>> >>> Sincerely, >>> >>> Waris Sindhi >>> High Performance Computing, TechApps >>> Pratt & Whitney, UTC >>> (860)-565-8486 >>> >>> -----Original Message

Re: [OMPI users] OpenMPI out of band TCP retry exceeded

2011-04-28 Thread Jeff Squyres
>> Sincerely, >> >> Waris Sindhi >> High Performance Computing, TechApps >> Pratt & Whitney, UTC >> (860)-565-8486 >> >> -Original Message- >> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On >> Behalf

Re: [OMPI users] OpenMPI out of band TCP retry exceeded

2011-04-27 Thread Ralph Castain
, TechApps > Pratt & Whitney, UTC > (860)-565-8486 > > -Original Message- > From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On > Behalf Of Ralph Castain > Sent: Wednesday, April 27, 2011 2:18 PM > To: Open MPI Users > Subject: Re: [OMPI users] OpenMPI

Re: [OMPI users] OpenMPI out of band TCP retry exceeded

2011-04-27 Thread Sindhi, Waris PW
sage- From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain Sent: Wednesday, April 27, 2011 2:18 PM To: Open MPI Users Subject: Re: [OMPI users] OpenMPI out of band TCP retry exceeded Perhaps a firewall? All it is telling you is that mpirun couldn't esta

Re: [OMPI users] OpenMPI out of band TCP retry exceeded

2011-04-27 Thread Ralph Castain
Perhaps a firewall? All it is telling you is that mpirun couldn't establish TCP communications with the daemon on ln10. On Apr 27, 2011, at 11:58 AM, Sindhi, Waris PW wrote: > Hi, > I am getting a "oob-tcp: Communication retries exceeded" error > message when I run a 238 MPI slave code > >
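
Since the diagnosis above is that mpirun could not open a TCP connection to the daemon on ln10, one quick sanity check is to try a plain TCP connection to that node from the machine running mpirun. The snippet below is only a rough sketch, not anything from Open MPI itself: the helper name `can_connect` is made up for illustration, and the port to probe is an assumption (the thread never says which port the out-of-band channel uses, so something known to be listening, such as the ssh port, is the usual stand-in).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical helper for illustration; it only tests basic TCP
    reachability and says nothing about Open MPI's own OOB ports.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `can_connect("ln10", 22)` run from the launch node would show whether ln10 is reachable over TCP at all (port 22 is an assumed choice). A `False` result points toward a firewall, routing, or name-resolution problem rather than anything specific to the MPI job.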