Re: [OMPI users] Segmentation fault in mca_btl_tcp

2010-04-15 Thread Werner Van Geit
> We sometimes see mysterious crashes like this one. At least some of them
> are caused by port scanners, i.e. unexpected non-mpi related packets
> coming in on the sockets will sometimes cause havoc.
>
Port scanners etc I don't really see happening on our cluster, since the nodes are well s

Re: [OMPI users] Segmentation fault in mca_btl_tcp

2010-04-15 Thread Jeff Squyres
On Apr 15, 2010, at 3:18 AM, Ake Sandgren wrote:

> We sometimes see mysterious crashes like this one. At least some of them
> are caused by port scanners, i.e. unexpected non-mpi related packets
> coming in on the sockets will sometimes cause havoc.

Ooohhh... ouch.

> We've been getting http traf

Re: [OMPI users] Segmentation fault in mca_btl_tcp

2010-04-15 Thread Werner Van Geit
ram that reproduces the problem, perchance?
>
> -jms
> Sent from my PDA. No type good.
>
> - Original Message -
> From: users-boun...@open-mpi.org
> To: us...@open-mpi.org
> Sent: Thu Apr 15 01:57:10 2010
> Subject: [OMPI users] Segmentation fault in mca_btl_t

Re: [OMPI users] Segmentation fault in mca_btl_tcp

2010-04-15 Thread Jeff Squyres (jsquyres)
Can you send a small program that reproduces the problem, perchance?

-jms
Sent from my PDA. No type good.

- Original Message -
From: users-boun...@open-mpi.org
To: us...@open-mpi.org
Sent: Thu Apr 15 01:57:10 2010
Subject: [OMPI users] Segmentation fault in mca_btl_tcp

Hi,

We are

Re: [OMPI users] Segmentation fault in mca_btl_tcp

2010-04-15 Thread Ake Sandgren
On Thu, 2010-04-15 at 15:57 +0900, Werner Van Geit wrote:
> Hi,
>
> We are using openmpi 1.4.1 on our cluster computer (in conjunction with
> Torque). One of our users has a problem with his jobs generating a
> segmentation fault on one of the slaves, this is the backtrace:
>
> [cstone-00613:28

[OMPI users] Segmentation fault in mca_btl_tcp

2010-04-15 Thread Werner Van Geit
Hi,

We are using openmpi 1.4.1 on our cluster computer (in conjunction with Torque). One of our users has a problem with his jobs generating a segmentation fault on one of the slaves, this is the backtrace:

[cstone-00613:28461] *** Process received signal ***
[cstone-00613:28461] Signal: Segmen