Re: [OMPI users] OpenMPI/OpenIB/IMB hangs[Scanned]

2007-01-22 Thread Arif Ali
On Fri, 2007-01-19 at 20:15 -0500, Jeff Squyres wrote:
> On Jan 19, 2007, at 6:19 PM, Arif Ali wrote:
> > [0,1,59][btl_openib_component.c:1153:btl_openib_component_progress] from
> > node16 to: node02 error polling HP CQ with status REMOTE ACCESS ERROR
> > status number 10 for

Re: [OMPI users] OpenMPI/OpenIB/IMB hangs[Scanned]

2007-01-19 Thread Jeff Squyres
On Jan 19, 2007, at 6:19 PM, Arif Ali wrote:
> [0,1,59][btl_openib_component.c:1153:btl_openib_component_progress] from
> node16 to: node02 error polling HP CQ with status REMOTE ACCESS ERROR
> status number 10 for wr_id 268919352 opcode 256614836
> mpirun noticed that job rank 0 with PID 0

Re: [OMPI users] OpenMPI/OpenIB/IMB hangs[Scanned]

2007-01-19 Thread Arif Ali
-----Original Message-----
From: Gleb Natapov [mailto:gl...@voltaire.com]
Sent: Fri 19/01/2007 18:33
To: Arif Ali
Cc: Open MPI Users; Galen Shipman; Brad Benton; Pavel Shamis; Russell Slack; Barry Evans
Subject: Re: [OMPI users] OpenMPI/OpenIB/IMB hangs[Scanned]

On Fri, Jan 19, 2007 at 05:51

Re: [OMPI users] OpenMPI/OpenIB/IMB hangs[Scanned]

2007-01-19 Thread Gleb Natapov
On Fri, Jan 19, 2007 at 05:51:49PM +, Arif Ali wrote:
> >>I tried the nightly snapshot of OpenMPI-1.2b4r13137, which failed
> >>miserably.
> >
> >Can you describe what happened there? Is it failing in a different way?
>
> Here's the output
>
> #-

Re: [OMPI users] OpenMPI/OpenIB/IMB hangs[Scanned]

2007-01-19 Thread Arif Ali
see below for answers,

regards,
Arif Ali
Software Engineer
OCF plc
Mobile: +44 (0)7970 148 122
Office: +44 (0)114 257 2200
Fax: +44 (0)114 257 0022
Email: a...@ocf.co.uk
Web: http://www.ocf.co.uk
Skype: arif_ali80
MSN: a...@ocf.co.uk

Jeff Squyres wrote:
Beware: this is a lengthy,

Re: [OMPI users] OpenMPI/OpenIB/IMB hangs[Scanned]

2007-01-19 Thread Jeff Squyres
Beware: this is a lengthy, detailed message.

On Jan 18, 2007, at 3:53 PM, Arif Ali wrote:

1. We have

HW
* 2xBladecenter H
* 2xCisco Infiniband Switch Modules
* 1xCisco Infiniband Switch
* 16x PPC64 JS21 blades each are 4 cores, with Cisco HCA

Can you provide the details of your Cisco HCA? S

[OMPI users] OpenMPI/OpenIB/IMB hangs[Scanned]

2007-01-18 Thread Arif Ali
Hi List,

1. We have

HW
* 2xBladecenter H
* 2xCisco Infiniband Switch Modules
* 1xCisco Infiniband Switch
* 16x PPC64 JS21 blades each are 4 cores, with Cisco HCA

SW
* SLES 10
* OFED 1.1 w. OpenMPI 1.1.1

I am running the Intel MPI Benchmark (IMB) on the cluster as a part of validation process f