[OMPI users] Error Polling HP CQ Status on PPC64 Linux with IB

2006-06-19 Thread Owen Stampflee
I'm currently trying to get OpenMPI + OpenIB 1.0 (might be an RC)
working on our 8-node Xserve G5 cluster running Linux kernel version
2.6.16, and I get the following errors:

Process 1 on node-192-168-111-249
Process 0 on node-192-168-111-248
[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 1 for wr_id 270995584 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270995868 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270996152 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270996436 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270996720 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270997004 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270997288 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270997572 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 271077504 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 271077788 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 271078072 opcode -1286736

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 9 for wr_id 270991488 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270995584 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270995868 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270996152 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270996436 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270996720 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270997004 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270997288 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270997572 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 271077504 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 271077788 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 271078072 opcode -6639584

mpirun: killing job...
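
For reading a log like the one above, it can help to tally the status codes per process. The sketch below assumes that the number after "status" is the libibverbs work-completion status (enum ibv_wc_status); the message does not confirm this, and the function name and the partial name table are illustrative only.

```python
import re
from collections import Counter

# Partial table of libibverbs work-completion status codes (enum ibv_wc_status).
# Assumption: the "status N" printed by the openib BTL is this enum value.
IBV_WC_STATUS = {
    0: "IBV_WC_SUCCESS",
    1: "IBV_WC_LOC_LEN_ERR",
    5: "IBV_WC_WR_FLUSH_ERR",
    9: "IBV_WC_REM_INV_REQ_ERR",
}

# The process name and the error text sit on separate lines in the log,
# so \s+ is allowed to span the line break.
ENTRY = re.compile(
    r"\[(\d+,\d+,\d+)\]\[[^\]]*\]\s+"
    r"error polling HP CQ with status (\d+) for wr_id (\d+)"
)


def summarize(log_text):
    """Count (process, status name) pairs found in the log text."""
    counts = Counter()
    for proc, status, _wr_id in ENTRY.findall(log_text):
        name = IBV_WC_STATUS.get(int(status), "unknown status " + status)
        counts[(proc, name)] += 1
    return counts


# Three entries copied from the log above.
sample = """\
[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 1 for wr_id 270995584 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270995868 opcode -1286736

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 9 for wr_id 270991488 opcode -6639584
"""

for (proc, name), n in sorted(summarize(sample).items()):
    print(proc, name, n)
```

If the assumption holds, status 5 (IBV_WC_WR_FLUSH_ERR) is what libibverbs reports for work requests flushed after a queue pair has gone into the error state, so the status 1 and status 9 entries would be the ones to investigate first.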




Re: [OMPI users] Error Polling HP CQ Status on PPC64 Linux with IB

2006-06-19 Thread Gleb Natapov
What version of OpenMPI are you using?

On Mon, Jun 19, 2006 at 07:06:54AM -0700, Owen Stampflee wrote:
> I'm currently working on getting OpenMPI + OpenIB 1.0 (might be an RC)
> working on our 8 node Xserve G5 cluster running Linux kernel version
> 2.6.16 and get the following errors:

--
Gleb.


[OMPI users] Error polling HP CQ on linux ppc64 w/Infiniband

2006-06-19 Thread Owen Stampflee
We are currently trying to get OpenMPI 1.0.2 and OpenIB (1.0) running on
our Xserve G5 cluster running Linux 2.6.16, with no luck. The
ibv_*_pingpong tests work fine, opensm is started, and the network is up.

Here's the output:
Process 1 on node-192-168-111-249
Process 0 on node-192-168-111-248
[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 1 for wr_id 270995584 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270995868 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270996152 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270996436 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270996720 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270997004 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270997288 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270997572 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 271077504 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 271077788 opcode -1286736

[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 271078072 opcode -1286736

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 9 for wr_id 270991488 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270995584 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270995868 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270996152 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270996436 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270996720 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270997004 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270997288 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 270997572 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 271077504 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 271077788 opcode -6639584

[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress]
error polling HP CQ with status 5 for wr_id 271078072 opcode -6639584

mpirun: killing job...





Re: [OMPI users] Installing OpenMPI on a solaris

2006-06-19 Thread Eric Thibodeau
I checked the thread with the same title as this e-mail and tried compiling 
openmpi-1.1b4r10418 with:

./configure CFLAGS="-mv8plus" CXXFLAGS="-mv8plus" FFLAGS="-mv8plus" \
    FCFLAGS="-mv8plus" --prefix=$HOME/openmpi-SUN-`uname -r` \
    --enable-pretty-print-stacktrace

but I keep getting:

*** Assembler
checking for BSD-compatible nm... /usr/ccs/bin/nm -p
checking for fgrep... fgrep
checking whether to enable smp locks... yes
checking directive for setting text section... .text
checking directive for exporting symbols... .globl
checking for objdump... no
checking if .note.GNU-stack is needed... no
checking suffix for labels... :
checking prefix for global symbol labels...
checking prefix for lsym labels... .L
checking prefix for function in .type... #
checking if .size is needed... yes
checking if .align directive takes logarithmic value... no
checking if have Sparc v8+/v9 support... no
configure: WARNING: Sparc v8 target is not supported in this release of Open MPI.
configure: WARNING: You must specify the target architecture v8plus
configure: WARNING: (cc: -xarch=v8plus, gcc: -mv8plus) for CFLAGS, CCXXFLAGS,
configure: WARNING: FFLAGS, and FCFLAGS to compile Open MPI in32 bit mode on
configure: WARNING: Sparc processors
configure: error: Can not continue.

Is Sparc support put aside for the moment, or am I doing something wrong?

Thanks,

-- 
Eric Thibodeau
Neural Bucket Solutions Inc.
T. (514) 736-1436
C. (514) 710-0517

[OMPI users] MPI_Wtime

2006-06-19 Thread Michael Kluskens

Is anyone using MPI_Wtime with any version of OpenMPI under Fortran 90?

I got my program to compile with MPI_Wtime calls, but the
difference between two timestamps taken in the same process is always zero.


When compiling against OpenMPI I have to specify:

mytime = MPI_Wtime

For other MPI implementations I specify:

mytime = MPI_Wtime()

This has been tested on a dual-opteron with PGI 6.1-5 and a G4 with
g95. I'm currently using OpenMPI 1.2a1r10297.


The same code works fine on the dual-opteron with PGI 6.1-5 and
MPICH2, on an SGI Altix with Intel compilers and the SGI MPI library, and
on SGI IRIX with the SGI MPI library.


Michael





[OMPI users] auto detect hosts

2006-06-19 Thread Michael Kluskens

How does OpenMPI auto-detect available hosts?

I'm running on a cluster of dual-opterons running Debian Linux.

Just running "mpirun -np 4 hostname", OpenMPI somehow located the second
dual-opteron in the stack of machines, but no more than that,
regardless of how many processes I asked for.


The master node has an internal IP of 10.0.0.0, and the second node
has an IP of 10.0.0.1 and the names "node02" and "node2".


I've been unable to find a file that contains only the name of my  
second node and not the others.


I'm currently running OpenMPI 1.2a1r10297.

Michael



Re: [OMPI users] Error Polling HP CQ Status on PPC64 Linux with IB

2006-06-19 Thread Owen Stampflee
Oops, I thought I mentioned that; it's 1.0.2.

Cheers,
Owen

On Mon, 2006-06-19 at 17:08 +0300, Gleb Natapov wrote:
> What version of OpenMPI are you using?



Re: [OMPI users] auto detect hosts

2006-06-19 Thread Reuti

On 19.06.2006 at 20:02, Michael Kluskens wrote:


> How does OpenMPI auto-detect available hosts?
>
> I'm running on a cluster of dual-opterons running Debian Linux.
>
> Just using "mpirun -np 4 hostname" somehow OpenMPI located the second
> dual-opteron in the stack of machines but no more than that,
> regardless of how many processes I asked for.
>
> The master node has an internal ip of 10.0.0.0 and the second node
> has an ip of 10.0.0.1 and a name of "node02" and "node2"


I would suggest not using .0 as a host address at all, as by convention
it usually refers to the subnet itself. Numbering the addresses in the
same order as the names also avoids the offset, i.e. .1 = node01, .2 = node02...
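
The convention mentioned above can be checked with Python's standard ipaddress module. This is only a sketch: the /24 netmask is an assumption (the thread never states the cluster's netmask), and the function name is illustrative.

```python
import ipaddress


def is_usable_host(address, network):
    """Return True if address is an assignable host address in network.

    The network address (all host bits zero, e.g. .0 in a /24) and the
    broadcast address (all host bits one) are treated as not usable.
    """
    net = ipaddress.ip_network(network)
    ip = ipaddress.ip_address(address)
    reserved = (net.network_address, net.broadcast_address)
    return ip in net and ip not in reserved


# Assumed netmask: /24. The thread does not say what the cluster uses.
NETWORK = "10.0.0.0/24"

for addr in ("10.0.0.0", "10.0.0.1", "10.0.0.2"):
    print(addr, is_usable_host(addr, NETWORK))
```

Under that assumed netmask, 10.0.0.0 is the network address rather than a host address, while 10.0.0.1 and 10.0.0.2 are ordinary host addresses.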


I don't know whether this is related in any way to the effect you
observe.


Cheers - Reuti



> I've been unable to find a file that contains only the name of my
> second node and not the others.
>
> I'm currently running OpenMPI 1.2a1r10297.
>
> Michael

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] auto detect hosts

2006-06-19 Thread Michael Kluskens

Corrected:

How does OpenMPI auto-detect available hosts?

I'm running on a cluster of dual-opterons running Debian Linux.

Just running "mpirun -np 4 hostname", OpenMPI somehow located the second
dual-opteron in the stack of machines, but no more than that,
regardless of how many processes I asked for.

The master node has an internal IP of 10.0.0.1, and the second node
has an IP of 10.0.0.2 and the names "node02" and "node2".

I've been unable to find a file that contains only the name of my
second node and not the others.

I'm currently running OpenMPI 1.2a1r10297.

Michael
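
One way to take auto-detection out of the picture while debugging is to hand mpirun an explicit hostfile. The sketch below only formats such a file. The node names other than "node02" are hypothetical, the file name is made up, and slots=2 is an assumption based on the nodes being dual-processor machines.

```python
def hostfile_lines(hosts, slots_per_host):
    """Return Open MPI hostfile lines of the form '<host> slots=<n>'."""
    return ["{} slots={}".format(host, slots_per_host) for host in hosts]


# Hypothetical node names; only "node02" appears in the thread.
hosts = ["node01", "node02", "node03"]

# Assumption: two slots per node, one per processor of a dual-opteron.
text = "\n".join(hostfile_lines(hosts, 2)) + "\n"
print(text, end="")
```

Saved under a name such as myhosts, the result could be passed as "mpirun --hostfile myhosts -np 6 hostname" to see whether every listed node responds.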