[OMPI users] SEGV_ACCERR Failing at addr ...

2006-05-25 Thread 杨科
Hi, all,
I tried to use the hostfile option to launch MPI jobs on several nodes. Each node
has the same Open MPI package installed correctly (version 1.1ra9202).
But node0 refuses to work together with the other nodes. What should I do?
Can anybody help me? Thank you.

[semper@node0]mpirun --hostfile h -np 4 /tmp/semper/testMPI
Hello MPI World the original.
Hello MPI World the original.
Hello MPI World the original.
Hello MPI World the original.
Signal:11 info.si_errno:0(Success) si_code:2(SEGV_ACCERR)
Failing at addr:0x6

[semper@node0]cat h
node0
node2
node4
node6
[semper@node0]vi h
[semper@node0]cat h
node2
node4
node6

[semper@node0]mpirun --hostfile h -np 4 /tmp/semper/testMPI
Hello MPI World the original.
Hello MPI World the original.
Hello MPI World the original.
Hello MPI World the original.
From process 0: Num processes: 4
Greetings from process 1 ,pid 5140,host node4!
Greetings from process 2 ,pid 11471,host node6!
Greetings from process 3 ,pid 4290,host node2!

[semper@node0]mpirun -np 4 /tmp/semper/testMPI
Hello MPI World the original.
Hello MPI World the original.
Hello MPI World the original.
Hello MPI World the original.
From process 0: Num processes: 4
Greetings from process 1 ,pid 21743,host node0!
Greetings from process 2 ,pid 21744,host node0!
Greetings from process 3 ,pid 21745,host node0!




Re: [OMPI users] spawn failed with errno=-7

2006-05-25 Thread Michael Kluskens
I think I moved to OpenMPI 1.1 and 1.2 alphas because of problems  
with spawn and OpenMPI 1.0.1 & 1.0.2.


You may wish to test building 1.1 and seeing if that solves your  
problem.


Michael

On May 24, 2006, at 1:48 PM, Jens Klostermann wrote:


I did the following run with openmpi1.0.2:

mpirun -np 8 -machinefile ompimachinefile ./hello_world_mpi

and got the following errors
[stokes:00740] [0,0,0] ORTE_ERROR_LOG: Not implemented in file
rmgr_urm.c at line 177
[stokes:00740] [0,0,0] ORTE_ERROR_LOG: Not implemented in file
rmgr_urm.c at line 365
[stokes:00740] mpirun: spawn failed with errno=-7


Re: [OMPI users] Won't run with 1.0.2

2006-05-25 Thread Michael Kluskens
One possibility is that you didn't properly uninstall version 1.0.1  
before installing version 1.0.2 & 1.0.3.


There was a change with some of the libraries a while back that
caused me a similar problem.  An install of a later version of OpenMPI
does not remove certain libraries from 1.0.1.


You absolutely have to:

cd openmpi1.0.1
sudo make uninstall
cd ../openmpi1.0.2
sudo make install
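
(If you want to double-check that nothing stale is left behind, something like
the following should list any leftover Open MPI libraries and their dates --
this assumes a default /usr/local prefix and typical library names, so adjust
the path to your own --prefix:)

  # list any installed Open MPI libraries and their timestamps
  ls -l /usr/local/lib/libmpi* /usr/local/lib/liborte* /usr/local/lib/libopal*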

I have had no trouble in the past with PGF90 version 6.1-3 and
OpenMPI 1.1a on a dual Opteron 1.4 GHz machine running Debian Linux.


Michael

On May 24, 2006, at 7:43 PM, Tom Rosmond wrote:

After using OPENMPI Ver 1.0.1 for several months without trouble, last week
I decided to upgrade to Ver 1.0.2.  My primary motivation was curiosity, to see
if there was any performance benefit.  To my surprise, several of my F90
applications refused to run with the newer version.  I also tried Ver 1.0.3a1r10036
with the same result.  In all 3 cases I configured the build identically.  Even that
old chestnut 'hello.f' will not run with the newer versions.  I ran it in the
totalview debugger and can see that it is hanging in the MPI initialization code
before it gets to the F90 application.

I am using the Ver 6.1 PGF90 64bit compiler on a Linux Opteron workstation with
2 dual core 2.4 GHZ processors.  If you think it is worthwhile to pursue this
problem further, what could I send you to help troubleshoot the problem?
Meanwhile I have gone back to 1.0.1, which works fine on everything.




Re: [OMPI users] mca_btl_sm_send: write fifo failed:,errno=9

2006-05-25 Thread Jeff Squyres (jsquyres)
Just to follow up for the web archives -- this has now been fixed on the
trunk and v1.1 and v1.0 branches, and will show up in tonight's nightly
snapshots (r10062 or later). 


> -Original Message-
> From: users-boun...@open-mpi.org 
> [mailto:users-boun...@open-mpi.org] On Behalf Of Mykael BOUQUEY
> Sent: Wednesday, May 24, 2006 2:30 AM
> To: us...@open-mpi.org
> Subject: [OMPI users] mca_btl_sm_send: write fifo failed:,errno=9
> 
> [root@mymachine]# ompi_info
> Open MPI: 1.1a6
>Open MPI SVN revision: r9950
> Open RTE: 1.1a6
>Open RTE SVN revision: r9950
> OPAL: 1.1a6
>OPAL SVN revision: r9950
>   Prefix: /root/usr/local
>  Configured architecture: i686-pc-linux-gnu
>Configured by: root
>Configured on: Fri May 19 09:28:11 CEST 2006
>   Configure host: xpscp117892.tasfr.thales
> Built by: root
> Built on: ven mai 19 10:00:44 CEST 2006
>   Built host: xpscp117892.tasfr.thales
>   C bindings: yes
> C++ bindings: yes
>   Fortran77 bindings: yes (all)
>   Fortran90 bindings: no
>   C compiler: gcc
>  C compiler absolute: /usr/bin/gcc
> C++ compiler: g++
>C++ compiler absolute: /usr/bin/g++
>   Fortran77 compiler: g77
>   Fortran77 compiler abs: /usr/bin/g77
>   Fortran90 compiler: none
>   Fortran90 compiler abs: none
>  C profiling: yes
>C++ profiling: yes
>  Fortran77 profiling: yes
>  Fortran90 profiling: no
>   C++ exceptions: no
>   Thread support: posix (mpi: yes, progress: yes)
>   Internal debug support: no
>  MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
>  libltdl support: 1
>   MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.1)
>MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.1)
>MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.1)
>MCA timer: linux (MCA v1.0, API v1.0, Component v1.1)
>MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
>MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
> MCA coll: basic (MCA v1.0, API v1.0, Component v1.1)
> MCA coll: hierarch (MCA v1.0, API v1.0, Component v1.1)
> MCA coll: self (MCA v1.0, API v1.0, Component v1.1)
> MCA coll: sm (MCA v1.0, API v1.0, Component v1.1)
> MCA coll: tuned (MCA v1.0, API v1.0, Component v1.1)
>   MCA io: romio (MCA v1.0, API v1.0, Component v1.1)
>MCA mpool: sm (MCA v1.0, API v1.0, Component v1.1)
>  MCA pml: dr (MCA v1.0, API v1.0, Component v1.1)
>  MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.1)
>  MCA bml: r2 (MCA v1.0, API v1.0, Component v1.1)
>   MCA rcache: rb (MCA v1.0, API v1.0, Component v1.1)
>  MCA btl: self (MCA v1.0, API v1.0, Component v1.1)
>  MCA btl: sm (MCA v1.0, API v1.0, Component v1.1)
>  MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
> MCA topo: unity (MCA v1.0, API v1.0, Component v1.1)
>  MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.0)
>  MCA gpr: null (MCA v1.0, API v1.0, Component v1.1)
>  MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.1)
>  MCA gpr: replica (MCA v1.0, API v1.0, Component v1.1)
>  MCA iof: proxy (MCA v1.0, API v1.0, Component v1.1)
>  MCA iof: svc (MCA v1.0, API v1.0, Component v1.1)
>   MCA ns: proxy (MCA v1.0, API v1.0, Component v1.1)
>   MCA ns: replica (MCA v1.0, API v1.0, Component v1.1)
>  MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
>  MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.1)
>  MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.1)
>  MCA ras: localhost (MCA v1.0, API v1.0, Component v1.1)
>  MCA ras: slurm (MCA v1.0, API v1.0, Component v1.1)
>  MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.1)
>  MCA rds: resfile (MCA v1.0, API v1.0, Component v1.1)
>MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.1)
> MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.1)
> MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.1)
>  MCA rml: oob (MCA v1.0, API v1.0, Component v1.1)
>  MCA pls: fork (MCA v1.0, API v1.0, Component v1.1)
>  MCA pls: rsh (MCA v1.0, API v1.0, Component v1.1)
>  MCA pls: slurm (MCA v1.0, API v1.0, Component v1.1)
>  MCA sds: env (MCA v1.0, API v1.0, Component v1.1)

Re: [OMPI users] SEGV_ACCERR Failing at addr ...

2006-05-25 Thread Jeff Squyres (jsquyres)
> -Original Message-
> From: users-boun...@open-mpi.org 
> [mailto:users-boun...@open-mpi.org] On Behalf Of 杨科
> Sent: Thursday, May 25, 2006 4:54 AM
> To: us...@open-mpi.org
> Subject: [OMPI users] SEGV_ACCERR Failing at addr ...
> 
> Hi, all,
> I tried to use the hostfile option to launch MPI jobs on several
> nodes. Each node has the same Open MPI package installed correctly
> (version 1.1ra9202).

This is actually a fairly old, unreleased version of Open MPI (we're up to 
r10069 as of this writing).  If you're using nightly snapshots or SVN 
checkouts, can you upgrade to more recent versions and see if the problem is 
fixed?  If the problem still persists, we can dig into it deeper.
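
A quick way to confirm exactly which version and SVN revision a given
installation reports (assuming the ompi_info from that install is first
in your PATH):

  # show the Open MPI version and SVN revision of the install in use
  ompi_info | grep -E 'Open MPI(:| SVN)'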

Thanks.

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems



Re: [OMPI users] pallas assistance ?

2006-05-25 Thread Jeff Squyres (jsquyres)
Gleb just committed some fixes for the PPC64 issue last night
(https://svn.open-mpi.org/trac/ompi/changeset/10059).  It should only
affect the eager RDMA issues, but it would be a worthwhile datapoint if
you could test with it (i.e., specify no MCA parameters on your mpirun
command line, so it should use RDMA by default).
 
I'm waiting for my own PPC64 machine to be reconfigured so that I can
test again; can you try with r10059 or later?
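
For example (just a sketch -- the paths and hostfile name are taken from your
earlier mail and may need adjusting), a default run plus a run with eager RDMA
explicitly disabled would make a useful comparison:

  # default run: no MCA parameters, so eager RDMA is used by default
  /opt/ompi/bin/mpirun -np 2 -hostfile machine.list ./IMB-MPI1

  # comparison run with eager RDMA explicitly turned off
  /opt/ompi/bin/mpirun --mca btl_openib_use_eager_rdma 0 -np 2 -hostfile machine.list ./IMB-MPI1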




From: users-boun...@open-mpi.org
[mailto:users-boun...@open-mpi.org] On Behalf Of Paul
Sent: Wednesday, May 24, 2006 9:35 PM
To: Open MPI Users
Subject: Re: [OMPI users] pallas assistance ?


It makes no difference on my end. Exact same error.


On 5/24/06, Andrew Friedley  wrote: 

Paul wrote:
> Somebody call orkin. ;-P
> Well I tried running it with things set as noted in the bug report.
> However it doesnt change anything on my end. I am willing to do any
> verification you guys need (time permitting and all). Anything special
> needed to get mpi_latency to compile ? I can run that to verify that
> things are actually working on my end.
>
> [root@something ompi]#
Shouldn't the parameter be '--mca btl_openib_use_eager_rdma'?

> [root@something ompi]# /opt/ompi/bin/mpirun --mca btl_openmpi_use_srq 1
> -np 2 -hostfile machine.list ./IMB-MPI1

Same here - '--mca btl_openib_use_srq'

Andrew





Re: [OMPI users] Fortran support not installing

2006-05-25 Thread Jeff Squyres (jsquyres)
I actually had to set FCFLAGS, not LDFLAGS, to get arbitrary flags
passed down to the Fortran tests in configure.
 
Can you try that?  (I'm not 100% sure -- you may need to specify LDFLAGS
*and* FCFLAGS...?)
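
In other words, something along these lines (a sketch only -- the exact flags
may need adjusting for your setup):

  # pass -lSystemStubs to the Fortran tests in configure
  ./configure FC=g95 F77=g95 FCFLAGS=-lSystemStubs

  # or, if FCFLAGS alone is not enough, try both
  ./configure FC=g95 F77=g95 FCFLAGS=-lSystemStubs LDFLAGS=-lSystemStubs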
 
We have made substantial improvements to the configure tests with
regards to the MPI F90 bindings in the upcoming 1.1 release.  Most of
the work is currently off in a temporary branch in our code repository
(meaning that it doesn't show up yet in the nightly trunk tarballs), but
it will hopefully be brought back to the trunk soon.




From: users-boun...@open-mpi.org
[mailto:users-boun...@open-mpi.org] On Behalf Of Terry Reeves
Sent: Wednesday, May 24, 2006 4:21 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] Fortran support not installing


Hi,
Unfortunately, using -lSystemStubs still failed during configure.
Output enclosed.





Re: [OMPI users] pallas assistance ?

2006-05-25 Thread Paul

Okay, I rebuilt using those diffs. Currently I am still having issues with
pallas however. That being said I think my issue is more with
compiling/linking pallas. Here is my pallas make_$arch file:

MPI_HOME = /opt/ompi/
MPI_INCLUDE = $(MPI_HOME)/include
LIB_PATH =
LIBS =
CC = ${MPI_HOME}/bin/mpicc
OPTFLAGS = -O
CLINKER = ${CC}
LDFLAGS = -m64
CPPFLAGS = -m64

Again ldd'ing the IMB-MPI1 file works fine, and the compilation completes
okay.
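
(If it helps with debugging the link step, Open MPI's wrapper compilers can
print the underlying command they would run; the exact output depends on your
install, so treat this as a sketch:)

  # show the full compile/link command mpicc would execute
  /opt/ompi/bin/mpicc --showme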

On 5/25/06, Jeff Squyres (jsquyres)  wrote:


 Gleb just committed some fixes for the PPC64 issue last night (
https://svn.open-mpi.org/trac/ompi/changeset/10059).  It should only
affect the eager RDMA issues, but it would be a worthwhile datapoint if you
could test with (i.e., specify no MCA parameters on your mpirun command
line, so it should use RDMA by default).

I'm waiting for my own PPC64 machine to be reconfigured so that I can test
again; can you try with r10059 or later?

 --
*From:* users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] *On
Behalf Of *Paul
*Sent:* Wednesday, May 24, 2006 9:35 PM
*To:* Open MPI Users
*Subject:* Re: [OMPI users] pallas assistance ?

It makes no difference on my end. Exact same error.

On 5/24/06, Andrew Friedley  wrote:
>
> Paul wrote:
> > Somebody call orkin. ;-P
> > Well I tried running it with things set as noted in the bug report.
> > However it doesnt change anything on my end. I am willing to do any
> > verification you guys need (time permitting and all). Anything special
> > needed to get mpi_latency to compile ? I can run that to verify that
> > things are actually working on my end.
> >
> > [root@something ompi]#
> Shouldn't the parameter be '--mca btl_openib_use_eager_rdma'?
>
> > [root@something ompi]# /opt/ompi/bin/mpirun --mca btl_openmpi_use_srq
> 1
> > -np 2 -hostfile machine.list ./IMB-MPI1
>
> Same here - '--mca btl_openib_use_srq'
>
> Andrew




Re: [OMPI users] Problems with myrinet support on PPC

2006-05-25 Thread Brock Palen
Is there a way to disable pthreads when building with gm?  I can
build with just tcp just fine.

Brock

On May 25, 2006, at 3:34 PM, George Bosilca wrote:


That's kind of funny ... Looks like PTHREAD_CANCELED was missing
from the pthread.h on most of the Linux distributions until the
beginning of 2002 (http://sourceware.org/ml/newlib/2002/msg00538.html).
And it looks like it is still missing from the Mac OS X 10.3.9 pthread.h
(http://lists.apple.com/archives/darwin-development/2004/Feb/msg00150.html).
Anyway, we can remove it, as the ptls are not used in the 1.0.2 release.

   Thanks,
 george.

On May 24, 2006, at 5:10 PM, Brock Palen wrote:


I'm getting the following error when trying to build OMPI on OSX
10.3.9 with myrinet support.  The libs are in
/opt/gm/lib
Includes:
/opt/gm/include

Below is my configure line and the error:
./configure --prefix=/home/software/openmpi-1.0.2 --with-tm=/home/software/torque-2.1.0p0 --with-gm=/opt/gm FC=/opt/ibmcmp/xlf/8.1/bin/xlf90 F77=/opt/ibmcmp/xlf/8.1/bin/xlf CPPFLAGS=-I/opt/gm/include

gcc -DHAVE_CONFIG_H -I. -I. -I../../../../include -I../../../../include -I/opt/gm/include -I../../../../include -I../../../.. -I../../../.. -I../../../../include -I../../../../opal -I../../../../orte -I../../../../ompi -I/opt/gm/include -D_REENTRANT -O3 -DNDEBUG -fno-strict-aliasing -MT ptl_gm.lo -MD -MP -MF .deps/ptl_gm.Tpo -c ptl_gm.c -fno-common -DPIC -o .libs/ptl_gm.o
gcc -DHAVE_CONFIG_H -I. -I. -I../../../../include -I../../../../include -I/opt/gm/include -I../../../../include -I../../../.. -I../../../.. -I../../../../include -I../../../../opal -I../../../../orte -I../../../../ompi -I/opt/gm/include -D_REENTRANT -O3 -DNDEBUG -fno-strict-aliasing -MT ptl_gm_priv.lo -MD -MP -MF .deps/ptl_gm_priv.Tpo -c ptl_gm_priv.c -fno-common -DPIC -o .libs/ptl_gm_priv.o
ptl_gm_component.c: In function `mca_ptl_gm_thread_progress':
ptl_gm_component.c:249: error: `PTHREAD_CANCELED' undeclared (first
use in this function)
ptl_gm_component.c:249: error: (Each undeclared identifier is
reported only once
ptl_gm_component.c:249: error: for each function it appears in.)
make[4]: *** [ptl_gm_component.lo] Error 1
make[4]: *** Waiting for unfinished jobs
make[3]: *** [all-recursive] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1

Could this be related to the version of the gm package we have
installed?  Any insight would be helpful.
Brock






Re: [OMPI users] Won't run with 1.0.2

2006-05-25 Thread Tom Rosmond

I didn't do a formal uninstall as you demonstrate below, but instead went
into the 'prefix' directory and renamed 'bin', 'lib', 'etc', 'include', and 'share'
before running the 1.0.2 build and install.  That way I didn't blow up my
1.0.1 installation, and it was easy to switch back in case the 1.0.2 install
didn't work.  I was sure this procedure was legitimate, but maybe I missed
something?  As far as I know all executable, library, and include paths
are identical between the two, so what else should I change?

T. Rosmond


Michael Kluskens wrote:

One possibility is that you didn't properly uninstall version 1.0.1  
before installing version 1.0.2 & 1.0.3.


There was a change with some of the libraries a while back that
caused me a similar problem.  An install of a later version of OpenMPI
does not remove certain libraries from 1.0.1.


You absolutely have to:

cd openmpi1.0.1
sudo make uninstall
cd ../openmpi1.0.2
sudo make install

I have had no trouble in the past with PGF90 version 6.1-3 and
OpenMPI 1.1a on a dual Opteron 1.4 GHz machine running Debian Linux.


Michael

On May 24, 2006, at 7:43 PM, Tom Rosmond wrote:

 

After using OPENMPI Ver 1.0.1 for several months without trouble, last week
I decided to upgrade to Ver 1.0.2.  My primary motivation was curiosity, to see
if there was any performance benefit.  To my surprise, several of my F90
applications refused to run with the newer version.  I also tried Ver 1.0.3a1r10036
with the same result.  In all 3 cases I configured the build identically.  Even that
old chestnut 'hello.f' will not run with the newer versions.  I ran it in the
totalview debugger and can see that it is hanging in the MPI initialization code
before it gets to the F90 application.

I am using the Ver 6.1 PGF90 64bit compiler on a Linux Opteron workstation with
2 dual core 2.4 GHZ processors.  If you think it is worthwhile to pursue this
problem further, what could I send you to help troubleshoot the problem?
Meanwhile I have gone back to 1.0.1, which works fine on everything.




 



Re: [OMPI users] Problems with myrinet support on PPC

2006-05-25 Thread George Bosilca
The fix is quite simple. You can simply modify ompi/mca/ptl/gm/ptl_gm_component.c
at line 249 and replace PTHREAD_CANCELED with 0, and your problem will be
solved.  This fix was committed yesterday; it will be in the next 1.0.3 release,
or you can grab it from the nightly build section on our website.


There is another way, but this approach will completely disable the  
GM support. You can specify --without-gm on the configure line.
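
Concretely, either of these should work (a sketch only -- the perl one-liner
assumes PTHREAD_CANCELED appears nowhere else in that file, so check line 249
before and after, or just edit it by hand; the configure line is your original
one with --with-gm replaced by --without-gm):

  # option 1: apply the one-line fix to your 1.0.2 tree
  perl -pi -e 's/PTHREAD_CANCELED/0/' ompi/mca/ptl/gm/ptl_gm_component.c

  # option 2: rebuild without GM support entirely
  ./configure --prefix=/home/software/openmpi-1.0.2 --with-tm=/home/software/torque-2.1.0p0 --without-gm FC=/opt/ibmcmp/xlf/8.1/bin/xlf90 F77=/opt/ibmcmp/xlf/8.1/bin/xlf CPPFLAGS=-I/opt/gm/include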


  Thanks,
george.

On May 25, 2006, at 12:58 PM, Brock Palen wrote:


Is there a way to disable pthreads when building with gm?  I can
build with just tcp just fine.
Brock

On May 25, 2006, at 3:34 PM, George Bosilca wrote:


That's kind of funny ... Looks like PTHREAD_CANCELED was missing
from the pthread.h on most of the Linux distributions until the
beginning of 2002 (http://sourceware.org/ml/newlib/2002/msg00538.html).
And it looks like it is still missing from the Mac OS X 10.3.9 pthread.h
(http://lists.apple.com/archives/darwin-development/2004/Feb/msg00150.html).
Anyway, we can remove it, as the ptls are not used in the 1.0.2 release.

   Thanks,
 george.

On May 24, 2006, at 5:10 PM, Brock Palen wrote:


I'm getting the following error when trying to build OMPI on OSX
10.3.9 with myrinet support.  The libs are in
/opt/gm/lib
Includes:
/opt/gm/include

Below is my configure line and the error:
./configure --prefix=/home/software/openmpi-1.0.2 --with-tm=/home/software/torque-2.1.0p0 --with-gm=/opt/gm FC=/opt/ibmcmp/xlf/8.1/bin/xlf90 F77=/opt/ibmcmp/xlf/8.1/bin/xlf CPPFLAGS=-I/opt/gm/include

gcc -DHAVE_CONFIG_H -I. -I. -I../../../../include -I../../../../include -I/opt/gm/include -I../../../../include -I../../../.. -I../../../.. -I../../../../include -I../../../../opal -I../../../../orte -I../../../../ompi -I/opt/gm/include -D_REENTRANT -O3 -DNDEBUG -fno-strict-aliasing -MT ptl_gm.lo -MD -MP -MF .deps/ptl_gm.Tpo -c ptl_gm.c -fno-common -DPIC -o .libs/ptl_gm.o
gcc -DHAVE_CONFIG_H -I. -I. -I../../../../include -I../../../../include -I/opt/gm/include -I../../../../include -I../../../.. -I../../../.. -I../../../../include -I../../../../opal -I../../../../orte -I../../../../ompi -I/opt/gm/include -D_REENTRANT -O3 -DNDEBUG -fno-strict-aliasing -MT ptl_gm_priv.lo -MD -MP -MF .deps/ptl_gm_priv.Tpo -c ptl_gm_priv.c -fno-common -DPIC -o .libs/ptl_gm_priv.o
ptl_gm_component.c: In function `mca_ptl_gm_thread_progress':
ptl_gm_component.c:249: error: `PTHREAD_CANCELED' undeclared (first
use in this function)
ptl_gm_component.c:249: error: (Each undeclared identifier is
reported only once
ptl_gm_component.c:249: error: for each function it appears in.)
make[4]: *** [ptl_gm_component.lo] Error 1
make[4]: *** Waiting for unfinished jobs
make[3]: *** [all-recursive] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1

Could this be related to the version of the gm package we have
installed?  Any insight would be helpful.
Brock




[OMPI users] multicast and broadcast

2006-05-25 Thread Brock Palen
We are trying to track down a problem with our network, and the question
has come up whether it's possible that the MPI lib (OMPI) could be using
either multicast or broadcast (or both).

In what cases could multicast traffic be seen if OMPI is our MPI lib?

Brock


Re: [OMPI users] multicast and broadcast

2006-05-25 Thread Brian W. Barrett

On Thu, 25 May 2006, Brock Palen wrote:


We are trying to track down a problem with our network, and the question
has come up whether it's possible that the MPI lib (OMPI) could be using
either multicast or broadcast (or both).
In what cases could multicast traffic be seen if OMPI is our MPI lib?


At this time, Open MPI only uses point-to-point communication -- it does 
not use multicast or broadcast.  That may change in the long term, but not 
until after the next release cycle.
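
If you want to double-check what is actually on the wire while a job runs, a
capture filtered to multicast/broadcast frames should stay quiet (this assumes
tcpdump is available and that eth0 is the interface your MPI traffic uses):

  # show only multicast and broadcast frames on the interface the job uses
  tcpdump -n -i eth0 'ether multicast or ether broadcast'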


Brian


Re: [OMPI users] Fortran support not installing

2006-05-25 Thread Brian W. Barrett

On Wed, 24 May 2006, Terry Reeves wrote:


Here is the output for both g95 and gfortran

From the output you sent, you ran "./configure FC=g95".  Configure did
not find a valid F77 compiler, and therefore skipped both the F77 and
F90 bindings.

Can you try:

   ./configure FC=g95 F77=g95
and/or
   ./configure FC=gfortran F77=gfortran

You said you tried the former but it died in configure -- can you send
the configure output and config.log from that run?


There are two separate issues at work here.

Unfortunately, g95 as installed is broken and requires the -lSystemStubs 
flag to all link commands in order to work properly.  Normally, one could 
just add -lSystemStubs to LDFLAGS and everything would work fine. 
Unfortunately, there is a bug in the configure tests for Open MPI 1.0.x 
that prevents this from working with Fortran 90.  Jeff suggested a 
workaround (adding -l in FCFLAGS) that's a really bad idea.  A better 
solution would be to use the 1.1 betas (available on the Open MPI web 
page) or to get a copy of g95 that properly links (it has been suggested 
that the one from Fink does this properly).


The issue with gfortran is much simpler -- it wasn't found in your path 
when you ran configure.  Make sure you can run 'gfortran -V' and get the 
expected version output, then try re-running configure.  My guess is that 
your problems will go away. You can also specify a full path to gfortran, 
like:


  ./configure FC=/usr/local/foo/bin/gfortran  F77=/usr/local/foo/bin/gfortran

Just make sure you put the right path in ;).

Hope this helps,

Brian

Re: [OMPI users] Fortran support not installing

2006-05-25 Thread Jeff Squyres (jsquyres)
Brian gave a much more complete description of the problem than me;
thanks.

We'll have this fixed in v1.0.3 (and later) such that you can use
LDFLAGS / LIBS, as expected, and you will not have to pass -l values
through FCFLAGS (which is quite definitely not the Right place to pass
in -l values).
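
Once that fix is in, a configure invocation like this should behave as expected
(a sketch, assuming the same g95 setup discussed earlier in the thread):

  # put -lSystemStubs where it belongs: in LIBS, not FCFLAGS
  ./configure FC=g95 F77=g95 LIBS=-lSystemStubs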

Note, too, that the 1.1 betas currently do not have good Fortran 90
support -- in an attempt to support the Fortran compilers belonging to
new members of the OMPI team, we managed to break the F90 bindings in
recent betas.  :-(  We're working now to fix that and hope to have a
beta shortly that includes proper F90 support (much better than it was
in 1.0, actually).


> -Original Message-
> From: users-boun...@open-mpi.org 
> [mailto:users-boun...@open-mpi.org] On Behalf Of Brian W. Barrett
> Sent: Thursday, May 25, 2006 6:26 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Fortran support not installing
> 
> On Wed, 24 May 2006, Terry Reeves wrote:
> 
> There are two separate issues at work here.
> 
> Unfortunately, g95 as installed is broken and requires the 
> -lSystemStubs 
> flag to all link commands in order to work properly.  
> Normally, one could 
> just add -lSystemStubs to LDFLAGS and everything would work fine. 
> Unfortunately, there is a bug in the configure tests for Open 
> MPI 1.0.x 
> that prevents this from working with Fortran 90.  Jeff suggested a 
> workaround (adding -l in FCFLAGS) that's a really bad idea.  A better 
> solution would be to use the 1.1 betas (available on the Open MPI web 
> page) or to get a copy of g95 that properly links (it has 
> been suggested 
> that the one from Fink does this properly).
> 
> The issue with gfortran is much simpler -- it wasn't found in 
> your path 
> when you ran configure.  Make sure you can run 'gfortran -V' 
> and get the 
> expected version output, then try re-running configure.  My 
> guess is that 
> your problems will go away. You can also specify a full path 
> to gfortran, 
> like:
> 
>./configure FC=/usr/local/foo/bin/gfortran  
> F77=/usr/local/foo/bin/gfortran
> 
> Just make sure you put the right path in ;).
> 
> Hope this helps,
> 
> Brian
> 



Re: [OMPI users] pallas assistance ?

2006-05-25 Thread Jeff Squyres (jsquyres)
In further discussions with other OMPI team members, I double checked
(should have checked this in the beginning, sorry): OFED 1.0rc4 does not
support 64 bit on PPC64 platforms; it only supports 32 bit on PPC64
platforms.
 
Mellanox says that 1.0rc5 (cut this morning) supports 64 bit on PPC64
platforms.
 
Can you try upgrading?  Sorry for all the hassle.  :-(
 




From: users-boun...@open-mpi.org
[mailto:users-boun...@open-mpi.org] On Behalf Of Paul
Sent: Thursday, May 25, 2006 11:51 AM
To: Open MPI Users
Subject: Re: [OMPI users] pallas assistance ?


Okay, I rebuilt using those diffs. Currently I am still having
issues with pallas however. That being said I think my issue is more
with compiling/linking pallas. Here is my pallas make_$arch file:

MPI_HOME = /opt/ompi/ 
MPI_INCLUDE = $(MPI_HOME)/include
LIB_PATH =
LIBS =
CC = ${MPI_HOME}/bin/mpicc
OPTFLAGS = -O
CLINKER = ${CC}
LDFLAGS = -m64
CPPFLAGS = -m64

Again ldd'ing the IMB-MPI1 file works fine, and the compilation
completes okay. 


On 5/25/06, Jeff Squyres (jsquyres)  wrote: 

Gleb just committed some fixes for the PPC64 issue last
night (https://svn.open-mpi.org/trac/ompi/changeset/10059 ).  It should
only affect the eager RDMA issues, but it would be a worthwhile
datapoint if you could test with (i.e., specify no MCA parameters on
your mpirun command line, so it should use RDMA by default).
 
I'm waiting for my own PPC64 machine to be reconfigured
so that I can test again; can you try with r10059 or later?






From: users-boun...@open-mpi.org
[mailto:users-boun...@open-mpi.org] On Behalf Of Paul

Sent: Wednesday, May 24, 2006 9:35 PM
To: Open MPI Users
Subject: Re: [OMPI users] pallas assistance ?




It makes no difference on my end. Exact same error.


On 5/24/06, Andrew Friedley 
wrote: 

Paul wrote:
> Somebody call orkin. ;-P
> Well I tried running it with things set as
noted in the bug report. 
> However it doesnt change anything on my end. I
am willing to do any
> verification you guys need (time permitting
and all). Anything special
> needed to get mpi_latency to compile ? I can
run that to verify that 
> things are actually working on my end.
>
> [root@something ompi]#  
Shouldn't the parameter be '--mca
btl_openib_use_eager_rdma'?

> [root@something ompi]# /opt/ompi/bin/mpirun
--mca btl_openmpi_use_srq 1 
> -np 2 -hostfile machine.list ./IMB-MPI1

Same here - '--mca btl_openib_use_srq'

Andrew










Re: [OMPI users] pallas assistance ?

2006-05-25 Thread Paul

Already done. I grabbed the rc5 this morning and rebuilt everything. I am
still having the same issue. I sent a message to the openib list about it. I
won't cross-spam this list with that message. I was wondering if you have
access to that list or not? I can send you a copy if you need it. The
summary is that there are numerous apparent issues, though I have made a
little headway with regards to what the issues are; no guarantees that I am
right in my guessing.

It's not a problem. At the moment I have the resources to chase it. Just let
me know what needs to be done.


On 5/25/06, Jeff Squyres (jsquyres)  wrote:


 In further discussions with other OMPI team members, I double checked
(should have checked this in the beginning, sorry): OFED 1.0rc4 does
not support 64 bit on PPC64 platforms; it only supports 32 bit on PPC64
platforms.

Mellanox says that 1.0rc5 (cut this morning) supports 64 bit on PPC64
platforms.

Can you try upgrading?  Sorry for all the hassle.  :-(


 --
*From:* users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] *On
Behalf Of *Paul
*Sent:* Thursday, May 25, 2006 11:51 AM

*To:* Open MPI Users
*Subject:* Re: [OMPI users] pallas assistance ?

Okay, I rebuilt using those diffs. Currently I am still having issues with
pallas however. That being said I think my issue is more with
compiling/linking pallas. Here is my pallas make_$arch file:

MPI_HOME = /opt/ompi/
MPI_INCLUDE = $(MPI_HOME)/include
LIB_PATH =
LIBS =
CC = ${MPI_HOME}/bin/mpicc
OPTFLAGS = -O
CLINKER = ${CC}
LDFLAGS = -m64
CPPFLAGS = -m64

Again ldd'ing the IMB-MPI1 file works fine, and the compilation completes
okay.

On 5/25/06, Jeff Squyres (jsquyres)  wrote:
>
>  Gleb just committed some fixes for the PPC64 issue last night 
(https://svn.open-mpi.org/trac/ompi/changeset/10059
> ).  It should only affect the eager RDMA issues, but it would be a
> worthwhile datapoint if you could test with (i.e., specify no MCA
> parameters on your mpirun command line, so it should use RDMA by default).
>
> I'm waiting for my own PPC64 machine to be reconfigured so that I can
> test again; can you try with r10059 or later?
>
>  --
>  *From:* users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org]
> *On Behalf Of *Paul
> *Sent:* Wednesday, May 24, 2006 9:35 PM
> *To:* Open MPI Users
> *Subject:* Re: [OMPI users] pallas assistance ?
>
>  It makes no difference on my end. Exact same error.
>
> On 5/24/06, Andrew Friedley  wrote:
> >
> > Paul wrote:
> > > Somebody call orkin. ;-P
> > > Well I tried running it with things set as noted in the bug report.
> > > However it doesnt change anything on my end. I am willing to do any
> > > verification you guys need (time permitting and all). Anything
> > special
> > > needed to get mpi_latency to compile ? I can run that to verify that
> >
> > > things are actually working on my end.
> > >
> > > [root@something ompi]#
> > Shouldn't the parameter be '--mca btl_openib_use_eager_rdma'?
> >
> > > [root@something ompi]# /opt/ompi/bin/mpirun --mca
> > btl_openmpi_use_srq 1
> > > -np 2 -hostfile machine.list ./IMB-MPI1
> >
> > Same here - '--mca btl_openib_use_srq'
> >
> > Andrew





Re: [OMPI users] Fortran support not installing

2006-05-25 Thread Terry Reeves
Hello

I tried configure with FCFLAGS=-lSystemStubs and with both FCFLAGS=-lSystemStubs
and LDFLAGS=-lSystemStubs. Again it died during configure both times. I can
provide configure output if desired.

I also decided to try version 1.1a7. With LDFLAGS=-lSystemStubs, with or without
FCFLAGS=-lSystemStubs, it gets through configure but fails in "make all". Since
that seems to be progress I have included that output.

ompi-output.tar.gz
Description: GNU Zip compressed data
Date: Thu, 25 May 2006 10:02:08 -0400
From: "Jeff Squyres (jsquyres)"
Subject: Re: [OMPI users] Fortran support not installing
To: "Open MPI Users"
Message-ID:
Content-Type: text/plain; charset="us-ascii"

I actually had to set FCFLAGS, not LDFLAGS, to get arbitrary flags
passed down to the Fortran tests in configure.

Can you try that?  (I'm not 100% sure -- you may need to specify LDFLAGS
*and* FCFLAGS...?)

We have made substantial improvements to the configure tests with
regards to the MPI F90 bindings in the upcoming 1.1 release.  Most of
the work is currently off in a temporary branch in our code repository
(meaning that it doesn't show up yet in the nightly trunk tarballs), but
it will hopefully be brought back to the trunk soon.

Terry Reeves
2-1013 - reeve...@osu.edu
Computing Services
Office of Information Technology
The Ohio State University