[OMPI users] Getting default hostfile on Open MPI 1.3

2008-07-16 Thread Daniel Felix Ferber

Hi,

On Open MPI 1.2, I used the following command to get the default host file:
ompi_info -a --parseable | grep rds_hostfile_path

Now, on Open MPI 1.3, the parameter does not seem to exist anymore. I 
checked that there is another parameter 
(mca:orte:base:param:orte_default_hostfile), but its value is always empty.
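
For reference, the lookup above can be wrapped in a small filter so the same check works on either release. This is only a sketch: the helper name `hostfile_params` is made up here, and it assumes each parseable line ends in `:value:<setting>`, which may differ between versions.

```shell
# Hypothetical helper: summarize hostfile-related MCA parameters from
# `ompi_info -a --parseable` output read on stdin.
# Assumed line layout: ...:param:<name>:value:<setting>
hostfile_params() {
    awk -F: '
        tolower($0) ~ /hostfile/ {
            found = 1
            for (i = 2; i < NF; i++) {
                if ($i == "value") {
                    # Rejoin the setting in case it contains colons itself.
                    val = $(i + 1)
                    for (j = i + 2; j <= NF; j++) val = val ":" $j
                    printf "%s = %s\n", $(i - 1), (val == "" ? "(empty)" : val)
                    break
                }
            }
        }
        END { if (!found) print "no hostfile parameter found" }'
}

# Typical use (requires ompi_info on PATH):
#   ompi_info -a --parseable | hostfile_params
```

An `(empty)` result would match the behaviour described above, where the parameter exists but has no value set.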


Which file should I use for a system-wide default host file? Is there
another way to query the path for this default host file? Or is there a
command to query which hosts MPI is going to use by default?


Best regards,
Daniel Felix Ferber


[OMPI users] Open MPI on Cray XT4

2008-07-16 Thread Adam Jundt

Hi all,

I have been working on getting a nightly tarball of Open MPI to build on
a Cray XT4 system running CNL. I found the following post on the forum:
http://www.open-mpi.org/community/lists/users/2007/09/4059.php. I had to
modify the configure options a little (added another include directory
to CFLAGS, and inserted the '--disable-mpi-f77' flag) to get it to build
for me, here is what I used:

./configure CC=/opt/xt-pe/default/bin/snos64/linux-pgcc \
  CXX=/opt/xt-pe/default/bin/snos64/linux-pgCC \
  F77=/opt/xt-pe/default/bin/snos64/linux-pgftn \
  FC=/opt/xt-pe/default/bin/snos64/linux-pgf90 \
  CFLAGS="-I/opt/xt-pe/default/include/ -I/opt/xt-catamount/default/catamount/linux/include/" \
  CPPFLAGS=-I/opt/xt-pe/default/include/ \
  FCFLAGS=-I/opt/xt-pe/default/include/ \
  FFLAGS=-I/opt/xt-pe/default/include/ \
  LDFLAGS=-L/opt/xt-mpt/default/lib/snos64/ \
  LIBS="-lpct -lalpslli -lalpsutil" \
  --build=x86_64-unknown-linux-gnu \
  --host=x86_64-cray-linux-gnu \
  --with-platform=/lus/nid8/jundt/openmpi-1.3a1r18788/contrib/platform/cray_xt3_romio \
  --with-io-romio-flags=--disable-aio \
  build_alias=x86_64-unknown-linux-gnu \
  host_alias=x86_64-cray-linux-gnu \
  --enable-ltdl-convenience \
  --no-recursion --disable-mpi-f77 --prefix=~/OpenMPI

It builds without error (minus a warning that it doesn't recognize the
--enable-ltdl-convenience flag). The problem comes when trying to
compile a program and link with Open MPI. Here is what I get:

> ~/OpenMPI/bin/mpicc test.c
~/OpenMPI/lib/libopen-rte.a(session_dir.o): In function
`orte_session_dir_get_name':
session_dir.c:(.text+0x7e): warning: Using 'getpwuid' in statically
linked applications requires at runtime the shared libraries from the
glibc version used for linking
~/OpenMPI/lib/libmpi.a(btl_tcp_component.o): In function
`mca_btl_tcp_component_create_listen':
btl_tcp_component.c:(.text+0x11c0): warning: Using 'getaddrinfo' in
statically linked applications requires at runtime the shared libraries
from the glibc version used for linking
~/OpenMPI/lib/libopen-pal.a(timer_catamount_component.o): In function
`opal_timer_catamount_open':
timer_catamount_component.c:(.text+0x6): undefined reference to `__cpu_mhz'

Looking into timer_catamount_component.c, __cpu_mhz is defined within
the  file (which it should have already pulled in).
I realize that this is a very specific question, but I was curious if
anyone else had successfully gotten Open MPI to work on a similar
system, and if so, what configure options were used? If not, is anyone
aware of how to circumvent the problem?
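
One way to narrow down an undefined reference like this is to ask `nm` whether any object in the archive actually defines the symbol or only references it. The following is a minimal sketch, assuming `nm` output in which the type letter `U` marks an undefined symbol; the helper name `symbol_status` is hypothetical.

```shell
# Hypothetical helper: classify one symbol from `nm` output read on stdin.
# Prints "defined" if any object defines it, "undefined" if it is only
# referenced (type letter U), or "absent" if it does not appear at all.
symbol_status() {
    awk -v sym="$1" '
        $NF == sym {
            # The type letter is the field just before the symbol name.
            if ($(NF - 1) == "U") undef = 1; else def = 1
        }
        END {
            if (def) print "defined"
            else if (undef) print "undefined"
            else print "absent"
        }'
}

# Typical use against the archive named in the linker error:
#   nm ~/OpenMPI/lib/libopen-pal.a | symbol_status __cpu_mhz
```

A result of `undefined` would mean the archive only references the symbol, so some other library on the link line has to provide it.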

By the way, I did try modifying the file timer_catamount_component.c to
not reference __cpu_mhz to see the result, and the program is able to
successfully compile, but hangs upon execution, i.e.:

> ~/OpenMPI/bin/mpicc test.c
~/OpenMPI/lib/libopen-rte.a(session_dir.o): In function
`orte_session_dir_get_name':
session_dir.c:(.text+0x7e): warning: Using 'getpwuid' in statically
linked applications requires at runtime the shared libraries from the
glibc version used for linking
~/OpenMPI/lib/libmpi.a(btl_tcp_component.o): In function
`mca_btl_tcp_component_create_listen':
btl_tcp_component.c:(.text+0x11c0): warning: Using 'getaddrinfo' in
statically linked applications requires at runtime the shared libraries
from the glibc version used for linking
> aprun -n 2 ./a.out
... program hangs...

Thanks!
Adam


Re: [OMPI users] Open MPI on Cray XT4

2008-07-16 Thread Brian W. Barrett

On Wed, 16 Jul 2008, Adam Jundt wrote:


> I have been working on getting a nightly tarball of Open MPI to build on
> a Cray XT4 system running CNL. I found the following post on the forum:
> http://www.open-mpi.org/community/lists/users/2007/09/4059.php. I had to
> modify the configure options a little (added another include directory
> to CFLAGS, and inserted the '--disable-mpi-f77' flag) to get it to build
> for me, here is what I used:
>
> ./configure CC=/opt/xt-pe/default/bin/snos64/linux-pgcc
> CXX=/opt/xt-pe/default/bin/snos64/linux-pgCC
> F77=/opt/xt-pe/default/bin/snos64/linux-pgftn
> FC=/opt/xt-pe/default/bin/snos64/linux-pgf90
> CFLAGS="-I/opt/xt-pe/default/include/
> -I/opt/xt-catamount/default/catamount/linux/include/"
> CPPFLAGS=-I/opt/xt-pe/default/include/
> FCFLAGS=-I/opt/xt-pe/default/include/
> FFLAGS=-I/opt/xt-pe/default/include/
> LDFLAGS=-L/opt/xt-mpt/default/lib/snos64/ LIBS="-lpct -lalpslli
> -lalpsutil" --build=x86_64-unknown-linux-gnu
> --host=x86_64-cray-linux-gnu
> --with-platform=/lus/nid8/jundt/openmpi-1.3a1r18788/contrib/platform/cray_xt3_romio
> --with-io-romio-flags=--disable-aio build_alias=x86_64-unknown-linux-gnu
> host_alias=x86_64-cray-linux-gnu --enable-ltdl-convenience
> --no-recursion --disable-mpi-f77 --prefix=~/OpenMPI


I don't think it's a huge deal, but I think things will be a bit more sane 
if you change the --with-platform argument to cray_xt_cnl_romio instead of 
cray_xt3_romio (which is really targeting Catamount instead of CNL).  One 
of the ORNL guys can probably be more helpful than I can here, as I'm only 
familiar with building on Red Storm / Catamount.
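
As a concrete reading of that suggestion, only the last path component of the --with-platform argument changes. The sketch below assumes cray_xt_cnl_romio sits in the same contrib/platform directory as cray_xt3_romio, which has not been confirmed for this tarball; the variable names are illustrative only.

```shell
# Platform file from the configure line quoted above (targets Catamount).
platform_old=/lus/nid8/jundt/openmpi-1.3a1r18788/contrib/platform/cray_xt3_romio

# Assumption: the CNL platform file lives next to it in contrib/platform.
platform_new=$(dirname "$platform_old")/cray_xt_cnl_romio

# Warn rather than fail if the assumed file is not there.
[ -f "$platform_new" ] || echo "note: $platform_new not found; check contrib/platform" >&2

# Argument to pass to ./configure in place of the original --with-platform.
echo "--with-platform=$platform_new"
```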



> ~/OpenMPI/lib/libopen-pal.a(timer_catamount_component.o): In function
> `opal_timer_catamount_open':
> timer_catamount_component.c:(.text+0x6): undefined reference to `__cpu_mhz'
>
> Looking into timer_catamount_component.c, __cpu_mhz is defined within
> the  file (which it should have already pulled in).
> I realize that this is a very specific question, but I was curious if
> anyone else had successfully gotten Open MPI to work on a similar
> system, and if so, what configure options were used? If not, is anyone
> aware of how to circumvent the problem?
>
> By the way, I did try modifying the file timer_catamount_component.c to
> not reference __cpu_mhz to see the result, and the program is able to
> successfully compile, but hangs upon execution, i.e.:


That's a weird result.  The configure test for the timer catamount 
component checks to see if __cpu_mhz is defined when linking.  Can you 
send me (off list is probably best) the config.log generated by configure? 
That component was added to the trunk/v1.3 branch only in the last month, 
which is probably why no one on CNL noticed yet (obviously it works great 
on Catamount).  I'm not really familiar with CNL -- does 
catamount/dclock.h exist on a standard CNL setup?
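
For readers unfamiliar with this kind of configure check, the idea is to compile and link a tiny program that references the symbol and see whether the linker resolves it. The function below is a hypothetical re-creation for illustration, not the actual Open MPI test; `can_link_symbol` and the temporary file names are made up here.

```shell
# Hypothetical link check: returns 0 only if a program that references the
# given symbol compiles and links with the current CC/CFLAGS/LDFLAGS/LIBS.
can_link_symbol() {
    sym=$1
    tmpdir=$(mktemp -d) || return 2

    # Declare the symbol without defining it, so only the linker can supply it.
    printf 'extern unsigned int %s;\nint main(void) { return (int) %s; }\n' \
        "$sym" "$sym" > "$tmpdir/conftest.c"

    # Flags are left unquoted on purpose so they split into separate words.
    ${CC:-cc} ${CFLAGS:-} ${LDFLAGS:-} "$tmpdir/conftest.c" \
        -o "$tmpdir/conftest" ${LIBS:-} > "$tmpdir/conftest.log" 2>&1
    rc=$?

    rm -rf "$tmpdir"
    return $rc
}

# Typical use with the compiler from the configure line in this thread:
#   CC=/opt/xt-pe/default/bin/snos64/linux-pgcc can_link_symbol __cpu_mhz \
#       && echo "symbol resolves at link time"
```

If a check like this passes during configure but the final application link still fails, the two links are probably using different libraries or flags, which is what config.log should reveal.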



> > ~/OpenMPI/bin/mpicc test.c
> ~/OpenMPI/lib/libopen-rte.a(session_dir.o): In function
> `orte_session_dir_get_name':
> session_dir.c:(.text+0x7e): warning: Using 'getpwuid' in statically
> linked applications requires at runtime the shared libraries from the
> glibc version used for linking
> ~/OpenMPI/lib/libmpi.a(btl_tcp_component.o): In function
> `mca_btl_tcp_component_create_listen':
> btl_tcp_component.c:(.text+0x11c0): warning: Using 'getaddrinfo' in
> statically linked applications requires at runtime the shared libraries
> from the glibc version used for linking
> > aprun -n 2 ./a.out
> ... program hangs...


I'm afraid I can't help a whole lot here.  However, there are some 
differences between how Open MPI initializes Portals between CNL and 
Catamount.  Since you configured for Catamount, it's possible that's the 
cause of the hang.  Again, the ORNL people would probably know better than 
I would.



Brian