Re: [OMPI users] Unable to find any HCAs ..

2007-07-05 Thread Graham Jenkins



On Jul 4, 2007, at 8:21 PM, Graham Jenkins wrote:

  

I'm using the openmpi-1.1.1-5.el5.x86_64 RPM on a Scientific Linux 5
cluster, with no installed HCAs. And a simple MPI job submitted to  
that
cluster runs OK .. except that it issues messages for each node  
like the

one shown below.  Is there some way I can suppress these, perhaps by an
appropriate entry in /etc/openmpi-mca-params.conf ?

--
libibverbs: Fatal: couldn't open sysfs class 'infiniband_verbs'.
-- 



Yes, there is a line you can add to /etc/openmpi-mca-params.conf:

btl=^openib

This will tell Open MPI to use any available BTLs (our network transport
layers) except openib.
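
For reference, the same exclusion can also be given per-run on the mpirun
command line rather than in the params file (a minimal sketch; replace the
process count and ./a.out with your own job):

   mpirun --mca btl ^openib -np 4 ./a.out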



  

It works! Thanks for that :)


--
Graham Jenkins
Senior Software Specialist, E-Research

Email: graham.jenk...@its.monash.edu.au
Tel:   +613 9905-5942
Mob:   +614 4850-2491



Re: [OMPI users] mpi with icc, icpc and ifort :: segfault (Jeff Squyres)

2007-07-05 Thread Jeff Squyres

Yoinks -- that's not good.

I suspect that our included memory manager is borking things up  
(Brian: can you comment?).  Can you try configuring OMPI
--without-memory-manager?
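
A minimal sketch of such a build, assuming the Intel compilers from this
thread and a placeholder install prefix (adjust both to your site):

   ./configure --without-memory-manager --prefix=/usr/local/openmpi \
       CC=icc CXX=icpc F77=ifort FC=ifort
   make all install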



On Jul 4, 2007, at 5:52 PM, Ricardo Reis wrote:




From: Jeff Squyres 

Can you be a bit more specific than "it dies"?  Are you talking about
mpif90/mpif77, or your app?


Sorry, stupid me. When executing mpif90 or mpif77 I get a segfault
and it doesn't compile. I've tried both with and without input
(i.e., giving it something to compile or just executing it and
expecting to see the normal "no files given" kind of message). The
Intel suite compiled Open MPI without problems.


Going through the posting guidelines:

5005.0 $ ompi_info --all
Segmentation fault

(added an strace of this execution in attach).
(also added config.log and make.log)

I have compiled LAM 7.1.3 with this set of compilers and have no
problems at all.


 thanks,

 Ricardo Reis

 'Non Serviam'

 PhD student @ Lasef
 Computational Fluid Dynamics, High Performance Computing, Turbulence
 

 &

 Cultural Instigator @ Rádio Zero
 http://radio.ist.utl.pt







--
Jeff Squyres
Cisco Systems




Re: [OMPI users] Absoft compilation problem

2007-07-05 Thread Jeff Squyres

On Jul 2, 2007, at 7:31 PM, Yip, Elizabeth L wrote:

I downloaded openmpi-1.2.3rc2r15098 from your "nightly snapshot",  
same problem.
I notice in version 1.1.2, you generate libmpi_f90.a instead of  
the .so files.


Brian clarified for me off-list that we use the same Libtool (LT) for
nightly trunk and OMPI 1.2.x tarballs.


In 1.1.2, you're correct that we only made static F90 libraries.

Can you clarify/confirm:

- 1.1.2 works for you (static F90 library)
- 1.2.1 works for you
- 1.2.3 does not work for you

If this is correct, then something is really, really weird.




-Original Message-
From: Jeff Squyres [mailto:jsquy...@cisco.com]
Sent: Sun 7/1/2007 4:03 AM
To: Open MPI Users
Subject: Re: [OMPI users] Absoft compilation problem

I unfortunately do not have access to an Absoft compiler to test this
with; it looks like GNU Libtool is getting the wrong arguments to
pass to the f95 compiler to build a shared library.

A quick workaround for this issue would be to disable the MPI F90
bindings with the --disable-mpi-f90 switch to configure.
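
For example, a sketch of such a configure line (combine it with whatever
other options you normally pass):

   ./configure --disable-mpi-f90 ...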

Could you try the Open MPI nightly trunk tarball and see if it works
there?  We use a different version of Libtool to make those tarballs.


On Jun 30, 2007, at 2:09 AM, Yip, Elizabeth L wrote:

>
> The attachment shows my problems when I tried to compile openmpi
> 1.2.3 with absoft 95
> (Absoft 64-bit Fortran 95 9.0 with Service Pack 1).  I have similar
> problems with version 1.2.1, but
> no problem with version 1.1.2.
>
> Elizabeth Yip
> 
> 


--
Jeff Squyres
Cisco Systems


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





--
Jeff Squyres
Cisco Systems



[OMPI users] TCP Nagle algorithm and latency

2007-07-05 Thread Biagio Cosenza

Hi,
I'm using Open MPI in a real-time rendering system and I need accurate
latency control.

Consider the 'Nagle' optimization implemented in the TCP/IP protocol, which
delays small packets for a short time period to possibly combine them with
successive packets, generating network-friendly packet sizes.
This optimization can result in better throughput when lots of small
packets are sent, but it can also lead to considerable latencies if packets
get delayed several times.
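
For context, this is what Nagle control looks like at the plain-sockets
level: a minimal C sketch using the standard TCP_NODELAY option. It is not
an Open MPI interface; whether Open MPI exposes this per connection is
exactly the question below.

   /* Toggle Nagle on an already-connected TCP socket.
      TCP_NODELAY = 1 disables Nagle (lowest latency);
      TCP_NODELAY = 0 leaves Nagle on (better small-packet throughput). */
   #include <netinet/in.h>
   #include <netinet/tcp.h>
   #include <sys/socket.h>
   #include <stdio.h>

   static int set_nagle(int sockfd, int enable_nagle)
   {
       int nodelay = enable_nagle ? 0 : 1;
       if (setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY,
                      &nodelay, sizeof(nodelay)) != 0) {
           perror("setsockopt(TCP_NODELAY)");
           return -1;
       }
       return 0;
   }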



For example, I want to turn the Nagle optimization on for sockets on which
updated scene data is streamed to the clients, as throughput is the main
issue there.
On the other hand, I want to turn it off for, e.g., sockets used to send
tiles to the clients, as this has to be done with an absolute minimum of
latency.

Can I do this with Open MPI?
Am I using the wrong tool?


Thanks in advance

Biagio Cosenza
an Italian MSc student
Università di Salerno


Re: [OMPI users] mpi with icc, icpc and ifort :: segfault (Jeff Squyres)

2007-07-05 Thread Ricardo Reis

On Thu, 5 Jul 2007, Jeff Squyres wrote:


Yoinks -- that's not good.

I suspect that our included memory manager is borking things up
(Brian: can you comment?).  Can you try configuring OMPI
--without-memory-manager?


Yes. It compiles and links OK (execution of mpif90).

Trying to run (mpirun -np  ) gives segmentation fault.

ompi_info gives output and then segfaults; ompi_info --all segfaults
immediately.


Added ompi_info log (without --all)
Added strace ompi_info --all log
Added strace mpirun log

 greets,

 Ricardo Reis

 'Non Serviam'

 PhD student @ Lasef
 Computational Fluid Dynamics, High Performance Computing, Turbulence
 

 &

 Cultural Instigator @ Rádio Zero
 http://radio.ist.utl.pt

mpirun.log.bz2
Description: Binary data


ompi_info.log.bz2
Description: Binary data


ompi_info_all.log.bz2
Description: Binary data


Re: [OMPI users] mpi with icc, icpc and ifort :: segfault (Jeff Squyres)

2007-07-05 Thread George Bosilca
There is another piece of information that can be useful in order to  
figure out what's wrong. If you can execute the ompi_info directly in  
gdb, run it until it segfault and then send us the output of "where"  
and "shared" (both are gdb commands). This will give us access to the  
exact location of the segfault, and the list of all shared libraries  
that get loaded by the application.
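
A minimal sketch of that session, assuming ompi_info is in your PATH:

   $ gdb ompi_info
   (gdb) run --all
   ...wait for the SIGSEGV...
   (gdb) where
   (gdb) shared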


  Thanks,
george.

On Jul 5, 2007, at 12:19 PM, Ricardo Reis wrote:


On Thu, 5 Jul 2007, Jeff Squyres wrote:


Yoinks -- that's not good.

I suspect that our included memory manager is borking things up
(Brian: can you comment?).  Can you try configuring OMPI
--without-memory-manager?


Yes. It compiles and links OK (execution of mpif90).

Trying to run (mpirun -np  ) gives segmentation fault.

ompi_info gives output and then segfaults; ompi_info --all
segfaults immediately.


Added ompi_info log (without --all)
Added strace ompi_info --all log
Added strace mpirun log

 greets,

 Ricardo Reis

 'Non Serviam'

 PhD student @ Lasef
 Computational Fluid Dynamics, High Performance Computing, Turbulence
 

 &

 Cultural Instigator @ Rádio Zero
 http://radio.ist.utl.pt

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users






Re: [OMPI users] mpi with icc, icpc and ifort :: segfault (Jeff Squyres)

2007-07-05 Thread Ricardo Reis


As requested:


Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1214711408 (LWP 23581)]
0xb7eb9d98 in opal_event_set ()
   from /usr/local/share/openmpi-1.2.3.icc.ifort/lib/libopen-pal.so.0

(gdb) where
#0  0xb7eb9d98 in opal_event_set ()
   from /usr/local/share/openmpi-1.2.3.icc.ifort/lib/libopen-pal.so.0
#1  0xb7ebb86f in opal_evsignal_init ()
   from /usr/local/share/openmpi-1.2.3.icc.ifort/lib/libopen-pal.so.0
#2  0x0006 in ?? ()
#3  0x0002 in ?? ()
#4  0xb7ebb78a in opal_evsignal_add ()
   from /usr/local/share/openmpi-1.2.3.icc.ifort/lib/libopen-pal.so.0
#5  0x0800 in ?? ()
#6  0xb7ed44b8 in ?? ()
   from /usr/local/share/openmpi-1.2.3.icc.ifort/lib/libopen-pal.so.0
#7  0xb7ebc577 in opal_poll_init ()
   from /usr/local/share/openmpi-1.2.3.icc.ifort/lib/libopen-pal.so.0
#8  0x0023 in ?? ()
#9  0xb7eb9f61 in opal_event_init ()
   from /usr/local/share/openmpi-1.2.3.icc.ifort/lib/libopen-pal.so.0
#10 0x0804d22a in ompi_info::open_components ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

(gdb) shared
Symbols already loaded for /lib/ld-linux.so.2
Symbols already loaded for 
/usr/local/share/openmpi-1.2.3.icc.ifort/lib/libmpi.so.0
Symbols already loaded for 
/usr/local/share/openmpi-1.2.3.icc.ifort/lib/libopen-rte.so.0
Symbols already loaded for 
/usr/local/share/openmpi-1.2.3.icc.ifort/lib/libopen-pal.so.0

Symbols already loaded for /lib/i686/cmov/libnsl.so.1
Symbols already loaded for /lib/i686/cmov/libutil.so.1
Symbols already loaded for /lib/i686/cmov/libm.so.6
Symbols already loaded for /usr/lib/libstdc++.so.6
Symbols already loaded for /lib/libgcc_s.so.1
Symbols already loaded for /opt/intel/cc/10.0.023/lib/libcxaguard.so.5
Symbols already loaded for /lib/i686/cmov/libpthread.so.0
Symbols already loaded for /lib/i686/cmov/libc.so.6
Symbols already loaded for /lib/i686/cmov/libdl.so.2
Symbols already loaded for /opt/intel/cc/10.0.023/lib/libimf.so
Symbols already loaded for /opt/intel/cc/10.0.023/lib/libintlc.so.5


 Ricardo Reis

 'Non Serviam'

 PhD student @ Lasef
 Computational Fluid Dynamics, High Performance Computing, Turbulence
 

 &

 Cultural Instigator @ Rádio Zero
 http://radio.ist.utl.pt

[OMPI users] openmpi fails on mx endpoint busy

2007-07-05 Thread SLIM H.A.

Hello

I have compiled openmpi-1.2.3 with the --with-mx=
configuration option and the gcc compiler. On testing with 4-8 slots I get
an error message that the MX ports are busy:

>mpirun --mca btl mx,self -np 4 ./cpi
[node001:10071] mca_btl_mx_init: mx_open_endpoint() failed with
status=20
[node001:10074] mca_btl_mx_init: mx_open_endpoint() failed with
status=20
[node001:10073] mca_btl_mx_init: mx_open_endpoint() failed with
status=20

--
Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of 
usable components.
... snipped
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or
environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)

--
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
mpirun noticed that job rank 0 with PID 10071 on node node001 exited on
signal 1 (Hangup).


I would not expect MX messages, as communication should not go through
the MX card, should it? (This is a twin dual-core shared-memory node.)
The same happens when testing on 2 nodes, using a hostfile.
I checked the state of the mx card with mx_endpoint_info and mx_info,
they are healthy and free.
What is missing here?

Thanks

Henk



Re: [OMPI users] openmpi fails on mx endpoint busy

2007-07-05 Thread Michael Edwards

If the machine is multi-processor you might want to add the sm btl.  That
cleared up some similar problems for me, though I don't use MX, so your
mileage may vary.

On 7/5/07, SLIM H.A.  wrote:



Hello

I have compiled openmpi-1.2.3 with the --with-mx=
configuration and gcc compiler. On testing with 4-8 slots I get an error
message, the mx ports are busy:

>mpirun --mca btl mx,self -np 4 ./cpi
[node001:10071] mca_btl_mx_init: mx_open_endpoint() failed with
status=20
[node001:10074] mca_btl_mx_init: mx_open_endpoint() failed with
status=20
[node001:10073] mca_btl_mx_init: mx_open_endpoint() failed with
status=20

--
Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
... snipped
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or
environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)

--
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
mpirun noticed that job rank 0 with PID 10071 on node node001 exited on
signal 1 (Hangup).


I would not expect mx messages as communication should not go through
the mx card? (This is a twin dual core  shared memory node)
The same happens when testing on 2 nodes, using a hostfile.
I checked the state of the mx card with mx_endpoint_info and mx_info,
they are healthy and free.
What is missing here?

Thanks

Henk

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] openmpi fails on mx endpoint busy

2007-07-05 Thread Tim Prins

Hi Henk,

By specifying '--mca btl mx,self' you are telling Open MPI not to use
its shared memory support. If you want to use Open MPI's shared memory
support, you must add 'sm' to the list, i.e. '--mca btl mx,sm,self'. If you
would rather use MX's shared memory support, instead use '--mca btl
mx,self --mca btl_mx_shared_mem 1'. However, in most cases I believe
Open MPI's shared memory support is a bit better.
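
With the cpi test from earlier in this thread, the two options would look
roughly like this (a sketch based on the commands above; adjust -np and the
binary to your setup):

   mpirun --mca btl mx,sm,self -np 4 ./cpi
   mpirun --mca btl mx,self --mca btl_mx_shared_mem 1 -np 4 ./cpi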


Alternatively, if you don't specify any btls, Open MPI should figure out 
what to use automatically.


Hope this helps,

Tim

SLIM H.A. wrote:

Hello

I have compiled openmpi-1.2.3 with the --with-mx=
configuration and gcc compiler. On testing with 4-8 slots I get an error
message, the mx ports are busy:


mpirun --mca btl mx,self -np 4 ./cpi

[node001:10071] mca_btl_mx_init: mx_open_endpoint() failed with
status=20
[node001:10074] mca_btl_mx_init: mx_open_endpoint() failed with
status=20
[node001:10073] mca_btl_mx_init: mx_open_endpoint() failed with
status=20

--
Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of 
usable components.

... snipped
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or
environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)

--
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
mpirun noticed that job rank 0 with PID 10071 on node node001 exited on
signal 1 (Hangup).


I would not expect mx messages as communication should not go through
the mx card? (This is a twin dual core  shared memory node)
The same happens when testing on 2 nodes, using a hostfile.
I checked the state of the mx card with mx_endpoint_info and mx_info,
they are healthy and free.
What is missing here?

Thanks

Henk

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Absoft compilation problem

2007-07-05 Thread Yip, Elizabeth L

1.2.1 does NOT work for me.

-Original Message-
From: Jeff Squyres [mailto:jsquy...@cisco.com]
Sent: Thu 7/5/2007 2:39 AM
To: Open MPI Users
Subject: Re: [OMPI users] Absoft compilation problem
 
On Jul 2, 2007, at 7:31 PM, Yip, Elizabeth L wrote:

> I downloaded openmpi-1.2.3rc2r15098 from your "nightly snapshot",  
> same problem.
> I notice in version 1.1.2, you generate libmpi_f90.a instead of  
> the .so files.

Brian clarified for me off-list that we use the same Libtool (LT) for
nightly trunk and OMPI 1.2.x tarballs.

In 1.1.2, you're correct that we only made static F90 libraries.

Can you clarify/confirm:

- 1.1.2 works for you (static F90 library)
- 1.2.1 works for you
- 1.2.3 does not work for you

If this is correct, then something is really, really weird.


>
> -Original Message-
> From: Jeff Squyres [mailto:jsquy...@cisco.com]
> Sent: Sun 7/1/2007 4:03 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] Absoft compilation problem
>
> I unfortunately do not have access to an Absoft compiler to test this
> with; it looks like GNU Libtool is getting the wrong arguments to
> pass to the f95 compiler to build a shared library.
>
> A quick workaround for this issue would be to disable the MPI F90
> bindings with the --disable-mpi-f90 switch to configure.
>
> Could you try the Open MPI nightly trunk tarball and see if it works
> there?  We use a different version of Libtool to make those tarballs.
>
>
> On Jun 30, 2007, at 2:09 AM, Yip, Elizabeth L wrote:
>
> >
> > The attachment shows my problems when I tried to compile openmpi
> > 1.2.3 with absoft 95
> > (Absoft 64-bit Fortran 95 9.0 with Service Pack 1).  I have similar
> > problems with version 1.2.1, but
> > no problem with version 1.1.2.
> >
> > Elizabeth Yip
> > 
> > 
>
>
> -- 
> Jeff Squyres
> Cisco Systems
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> 


-- 
Jeff Squyres
Cisco Systems

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


Re: [OMPI users] nfs romio

2007-07-05 Thread Robert Latham
On Mon, Jul 02, 2007 at 12:49:27PM -0500, Adams, Samuel D Contr AFRL/HEDR wrote:
> Anyway, so if anyone can tell me how I should configure my NFS server,
> or OpenMPI to make ROMIO work properly, I would appreciate it.   

Well, as Jeff said, the only safe way to run NFS servers for ROMIO is
by disabling all caching, which in turn will dramatically slow down
performance.  
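
In case it helps while you still have NFS in place: the usual client-side
way to do that is to mount the export with caching disabled. A sketch only;
the server, export, and mount point are placeholders, and the ROMIO README
is the authoritative reference for the exact options:

   mount -t nfs -o noac,hard,intr fileserver:/export/data /mnt/data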

Since NFS is performing so slowly for you, I'd suggest taking this
opportunity to deploy a parallel file system.  PVFS, Lustre, or GPFS
might make fine choices. 

==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B


[OMPI users] Can't get TotalView to find main program

2007-07-05 Thread Dennis McRitchie
Hi,

I'm trying to get TotalView to work using OpenMPI with a simple
1-processor test program. I have tried building it using both OpenMPI
1.1.4 and 1.2.3, with the -g option. This is on two RedHat EL4 systems,
one a 32-bit system, and one a 64-bit system. Each executable is built
on its own system. I then use the command:

mpirun -tv -np 1 /path/to/my/MPI/test/program

or

totalview mpirun -a -np 1 /path/to/my/MPI/test/program

By following the OpenMPI docs
(http://www.open-mpi.org/faq/?category=running#run-with-tv), TV will
start mpirun (actually, orterun), and then state that it can't find my
main program, as shown below in the output on the 32-bit system:


> totalview mpirun -a -np 1 /path/to/my/MPI/test/program
Linux x86 TotalView 8.1.0-0
Copyright 2007 by TotalView Technologies, LLC. ALL RIGHTS RESERVED.
Copyright 1999-2007 by Etnus, LLC.
Copyright 1999 by Etnus, Inc.
Copyright 1996-1998 by Dolphin Interconnect Solutions, Inc.
Copyright 1989-1996 by BBN Inc.
Reading symbols for process 1, executing "mpirun"
Library /usr/local/openmpi/1.1.4/intel/i386/bin/orterun, with 2 asects,
was linked at 0x08048000, and initially loaded at 0x9000
WARNING: Invalid .gnu_debuglink checksum for file
'/usr/lib/debug/usr/local/openmpi/1.1.4/intel/i386/bin/orterun.debug' is
3fb29221, expected fa794855
Mapping 3031 bytes of ELF string data from
'/usr/local/openmpi/1.1.4/intel/i386/bin/orterun'...done
Indexing 4 bytes of DWARF '.eh_frame' symbols from
'/usr/local/openmpi/1.1.4/intel/i386/bin/orterun'...done
Library /usr/local/openmpi/1.1.4/intel/i386/lib/liborte.so.0, with 2
asects, was linked at 0x, and initially loaded at 0x90022d00
WARNING: Invalid .gnu_debuglink checksum for file
'/usr/lib/debug/usr/local/openmpi/1.1.4/intel/i386/lib/liborte.so.0.0.0.
debug' is d24f7322, expected 2e59816b
Mapping 32483 bytes of ELF string data from
'/usr/local/openmpi/1.1.4/intel/i386/lib/liborte.so.0'...done
Indexing 4 bytes of DWARF '.eh_frame' symbols from
'/usr/local/openmpi/1.1.4/intel/i386/lib/liborte.so.0'...done
Library /usr/lib/libtorque.so.0, with 2 asects, was linked at
0x00456000, and initially loaded at 0x900b0500
Mapping 6778 bytes of ELF string data from
'/usr/lib/libtorque.so.0'...done
Indexing 4 bytes of DWARF '.eh_frame' symbols from
'/usr/lib/libtorque.so.0'...done
Library /usr/local/openmpi/1.1.4/intel/i386/lib/libopal.so.0, with 2
asects, was linked at 0x, and initially loaded at 0x900ea400
WARNING: Invalid .gnu_debuglink checksum for file
'/usr/lib/debug/usr/local/openmpi/1.1.4/intel/i386/lib/libopal.so.0.0.0.
debug' is 4a2fe1c5, expected 17575d23
Mapping 9597 bytes of ELF string data from
'/usr/local/openmpi/1.1.4/intel/i386/lib/libopal.so.0'...done
Indexing 4 bytes of DWARF '.eh_frame' symbols from
'/usr/local/openmpi/1.1.4/intel/i386/lib/libopal.so.0'...done
Library /lib/libnsl.so.1, with 2 asects, was linked at 0x04a92000, and
initially loaded at 0x9012a900
Mapping 3146 bytes of ELF string data from '/lib/libnsl.so.1'...done
Indexing 4 bytes of DWARF '.eh_frame' symbols from
'/lib/libnsl.so.1'...done
Library /lib/libutil.so.1, with 2 asects, was linked at 0x00343000, and
initially loaded at 0x9013ee00
Mapping 407 bytes of ELF string data from '/lib/libutil.so.1'...done
Indexing 4 bytes of DWARF '.eh_frame' symbols from
'/lib/libutil.so.1'...done
Library /lib/tls/libm.so.6, with 2 asects, was linked at 0x00bbb000, and
initially loaded at 0x90140800
Mapping 1996 bytes of ELF string data from '/lib/tls/libm.so.6'...done
Indexing 4 bytes of DWARF '.eh_frame' symbols from
'/lib/tls/libm.so.6'...done
Library /lib/libgcc_s.so.1, with 2 asects, was linked at 0x00456000, and
initially loaded at 0x90161700
Mapping 1403 bytes of ELF string data from '/lib/libgcc_s.so.1'...done
Indexing 1404 bytes of DWARF '.eh_frame' symbols from
'/lib/libgcc_s.so.1'...done
Library /lib/tls/libpthread.so.0, with 2 asects, was linked at
0x00cd1000, and initially loaded at 0x90168800
Mapping 4402 bytes of ELF string data from
'/lib/tls/libpthread.so.0'...done
Indexing 2272 bytes of DWARF '.eh_frame' symbols from
'/lib/tls/libpthread.so.0'...done
Library /lib/tls/libc.so.6, with 2 asects, was linked at 0x00a88000, and
initially loaded at 0x90178400
Mapping 20760 bytes of ELF string data from '/lib/tls/libc.so.6'...done
Indexing 16648 bytes of DWARF '.eh_frame' symbols from
'/lib/tls/libc.so.6'...done
Library /lib/libdl.so.2, with 2 asects, was linked at 0x00bb5000, and
initially loaded at 0x902a2000
Mapping 481 bytes of ELF string data from '/lib/libdl.so.2'...done
Indexing 4 bytes of DWARF '.eh_frame' symbols from
'/lib/libdl.so.2'...done
Library /opt/intel/fc/9.1.040/lib/libimf.so, with 2 asects, was linked
at 0x, and initially loaded at 0x902a3c00
Mapping 38346 bytes of ELF string data from
'/opt/intel/fc/9.1.040/lib/libimf.so'...done
Library /opt/intel/fc/9.1.040/lib/libirc.so, with 2 asects, was linked
at 0x, and initially loaded at 0x904e0900
Mapping 12223 b

[OMPI users] Exclude/Include HCA with OpenIB BTL ?

2007-07-05 Thread Don Kerr


Does the OpenIB BTL have the notion of include and exclude of HCAs, as
the TCP BTL does for NICs?  E.g. "--mca btl_tcp_if_include eth1,eth2 ..."


I think not, but I was not sure whether this was accomplished some other
way, so I wanted to ask the group.


TIA
-DON


Re: [OMPI users] Exclude/Include HCA with OpenIB BTL ?

2007-07-05 Thread Jeff Squyres
This was added on the trunk recently (btl_openib_if_[in|ex]clude),  
but is not on the v1.2 branch.
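
On a trunk build, usage would presumably mirror the TCP example above (a
sketch; "mthca0" is just a placeholder HCA name):

   mpirun --mca btl_openib_if_include mthca0 -np 4 ./a.out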


On Jul 5, 2007, at 9:39 PM, Don Kerr wrote:



Does the OpenIB BTL have the notion of include and exclude of HCA's as
the TCP  BTL does for NICs?  E.G.  "--mca btl_tcp_if_include  
eth1,eth2 ..."


I think not but I was not sure if this was accomplished some other way
so wanted to ask the group.

TIA
-DON
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Jeff Squyres
Cisco Systems



[OMPI users] Open MPI 1.2.3 spec file

2007-07-05 Thread Alex Tumanov

Greetings,

The spec file provided in the latest stable src RPM makes it possible
to change the name of the resulting RPM. I tried to make use of that,
but ran into some issues. Specifically, the resulting RPM does not
have the etc directory (and the sample config files in it). rpmbuild
complained about that when it checked for left-over/unpackaged files,
and it seems that, despite the name change, the etc config files were
still installed under the openmpi subdirectory, i.e. the new name I
provided was not honoured by this part of the build.

Here are my compilation options:

rpmbuild --rebuild --define="configure_options \
   --with-openib=/usr/include/infiniband \
   --with-openib-libdir=/usr/lib64" \
   --define="install_in_opt 1" \
   --define="_name openmpi_mine" \
   -D "_defaultdocdir /opt/openmpi_mine/1.2.3/share" \
   --define="mflags all" openmpi-1.2.3-1.src.rpm

Platform: RHEL4U5, arch=x86-64

Has anybody seen the same problem when attempting to change the RPMs
name? How can I solve/work around this issue?

Adjacent to this, I also noticed that the doc file path remains
unchanged when I request install_in_opt. I get around this by making
an explicit definition for the %_defaultdocdir macro on the command line,
but I think it would make sense to include this in the spec file
itself. Below is a simple proposed fix:

--- openmpi-1.2.3.spec  2007-07-05 17:00:54.0 -0400
+++ openmpi-1.2.3.spec.new  2007-07-05 19:39:49.0 -0400
@@ -180,6 +180,7 @@
%define _includedir /opt/%{name}/%{version}/include
%define _mandir /opt/%{name}/%{version}/man
%define _pkgdatadir /opt/%{name}/%{version}/share/%{name}
+%define _defaultdocdir /opt/%{name}/%{version}/share
%endif


Thanks,
Alex.

P.S. BTW, I would like to commend the author of the Open MPI spec file
- it's the most feature-rich spec file I've ever seen. One can learn
from it by example...


Re: [OMPI users] Open MPI 1.2.3 spec file

2007-07-05 Thread Alex Tumanov

Actually, I tried compiling the RPM again and noticed, at the very top,
that ./configure is called with --sysconfdir set to /opt/openmpi
instead of the new name provided. All other parameters are correct!
Any ideas?

./configure --build=x86_64-redhat-linux-gnu
--host=x86_64-redhat-linux-gnu --target=x86_64-redhat-linux-gnu
--program-prefix= --prefix=/opt/openmpi_mine/1.2.3
--exec-prefix=/opt/openmpi_mine/1.2.3
--bindir=/opt/openmpi_mine/1.2.3/bin
--sbindir=/opt/openmpi_mine/1.2.3/sbin
--sysconfdir=/opt/openmpi/1.2.3/etc
--datadir=/opt/openmpi_mine/1.2.3/share
--includedir=/opt/openmpi_mine/1.2.3/include
--libdir=/opt/openmpi_mine/1.2.3/lib
--libexecdir=/opt/openmpi_mine/1.2.3/libexec --localstatedir=/var
--sharedstatedir=/opt/openmpi_mine/1.2.3/com
--mandir=/opt/openmpi_mine/1.2.3/man --infodir=/usr/share/info
--prefix=/opt/openmpi_mine/1.2.3 --with-openib=/usr/include/infiniband
--with-openib-libdir=/usr/lib64

Thanks,
Alex.