Re: [OMPI users] MPI_Comm_spawn errors

2008-02-19 Thread Tim Prins

Hi Joao,

Unfortunately, spawn is broken on the development trunk right now. We 
are working on a major revamp of the runtime system which should fix 
these problems, but it is not ready yet.


Sorry about that :(

Tim


Joao Vicente Lima wrote:

Hi all,
I'm getting errors with spawn in the following situations:

1) spawn1.c - spawning 2 processes on localhost, one by one, the error is:

spawning ...
[localhost:31390] *** Process received signal ***
[localhost:31390] Signal: Segmentation fault (11)
[localhost:31390] Signal code: Address not mapped (1)
[localhost:31390] Failing at address: 0x98
[localhost:31390] [ 0] /lib/libpthread.so.0 [0x2b1d38a17ed0]
[localhost:31390] [ 1] /usr/local/mpi/openmpi-svn/lib/libmpi.so.0(ompi_comm_dyn_finalize+0xd2) [0x2b1d37667cb2]
[localhost:31390] [ 2] /usr/local/mpi/openmpi-svn/lib/libmpi.so.0(ompi_comm_finalize+0x3b) [0x2b1d3766358b]
[localhost:31390] [ 3] /usr/local/mpi/openmpi-svn/lib/libmpi.so.0(ompi_mpi_finalize+0x248) [0x2b1d37679598]
[localhost:31390] [ 4] ./spawn1(main+0xac) [0x400ac4]
[localhost:31390] [ 5] /lib/libc.so.6(__libc_start_main+0xf4) [0x2b1d38c43b74]
[localhost:31390] [ 6] ./spawn1 [0x400989]
[localhost:31390] *** End of error message ***
--
mpirun has exited due to process rank 0 with PID 31390 on
node localhost calling "abort". This will have caused other processes
in the application to be terminated by signals sent by mpirun
(as reported here).
--

With 1 process spawned, or with 2 processes spawned in one call, there is
no output from the child.

2) spawn2.c - no response; the init call is
 MPI_Init_thread (&argc, &argv, MPI_THREAD_MULTIPLE, &required)

The attachments contain the programs, ompi_info output, and config.log.

Any suggestions?

thanks a lot.
Joao.




___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




[OMPI users] openmpi/openib problems

2008-02-19 Thread jessie puls

Hi all,

I'm having problems getting openmpi to work correctly using verbs on 
some systems.  It's been working using openib for quite some time, but I 
need to get it working using verbs for some research I'm doing.  Anyway, 
all seems to be good on the openib side of things.  ibv_devinfo and 
ibv_devices return device information, and they are listed as active on 
each node.  Also all hosts are visible to each other (ibhosts shows a 
full list).


The problem I see with openmpi is I have the openib btl, but not the 
openib mpool, and when looking at the contents of ompi/mca/mpool/ I 
don't see openib there (sm and rdma are both listed and ompi_info shows 
they've been included in the build).  Any help would be appreciated.


Thanks,

Jessie


Re: [OMPI users] openmpi/openib problems

2008-02-19 Thread jessie puls

jessie puls wrote:

> Hi all,
>
> I'm having problems getting openmpi to work correctly using verbs on
> some systems.  It's been working using openib for quite some time, but I
> need to get it working using verbs for some research I'm doing.

This would make a whole lot more sense if I'd typed it correctly.  It's 
been working using ipoib.

> Anyway, all seems to be good on the openib side of things.  ibv_devinfo
> and ibv_devices return device information, and they are listed as active
> on each node.  Also all hosts are visible to each other (ibhosts shows a
> full list).
>
> The problem I see with openmpi is I have the openib btl, but not the
> openib mpool, and when looking at the contents of ompi/mca/mpool/ I
> don't see openib there (sm and rdma are both listed and ompi_info shows
> they've been included in the build).  Any help would be appreciated.
>
> Thanks,
>
> Jessie




[OMPI users] mpi.h macro naming

2008-02-19 Thread Ben Allan

Thanks in advance if this is already fixed in a later release I've not
caught up to; I'm at 1.2.3.

Is there some subtle reason that ompi's mpi.h leaves the following macros
both not guarded by an #ifndef and not prefixed with OMPI_?

This produces considerable amounts of compiler whinage for other
codes that include mpi.h. As always, extraneous whinage makes real
errors harder to find. (And yes, those other codes also need
*their* definitions of HAVE_LONG_LONG, etc., properly protected.)
And of course, who knows how the answer was defined for any given
unprotected appearance of these macros?

/* Define to 1 if the system has the type `long long'. */
#undef HAVE_LONG_LONG

/* The size of a `bool', as computed by sizeof. */
#undef SIZEOF_BOOL

/* The size of a `int', as computed by sizeof. */
#undef SIZEOF_INT

If it's simply a matter of developer hours, I can post a patch
somewhere to address this. It appears that of these, only
SIZEOF_INT affects more than a few source files.
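
For illustration, the guarding and prefixing asked about above could look
roughly like the sketch below. It is not taken from Open MPI's mpi.h: the
EX_ prefix, the hard-coded value 4, and the helper
ex_sizeof_int_is_consistent are all made up for the example, and the first
define merely stands in for an application's own config.h.

```c
#include <assert.h>

/* Stand-in for an application's own config.h, which was included first
   and already defined one of the generic names. */
#define HAVE_LONG_LONG 1

/* ---- hypothetical library header starts here ---- */

/* Option 1: guard the generic name, so an earlier definition wins and
   the compiler has no redefinition to warn about. */
#ifndef HAVE_LONG_LONG
#define HAVE_LONG_LONG 1
#endif

/* Option 2: publish the library's own answer under a prefix.
   EX_ is a made-up prefix; a configure script would normally fill in
   the value instead of the literal 4 assumed here. */
#ifndef EX_SIZEOF_INT
#define EX_SIZEOF_INT 4
#endif

/* ---- hypothetical library header ends here ---- */

/* Returns 1 when the prefixed value agrees with what the compiler
   reports for sizeof(int), 0 otherwise. */
int ex_sizeof_int_is_consistent(void)
{
    assert(EX_SIZEOF_INT > 0);
    return EX_SIZEOF_INT == (int) sizeof(int);
}
```

With the guard in place the application's HAVE_LONG_LONG is left alone,
and code that needs the library's view of sizeof(int) reads the prefixed
name, so the two definitions can no longer collide.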

thanks,
Ben Allan



[OMPI users] processes aborting on MPI_Finalize

2008-02-19 Thread Adams, Samuel D AFRL/RHDR
This is probably some coding error on my part, but under some problem
divisions I get processes aborting when I call MPI_Finalize().  Perhaps
they are still waiting, incorrectly, to receive some message or something
like that.  Sometimes it seems to work.  Is there a good way to get to
the bottom of this error?


Output:
4 additional processes aborted (not shown)

Sam Adams
General Dynamics Information Technology
Phone: 210.536.5945




Re: [OMPI users] Can't get OPENMPI to run parallel job with Myrinet/GM

2008-02-19 Thread twurgl
Would you be able to send me the mpirun command and args that you use?

How can I get more output to study?  I added "--display-map -d -v" to my
mpirun command, which gives more output, but not the reason for the
failure.


The information contained herein is GOODYEAR PROPRIETARY information and
includes GOODYEAR CONFIDENTIAL information. Reproduction of this
document, disclosure of the information, and use for any purpose other than
to conduct business with Goodyear is expressly prohibited.



From: George Bosilca
Sent by: users-bounces@open-mpi.org
To: Open MPI Users
cc: t901...@rds4020.akr.goodyear.com
Date: 02/14/2008 10:18 PM
Subject: Re: [OMPI users] Can't get OPENMPI to run parallel job with Myrinet/GM
Please respond to: Open MPI Users






I ran a full set of tests on GM with 1.2.5 and with the trunk. Both of
them ran to completion without any errors.

Moreover, the error message only says that one of the processes was
terminated, which usually means that something bad happened somewhere
else, and the runtime decided to terminate the whole job. This might
be a segfault or an abort. Without more information it will be difficult
to help or to offer any advice.

   george.
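
As a rough illustration of why such a message says so little: a launcher
that reaps a child typically learns only the wait status, that is, which
signal ended the process, not who sent it or why. The sketch below is plain
POSIX C, not Open MPI code; ex_describe_status and ex_terminate_child are
hypothetical names, and the parent sending SIGTERM itself only stands in
for a runtime tearing down a job.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Describe a waitpid() status in buf. Returns the terminating signal
   number, or 0 if the process was not killed by a signal. */
int ex_describe_status(int status, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    if (WIFSIGNALED(status)) {
        snprintf(buf, len, "exited on signal %d", WTERMSIG(status));
        return WTERMSIG(status);
    }
    if (WIFEXITED(status))
        snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
    else
        snprintf(buf, len, "unexpected wait status");
    return 0;
}

/* Fork an idle child, terminate it with SIGTERM, and describe what the
   parent observes. Returns the observed signal number, or -1 on error. */
int ex_terminate_child(char *buf, size_t len)
{
    int ready[2];
    char byte;
    int status = 0;
    pid_t pid;

    if (pipe(ready) != 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: make sure SIGTERM is deliverable and fatal, tell the
           parent we are ready, then idle until signalled. */
        sigset_t set;
        close(ready[0]);
        signal(SIGTERM, SIG_DFL);
        sigemptyset(&set);
        sigaddset(&set, SIGTERM);
        sigprocmask(SIG_UNBLOCK, &set, NULL);
        if (write(ready[1], "x", 1) != 1)
            _exit(1);
        for (;;)
            pause();
    }
    /* Parent: wait until the child is ready, then terminate it. */
    close(ready[1]);
    if (read(ready[0], &byte, 1) == 1)
        kill(pid, SIGTERM);
    close(ready[0]);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return ex_describe_status(status, buf, len);
}
```

Calling ex_terminate_child(buf, sizeof buf) returns SIGTERM, and the
description carries only the signal number. Nothing in the status says
whether the child misbehaved or was simply told to stop, which is why the
output of the other ranks is usually where the real cause shows up.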

On Feb 14, 2008, at 11:15 AM, Tom Wurgler wrote:

> I am trying to use openmpi 1.2.5 (I also tried 1.2.4) to run a parallel
> job using GM drivers.  The only message I get is:
>
> mpirun noticed that job rank 0 with PID 19508 on node node93 exited on
> signal 15 (Terminated).
>
> I can run serially on one node (4 processors), it just dies when trying
> to use more than one node.
>
> Any help appreciated.



Re: [OMPI users] Can't get OPENMPI to run parallel job with Myrinet/GM

2008-02-19 Thread George Bosilca

Tom,

Here is how I configured Open MPI. It's mostly the default  
configuration ...


../../ompi-trunk/configure \
    --prefix=/nfs/home/bosilca/opt/unstable/fog/fast \
    --disable-debug --enable-picky --with-platform=optimized \
    --disable-mpi-cxx --disable-mpi-f90 --enable-mpi-f77 \
    --disable-mpi-profiling --with-gm=/opt/gm -enable-visibility


No specific arguments were required to run the tests. You can force  
the GM BTL by using "--mca btl gm,self" or "--mca btl gm,sm,self" if  
you need shared memory.


  george.


On Feb 19, 2008, at 4:59 PM, twu...@goodyear.com wrote:

> Would you be able to send me the mpirun command and args that you use?
>
> How can I get more output to study?  I added "--display-map -d -v" to my
> mpirun command, which gives more output, but not the reason for the
> failure.

