[OMPI users] Orted freezes on launch of application

2007-03-13 Thread David Minor
Hi,

I'm an MPICH2 user trying out Open MPI. I'm running a 1G network under
Red Hat 9, but using the g++ 3.4.3 compiler. Open MPI compiled and
installed fine, but none of my applications that run under MPICH2 will
run.  I decided to go back to basics and try to run a non-MPI application
like /bin/ps, with the same results:

mpirun -np 2 --host zebra1,bug --mca pls_rsh_debug 1 --mca pls_rsh_agent
rsh /bin/ps

 

The end result is that the console is frozen. orted is running on both
nodes, and one instance of orted is zombied under mpirun. I get the same
results trying to run a simple MPI application. The enclosed tar has all
the info you ask for and then some. I know I'm probably just not doing
something right, but your documentation leaves a lot to be desired. The
best doc seems to be the FAQ; there doesn't seem to be anything more
comprehensive, and if there is, please tell me.  Also, you need to define
an == operator for MPI::Request that will allow a request to be compared
to MPI_REQUEST_NULL. I don't see any way to do this in your C++
implementation.
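
(For reference, a possible workaround while such an operator is missing
is to convert the C++ handle to its C counterpart and compare at the C
level.  This is only a sketch; the helper name is made up, not from the
Open MPI sources.)

#include <mpi.h>

// Illustrative helper, assuming the MPI-2 handle conversion
// (MPI::Request -> MPI_Request) is available: compare against the
// C null handle instead of relying on a C++ operator==.
static bool is_null_request(const MPI::Request &req)
{
    MPI_Request c_req = req;            // C++ handle converts to a C handle
    return c_req == MPI_REQUEST_NULL;   // compare at the C level
}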

Regards,

David Minor

Orbotech



Re: [OMPI users] Fun with threading

2007-03-13 Thread David Minor
Sounds like bad news about the threading. That's probably what's hanging me as 
well. We're running clusters of multi-core SMPs, and our app NEEDS 
multi-threading. It'd be nice to get an "official" reply on this from someone 
on the dev team.
-David

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Mike Houston
Sent: Tuesday, March 13, 2007 5:52 AM
To: Open MPI Users
Subject: [OMPI users] Fun with threading

At least with 1.1.4, I'm having a heck of a time enabling 
multi-threading.  Configuring with --with-threads=posix 
--enable-mpi-threads --enable-progress-threads leads to mpirun just 
hanging, even when not launching MPI apps, e.g. mpirun -np 1 hostname, 
and I can't ctrl-c to kill it; I have to kill -9 it.  Removing progress 
thread support results in the same behavior.  Removing 
--enable-mpi-threads gets mpirun working again, but without the thread 
protection I need.
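
(For reference, a minimal way to check what thread level the library
actually grants at runtime, independent of the configure flags, is
MPI_Init_thread.  A sketch using the C++ bindings:)

#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv)
{
    // Request full multi-threaded support and report what was granted.
    int provided = MPI::Init_thread(argc, argv, MPI::THREAD_MULTIPLE);
    if (provided < MPI::THREAD_MULTIPLE) {
        std::printf("MPI_THREAD_MULTIPLE not granted (level %d)\n", provided);
    }
    MPI::Finalize();
    return 0;
}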

What is the status of multi-thread support?  It looks like it's still 
largely untested, from my reading of the mailing lists.  We actually have 
an application that would be much easier to deal with if we could have 
two threads in a process both using MPI.  Funneling everything through a 
single thread creates a locking nightmare, and generally means we will 
be forced to spin checking an Irecv and the status of a data structure, 
instead of having one thread happily sitting on a blocking receive and 
the other watching the data structure, basically burning a processor 
that we could be using to do something useful.  (We are basically doing 
a simplified version of DSM (distributed shared memory), and we need to 
respond to remote data requests.)
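
(The spin-polling pattern described above would look roughly like the
sketch below; the request and queue names are made up for illustration,
not taken from the actual application.)

#include <mpi.h>

// Hypothetical names: 'pending_req' is an already-posted Irecv request and
// 'work_pending()' stands in for whatever local data structure the second
// thread would otherwise watch.  A single thread has to poll both.
void service_loop(MPI::Request &pending_req, bool (*work_pending)())
{
    for (;;) {
        MPI::Status status;
        if (pending_req.Test(status)) {
            // ... handle the remote data request, then re-post the Irecv
        }
        if (work_pending()) {
            // ... handle local work
        }
        // (termination condition omitted in this sketch)
    }
}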

At the moment, it seems that when running without threading support 
enabled, if we only post a receive on a single thread, things are mostly 
happy, except if one thread in a process sends to the other thread in the 
same process, which has posted a receive.  Under TCP, the send fails with:

*** An error occurred in MPI_Send
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_INTERN: internal error
*** MPI_ERRORS_ARE_FATAL (goodbye)
[0,0,0]-[0,1,0] mca_oob_tcp_msg_recv: readv failed with errno=104

Over the SM (shared memory) transport, the results are undefined.
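
(For clarity, the failing pattern is roughly the following; the tag and
buffers are made up, and both calls are issued by threads of the same
rank:)

#include <mpi.h>

// Thread A of a rank posts a receive from its own rank ...
void thread_a(char *buf, int len, int my_rank)
{
    MPI::COMM_WORLD.Recv(buf, len, MPI::CHAR, my_rank, /* tag */ 0);
}

// ... while thread B of the same rank sends to it.  With a build that has
// no thread support, this is where the MPI_ERR_INTERN above shows up over
// TCP, and where SM misbehaves.
void thread_b(const char *buf, int len, int my_rank)
{
    MPI::COMM_WORLD.Send(buf, len, MPI::CHAR, my_rank, /* tag */ 0);
}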

Obviously I'm playing fast and loose, which is why I'm attempting to get 
threading support to work, to see if it solves the headaches.  If you 
really want to have some fun, have a posted MPI_Recv on one thread and 
issue an MPI_Barrier on the other (with SM):

Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
Failing at addr:0x1c
[0] func:/usr/lib/libopal.so.0 [0xc030f4]
[1] func:/lib/tls/libpthread.so.0 [0x46f93890]
[2] func:/usr/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv_frag_match+0xb08) [0x14ec38]
[3] func:/usr/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv_frag_callback+0x2f9) [0x14f7e9]
[4] func:/usr/lib/openmpi/mca_btl_sm.so(mca_btl_sm_component_progress+0xa87) [0x806c07]
[5] func:/usr/lib/openmpi/mca_bml_r2.so(mca_bml_r2_progress+0x39) [0x510c69]
[6] func:/usr/lib/libopal.so.0(opal_progress+0x69) [0xbecc39]
[7] func:/usr/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_send+0x785) [0x14d675]
[8] func:/usr/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_sendrecv_actual_localcompleted+0x8c) [0x5cc3fc]
[9] func:/usr/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_barrier_intra_two_procs+0x76) [0x5ceef6]
[10] func:/usr/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_barrier_intra_dec_fixed+0x38) [0x5cc638]
[11] func:/usr/lib/libmpi.so.0(PMPI_Barrier+0xe9) [0x29a1b9]

-Mike



Re: [OMPI users] Orted freezes on launch of application

2007-03-13 Thread David Minor
Resending with the tar attached this time.

 



From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Ralph H Castain
Sent: Tuesday, March 13, 2007 3:25 PM
To: Open MPI Users 
Subject: Re: [OMPI users] Orted freezes on launch of application

 

Hi David

I think your tar file didn't get attached - at least, it didn't reach me. Can 
you please send it again?

Thanks
Ralph


On 3/13/07 1:00 AM, "David Minor"  wrote:

Hi,
I'm an MPICH2 user trying out Open MPI. I'm running a 1G network under Red Hat 
9, but using the g++ 3.4.3 compiler. Open MPI compiled and installed fine, but 
none of my applications that run under MPICH2 will run.  I decided to go back 
to basics and try to run a non-MPI application like /bin/ps, with the same 
results: 
mpirun -np 2 --host zebra1,bug --mca pls_rsh_debug 1 --mca pls_rsh_agent rsh 
/bin/ps
 
The end result is that the console is frozen. orted is running on both nodes, 
and one instance of orted is zombied under mpirun. I get the same results 
trying to run a simple MPI application. The enclosed tar has all the info you 
ask for and then some. I know I'm probably just not doing something right, but 
your documentation leaves a lot to be desired. The best doc seems to be the 
FAQ; there doesn't seem to be anything more comprehensive, and if there is, 
please tell me. Also, you need to define an == operator for MPI::Request that 
will allow a request to be compared to MPI_REQUEST_NULL. I don't see any way 
to do this in your C++ implementation.  
Regards,
David Minor
Orbotech




 



ompi-output.tar.gz
Description: ompi-output.tar.gz