[OMPI users] mpirun related

2007-01-30 Thread Chevchenkovic Chevchenkovic

Hi,
 mpirun internally uses ssh to launch a program on multiple nodes.
I would like to see the various parameters that are sent to each of
the nodes. How can I do this?

-chev


[OMPI users] mutex deadlock in btl tcp

2007-01-30 Thread Jeremy Buisson
Dear Open MPI users list,

From time to time, I experience a mutex deadlock in Open MPI 1.1.2. The stack
trace is available at the end of the mail. The deadlock seems to be caused by
lines 118 & 119 of the ompi/mca/btl/tcp/btl_tcp.c file, in function
mca_btl_tcp_add_procs:
OBJ_RELEASE(tcp_endpoint);
OPAL_THREAD_UNLOCK(&tcp_proc->proc_lock);
(of course, I did not check whether the line numbers have changed since 1.1.2.)
Indeed, releasing tcp_endpoint causes a call to mca_btl_tcp_proc_remove that
attempts to acquire the mutex tcp_proc->proc_lock, which is already held by the
thread (OPAL_THREAD_LOCK(&tcp_proc->proc_lock) at line 103 of the
ompi/mca/btl/tcp/btl_tcp.c file). Switching the two lines above (i.e. releasing
the mutex before destructing tcp_endpoint) seems to be sufficient to fix the
deadlock. Or should the changes made in the mca_btl_tcp_proc_insert function be
reverted instead of releasing the mutex before destructing tcp_endpoint?
As far as I can tell, the problem still appears in trunk revision 13359.
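
To make the suggested fix concrete, here is a schematic of the reordering (simplified, not the verbatim Open MPI source):

    /* Before (deadlocks): the endpoint destructor ends up in
     * mca_btl_tcp_proc_remove(), which tries to take proc_lock again
     * while this thread still holds it. */
    OBJ_RELEASE(tcp_endpoint);
    OPAL_THREAD_UNLOCK(&tcp_proc->proc_lock);

    /* After: drop the lock first, then destruct the endpoint. */
    OPAL_THREAD_UNLOCK(&tcp_proc->proc_lock);
    OBJ_RELEASE(tcp_endpoint);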

Second point. Is there any reason why MPI_Comm_spawn is restricted to execute
the new process(es) only on hosts listed in either the --host option or in the
hostfile? Or did I miss something?
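
(For context, the MPI-2 way to ask for a particular node at spawn time is the reserved "host" info key; a minimal sketch follows, where the host name and child program are placeholders, and whether a host outside the hostfile is honored is exactly the question above.)

    #include <mpi.h>

    /* Hypothetical sketch: ask the runtime to start one child on a named host. */
    static void spawn_on(const char *hostname)
    {
        MPI_Info info;
        MPI_Comm intercomm;
        int errcodes[1];

        MPI_Info_create(&info);
        MPI_Info_set(info, "host", (char *) hostname);
        MPI_Comm_spawn("./child", MPI_ARGV_NULL, 1, info, 0,
                       MPI_COMM_SELF, &intercomm, errcodes);
        MPI_Info_free(&info);
    }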

Best regards,
Jeremy

--
stack trace as dumped by open-mpi (gdb version follows):
opal_mutex_lock(): Resource deadlock avoided
Signal:6 info.si_errno:0(Success) si_code:-6()
[0] func:/home1/jbuisson/soft/openmpi-1.1.2/lib/libopal.so.0 [0x8addeb]
[1] func:/lib/tls/libpthread.so.0 [0x176e40]
[2] func:/lib/tls/libc.so.6(abort+0x1d5) [0xa294e5]
[3] func:/home1/jbuisson/soft/openmpi-1.1.2/lib/openmpi/mca_btl_tcp.so
[0x65f8a3]
[4]
func:/home1/jbuisson/soft/openmpi-1.1.2/lib/openmpi/mca_btl_tcp.so(mca_btl_tcp_proc_remove+0x2a)
[0x65fff0]
[5] func:/home1/jbuisson/soft/openmpi-1.1.2/lib/openmpi/mca_btl_tcp.so
[0x65cb24]
[6] func:/home1/jbuisson/soft/openmpi-1.1.2/lib/openmpi/mca_btl_tcp.so
[0x659465]
[7]
func:/home1/jbuisson/soft/openmpi-1.1.2/lib/openmpi/mca_btl_tcp.so(mca_btl_tcp_add_procs+0x10f)
[0x65927b]
[8]
func:/home1/jbuisson/soft/openmpi-1.1.2/lib/openmpi/mca_bml_r2.so(mca_bml_r2_add_procs+0x1bb)
[0x628023]
[9]
func:/home1/jbuisson/soft/openmpi-1.1.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xd6)
[0x61650b]
[10]
func:/home1/jbuisson/soft/openmpi-1.1.2/lib/libmpi.so.0(ompi_comm_get_rport+0x1f8)
[0xb82303]
[11]
func:/home1/jbuisson/soft/openmpi-1.1.2/lib/libmpi.so.0(ompi_comm_connect_accept+0xbb)
[0xb81b43]
[12]
func:/home1/jbuisson/soft/openmpi-1.1.2/lib/libmpi.so.0(PMPI_Comm_spawn+0x3de)
[0xbb671a]
[13]
func:/home1/jbuisson/target/bin/mpi-spawner(__gxx_personality_v0+0x3d2)
[0x804bb8e]
[14] func:/home1/jbuisson/target/bin/mpi-spawner [0x804bdff]
[15] func:/home1/jbuisson/target/bin/mpi-spawner [0x804bfd4]
[16] func:/lib/tls/libc.so.6(__libc_start_main+0xda) [0xa1578a]
[17]
func:/home1/jbuisson/target/bin/mpi-spawner(__gxx_personality_v0+0x75)
[0x804b831]
*** End of error message ***


Same stack, dumped by gdb:
#0  0x00176357 in __pause_nocancel () from /lib/tls/libpthread.so.0
#1  0x008ade9b in opal_show_stackframe (signo=6, info=0xbfff9290,
p=0xbfff9310) at stacktrace.c:306
#2  <signal handler called>
#3  0x00a27cdf in raise () from /lib/tls/libc.so.6
#4  0x00a294e5 in abort () from /lib/tls/libc.so.6
#5  0x0065f8a3 in opal_mutex_lock (m=0x8ff8250) at
../../../../opal/threads/mutex_unix.h:104
#6  0x0065fff0 in mca_btl_tcp_proc_remove (btl_proc=0x8ff8220,
btl_endpoint=0x900eba0) at btl_tcp_proc.c:296
#7  0x0065cb24 in mca_btl_tcp_endpoint_destruct (endpoint=0x900eba0) at
btl_tcp_endpoint.c:99
#8  0x00659465 in opal_obj_run_destructors (object=0x900eba0) at
../../../../opal/class/opal_object.h:405
#9  0x0065927b in mca_btl_tcp_add_procs (btl=0x8e57c30, nprocs=1,
ompi_procs=0x8ff7ac8, peers=0x8ff7ad8, reachable=0xbfff98e4) at
btl_tcp.c:118
#10 0x00628023 in mca_bml_r2_add_procs (nprocs=1, procs=0x8ff7ac8,
bml_endpoints=0x8ff60b8, reachable=0xbfff98e4) at bml_r2.c:231
#11 0x0061650b in mca_pml_ob1_add_procs (procs=0xbfff9930, nprocs=1) at
pml_ob1.c:133
#12 0x00b82303 in ompi_comm_get_rport (port=0x0, send_first=0,
proc=0x8e51c70, tag=2000) at communicator/comm_dyn.c:305
#13 0x00b81b43 in ompi_comm_connect_accept (comm=0x8ff8ce0, root=0,
port=0x0, send_first=0, newcomm=0xbfff9a38, tag=2000) at
communicator/comm_dyn.c:85
#14 0x00bb671a in PMPI_Comm_spawn (command=0x8ff88f0
"/home1/jbuisson/target/bin/sample-npb-ft-pp", argv=0xbfff9b40,
maxprocs=1, info=0x8ff73e0, root=0,
comm=0x8ff8ce0, intercomm=0xbfff9aa4, array_of_errcodes=0x0) at
pcomm_spawn.c:110





Re: [OMPI users] mpirun related

2007-01-30 Thread Adrian Knoth
On Mon, Jan 29, 2007 at 10:49:10PM -0800, Chevchenkovic Chevchenkovic wrote:
> Hi,

Hi

> mpirun internally uses ssh to launch a program on multiple nodes.
> I would like to see the various parameters that are sent to each of
> the nodes. How can I do this?

You mean adding "pls_rsh_debug=1" to your ~/.openmpi/mca-params.conf?
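
For reference, that MCA parameter can go either in the per-user parameter file or directly on the mpirun command line; a minimal sketch (host names and executable are placeholders):

    # ~/.openmpi/mca-params.conf
    pls_rsh_debug = 1

    # or, one-off on the command line:
    mpirun --mca pls_rsh_debug 1 -np 2 --host node1,node2 ./a.out

With that set, the rsh/ssh launcher should print the ssh command lines it executes for each node.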


HTH

-- 
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany

private: http://adi.thur.de


Re: [OMPI users] Scrambled communications using ssh starter on multiple nodes.

2007-01-30 Thread Fisher, Mark S
The master process uses both MPI_ANY_SOURCE and MPI_ANY_TAG while
waiting for requests from slave processes. The slaves sometimes use
MPI_ANY_TAG but the source is always specified.

We have run the code through valgrind for a number of cases including the
one being used here. 

The code is Fortran 90 and we are using the FORTRAN 77 interface so I do
not believe this is a problem.

We are using Gigabit Ethernet. 

I could look at LAM again to see if it would work. The code needs to be
in a specific working directory and we need some environment variables
set. This was not supported well in pre-MPI-2 versions of MPI. For
MPICH1 I actually launch a script for the slaves so that we have the
proper setup before running the executable. Note I had tried that with
Open MPI and it had an internal error in orterun. This is not a problem
since mpirun can set up everything we need. If you think it is
worthwhile I will download and try it.
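
A wrapper of the kind described might look roughly like this (a hypothetical sketch; the variable values and working directory are placeholders, not taken from the actual setup):

    #!/bin/sh
    # Hypothetical MPICH1-style launch wrapper: set up the environment and
    # working directory, then exec the real solver binary.
    export PVMTASK=slave              # placeholder value
    export BCFD_PS_MODE=1             # placeholder value
    cd /tmp/mpi.workdir               # placeholder working directory
    exec ./__bcfdbeta.exe "$@"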

-Original Message-
From: Jeff Squyres [mailto:jsquy...@cisco.com] 
Sent: Monday, January 29, 2007 7:54 PM
To: Open MPI Users
Subject: Re: [OMPI users] Scrambled communications using ssh starter
onmultiple nodes.

Without analyzing your source, it's hard to say.  I will say that OMPI
may send fragments out of order, but we do, of course, provide the same
message ordering guarantees that MPI mandates.  So let me ask a few
leading questions:

- Are you using any wildcards in your receives, such as MPI_ANY_SOURCE
or MPI_ANY_TAG?

- Have you run your code through a memory-checking debugger such as
valgrind?

- I don't know what Scali MPI uses, but MPICH and Intel MPI use integers
for MPI handles.  Have you tried LAM/MPI as well?  It, like Open MPI,
uses pointers for MPI handles.  I mention this because apps that
unintentionally have code that takes advantage of integer handles can
sometimes behave unpredictably when switching to a pointer-based MPI
implementation.

- What network interconnect are you using between the two hosts?



On Jan 25, 2007, at 4:22 PM, Fisher, Mark S wrote:

> Recently I wanted to try Open MPI for use with our CFD flow solver 
> WINDUS. The code uses a master/slave methodology where the master 
> handles I/O and issues tasks for the slaves to perform. The original 
> parallel implementation was done in 1993 using PVM and in 1999 we 
> added support for MPI.
>
> When testing the code with Open MPI 1.1.2 it ran fine when running on a

> single machine. As soon as I ran on more than one machine I started 
> getting random errors right away (the attached tar ball has a good and

> bad output). It looked like either the messages were out of order or 
> were for the other slave process. In the run mode used there is no 
> slave to slave communication. In the file the code died near the 
> beginning of the communication between master and slave. Sometimes it 
> will run further before it fails.
>
> I have included a tar file with the build and configuration info. The 
> two nodes are identical Xeon 2.8 GHZ machines running SLED 10. I am 
> running real-time (no queue) using the ssh starter using the following

> app file:
>
> -x PVMTASK -x BCFD_PS_MODE --mca pls_rsh_agent /usr/bin/ssh --host
> skipper2  -wdir /opt/scratch/m209290/ol.scr.16348 -np 1 ./ 
> __bcfdbeta.exe -x PVMTASK -x BCFD_PS_MODE --mca pls_rsh_agent 
> /usr/bin/ssh --host copland -wdir /tmp/mpi.m209290 -np 2 
> ./__bcfdbeta.exe
>
> The above file fails but the following works:
>
> -x PVMTASK -x BCFD_PS_MODE --mca pls_rsh_agent /usr/bin/ssh --host
> skipper2  -wdir /opt/scratch/m209290/ol.scr.16348 -np 1 ./ 
> __bcfdbeta.exe -x PVMTASK -x BCFD_PS_MODE --mca pls_rsh_agent 
> /usr/bin/ssh --host
> skipper2 -wdir /tmp/mpi.m209290 -np 2 ./__bcfdbeta.exe
>
> The first process is the master and the second two are the slaves.  
> I am
> not sure what is going wrong, the code runs fine with many other MPI 
> distributions (MPICH1/2, Intel, Scali...). I assume that either I 
> built it wrong or am not running it properly but I cannot see what I 
> am doing wrong. Any help would be appreciated!
>


--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems




Re: [OMPI users] Scrambled communications using ssh starter on multiple nodes.

2007-01-30 Thread Jeff Squyres

On Jan 30, 2007, at 9:35 AM, Fisher, Mark S wrote:


> The master process uses both MPI_ANY_SOURCE and MPI_ANY_TAG while
> waiting for requests from slave processes. The slaves sometimes use
> MPI_ANY_TAG but the source is always specified.

I think you said that you only had corruption issues on the slave,
right?  If so, the ANY_SOURCE/ANY_TAG on the master probably aren't
the issue.

But if you're doing ANY_TAG on the slaves, you might want to double
check that that code is doing exactly what you think it's doing.  Are
there any race conditions such that a message could be received on
that ANY_TAG that you did not intend to receive there?  Look
especially hard at non-blocking receives with ANY_TAG.
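
To illustrate the kind of race in question, here is a hypothetical sketch (not taken from WINDUS; the tag and message types are made up):

    #include <mpi.h>

    #define MASTER   0
    #define TAG_READ 100        /* placeholder tag for one kind of request */

    static void slave_request(void)
    {
        int request = 42, reply;
        MPI_Status status;

        MPI_Send(&request, 1, MPI_INT, MASTER, TAG_READ, MPI_COMM_WORLD);

    #if 0
        /* Risky if more than one reply can ever be in flight: a wildcard
         * tag matches whichever message from the master arrives first. */
        MPI_Recv(&reply, 1, MPI_INT, MASTER, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &status);
    #else
        /* Safer: receive exactly the reply that belongs to this request. */
        MPI_Recv(&reply, 1, MPI_INT, MASTER, TAG_READ,
                 MPI_COMM_WORLD, &status);
    #endif
    }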


> We have run the code through valgrind for a number of cases including
> the one being used here.

Excellent.

> The code is Fortran 90 and we are using the FORTRAN 77 interface so I
> do not believe this is a problem.

Agreed; should not be an issue.

> We are using Gigabit Ethernet.

Ok, good.

> I could look at LAM again to see if it would work. The code needs to
> be in a specific working directory and we need some environment
> variables set. This was not supported well in pre-MPI-2 versions of
> MPI. For MPICH1 I actually launch a script for the slaves so that we
> have the proper setup before running the executable. Note I had tried
> that with Open MPI and it had an internal error in orterun. This is
> not a problem

Really?  OMPI's mpirun does not depend on the executable being an MPI
application -- indeed, you can "mpirun -np 2 uptime" with no
problem.  What problem did you run into here?

> since mpirun can set up everything we need. If you think it is
> worthwhile I will download and try it.

From what you describe, it sounds like order of messaging may be the
issue, not necessarily MPI handle types.  So let's hold off on that
one for the moment (although LAM should be pretty straightforward to
try -- you should be able to mpirun scripts with no problems; perhaps
you can try it as a background effort when you have spare cycles /
etc.), and look at your slave code for receiving.




-Original Message-
From: Jeff Squyres [mailto:jsquy...@cisco.com]
Sent: Monday, January 29, 2007 7:54 PM
To: Open MPI Users
Subject: Re: [OMPI users] Scrambled communications using ssh starter
onmultiple nodes.

Without analyzing your source, it's hard to say.  I will say that OMPI
may send fragments out of order, but we do, of course, provide the  
same

message ordering guarantees that MPI mandates.  So let me ask a few
leading questions:

- Are you using any wildcards in your receives, such as MPI_ANY_SOURCE
or MPI_ANY_TAG?

- Have you run your code through a memory-checking debugger such as
valgrind?

- I don't know what Scali MPI uses, but MPICH and Intel MPI use  
integers

for MPI handles.  Have you tried LAM/MPI as well?  It, like Open MPI,
uses pointers for MPI handles.  I mention this because apps that
unintentionally have code that takes advantage of integer handles can
sometimes behave unpredictably when switching to a pointer-based MPI
implementation.

- What network interconnect are you using between the two hosts?



On Jan 25, 2007, at 4:22 PM, Fisher, Mark S wrote:


Recently I wanted to try OpenMPI for use with our CFD flow solver
WINDUS. The code uses a master/slave methodology were the master
handles I/O and issues tasks for the slaves to perform. The original
parallel implementation was done in 1993 using PVM and in 1999 we
added support for MPI.

When testing the code with Openmpi 1.1.2 it ran fine when running  
on a



single machine. As soon as I ran on more than one machine I started
getting random errors right away (the attached tar ball has a good  
and



bad output). It looked like either the messages were out of order or
were for the other slave process. In the run mode used there is no
slave to slave communication. In the file the code died near the
beginning of the communication between master and slave. Sometimes it
will run further before it fails.

I have included a tar file with the build and configuration info. The
two nodes are identical Xeon 2.8 GHZ machines running SLED 10. I am
running real-time (no queue) using the ssh starter using the  
following



appt file.

-x PVMTASK -x BCFD_PS_MODE --mca pls_rsh_agent /usr/bin/ssh --host
skipper2  -wdir /opt/scratch/m209290/ol.scr.16348 -np 1 ./
__bcfdbeta.exe -x PVMTASK -x BCFD_PS_MODE --mca pls_rsh_agent
/usr/bin/ssh --host copland -wdir /tmp/mpi.m209290 -np 2
./__bcfdbeta.exe

The above file fails but the following works:

-x PVMTASK -x BCFD_PS_MODE --mca pls_rsh_agent /usr/bin/ssh --host
skipper2  -wdir /opt/scratch/m209290/ol.scr.16348 -np 1 ./
__bcfdbeta.exe -x PVMTASK -x BCFD_PS_MODE --mca pls_rsh_agent
/usr/bin/ssh --host
skipper2 -wdir /tmp/mpi.m209290 -np 2 ./__bcfdbeta.exe

The first process is the master and the second two are the slaves.
I am
not sure what is going wrong, the code runs fine with ma

Re: [OMPI users] Scrambled communications using ssh starter on multiple nodes.

2007-01-30 Thread Fisher, Mark S
The slaves send specific requests to the master and then wait for a
reply to that request. For instance it might send a request to read a
variable from the file. The master will read the variable and send it
back with the same tag in response. Thus there is never more than one
response at a time to a given slave. We do not use any broadcast
functions in the code. 
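
For reference, a minimal sketch of the request/reply pattern described above (illustrative only, not the actual WINDUS code):

    #include <mpi.h>

    /* The master accepts a request from any slave and answers it on the
     * same (source, tag) it arrived on, so each slave has at most one
     * reply outstanding at a time. */
    static void master_serve_one_request(void)
    {
        int request;
        double value;
        MPI_Status status;

        MPI_Recv(&request, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &status);

        value = 0.0;   /* e.g. read the requested variable from the file */

        MPI_Send(&value, 1, MPI_DOUBLE, status.MPI_SOURCE, status.MPI_TAG,
                 MPI_COMM_WORLD);
    }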

The fact that it runs OK on one host but not on more than one host seems to
indicate something else is the problem. The code has been used for 13
years in parallel and runs with PVM and other MPI distros without any
problems. The communication patterns are very simple and only require
that message order be preserved.

-Original Message-
From: Jeff Squyres [mailto:jsquy...@cisco.com] 
Sent: Tuesday, January 30, 2007 8:44 AM
To: Open MPI Users
Subject: Re: [OMPI users] Scrambled communications using ssh
starteronmultiple nodes.

On Jan 30, 2007, at 9:35 AM, Fisher, Mark S wrote:

> The master process uses both MPI_ANY_SOURCE and MPI_ANY_TAG while 
> waiting for requests from slave processes. The slaves sometimes use 
> MPI_ANY_TAG but the source is always specified.

I think you said that you only had corruption issues on the slave,
right?  If so, the ANY_SOURCE/ANY_TAG on the master probably aren't the
issue.

But if you're doing ANY_TAG on the slaves, you might want to double
check that that code is doing exactly what you think it's doing.  Are
there any race conditions such that a message could be received on that
ANY_TAG that you did not intend to receive there?  Look especially hard
at non-blocking receives with ANY_TAG.

> We have run the code through valgrind for a number of cases including 
> the one being used here.

Excellent.

> The code is Fortran 90 and we are using the FORTRAN 77 interface so I 
> do not believe this is a problem.

Agreed; should not be an issue.

> We are using Gigabit Ethernet.

Ok, good.

> I could look at LAM again to see if it would work. The code needs to 
> be in a specific working directory and we need some environment 
> variable set. This was not supported well in pre MPI 2. versions of 
> MPI. For
> MPICH1 I actually launch a script for the slaves so that we have the 
> proper setup before running the executable. Note I had tried that with

> OpenMPI and it had an internal error in orterun. This is not a problem

Really?  OMPI's mpirun does not depend on the executable being an MPI
application -- indeed, you can "mpirun -np 2 uptime" with no problem.
What problem did you run into here?

> since the mpirun can setup everything we need. If you think it is 
> worth while I will download and try it.

 From what you describe, it sounds like order of messaging may be the
issue, not necessarily MPI handle types.  So let's hold off on that one
for the moment (although LAM should be pretty straightforward to try --
you should be able to mpirun scripts with no problems; perhaps you can
try it as a background effort when you have spare cycles / etc.), and
look at your slave code for receiving.


> -Original Message-
> From: Jeff Squyres [mailto:jsquy...@cisco.com]
> Sent: Monday, January 29, 2007 7:54 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Scrambled communications using ssh starter 
> onmultiple nodes.
>
> Without analyzing your source, it's hard to say.  I will say that OMPI

> may send fragments out of order, but we do, of course, provide the 
> same message ordering guarantees that MPI mandates.  So let me ask a 
> few leading questions:
>
> - Are you using any wildcards in your receives, such as MPI_ANY_SOURCE

> or MPI_ANY_TAG?
>
> - Have you run your code through a memory-checking debugger such as 
> valgrind?
>
> - I don't know what Scali MPI uses, but MPICH and Intel MPI use 
> integers for MPI handles.  Have you tried LAM/MPI as well?  It, like 
> Open MPI, uses pointers for MPI handles.  I mention this because apps 
> that unintentionally have code that takes advantage of integer handles

> can sometimes behave unpredictably when switching to a pointer-based 
> MPI implementation.
>
> - What network interconnect are you using between the two hosts?
>
>
>
> On Jan 25, 2007, at 4:22 PM, Fisher, Mark S wrote:
>
>> Recently I wanted to try OpenMPI for use with our CFD flow solver 
>> WINDUS. The code uses a master/slave methodology were the master 
>> handles I/O and issues tasks for the slaves to perform. The original 
>> parallel implementation was done in 1993 using PVM and in 1999 we 
>> added support for MPI.
>>
>> When testing the code with Openmpi 1.1.2 it ran fine when running on 
>> a
>
>> single machine. As soon as I ran on more than one machine I started 
>> getting random errors right away (the attached tar ball has a good 
>> and
>
>> bad output). It looked like either the messages were out of order or 
>> were for the other slave process. In the run mode used there is no 
>> slave to slave communication. In the file the code died near the 
>> beginning of the communication

Re: [OMPI users] Scrambled communications using ssh starter on multiple nodes.

2007-01-30 Thread Jeff Squyres

Is there any way that you can share the code?

On Jan 30, 2007, at 9:57 AM, Fisher, Mark S wrote:


The slaves send specific requests to the master and then waits for a
reply to that request. For instance it might send a request to read a
variable from the file. The master will read the variable and send it
back with the same tag in response. Thus there is never more than one
response at a time to a given slave. We do not use any broadcast
functions in the code.

The fact that it run ok on one host but not more than one host  
seems to

indicate something else is the problem. The code has been used for 13
years in parallel and runs with PVM and other MPI distros without any
problems. The communication patterns are very simple and only require
that message order be preserved.

-Original Message-
From: Jeff Squyres [mailto:jsquy...@cisco.com]
Sent: Tuesday, January 30, 2007 8:44 AM
To: Open MPI Users
Subject: Re: [OMPI users] Scrambled communications using ssh
starteronmultiple nodes.

On Jan 30, 2007, at 9:35 AM, Fisher, Mark S wrote:


The master process uses both MPI_ANY_SOURCE and MPI_ANY_TAG while
waiting for requests from slave processes. The slaves sometimes use
MPI_ANY_TAG but the source is always specified.


I think you said that you only had corruption issues on the slave,
right?  If so, the ANY_SOURCE/ANY_TAG on the master probably aren't  
the

issue.

But if you're doing ANY_TAG on the slaves, you might want to double
check that that code is doing exactly what you think it's doing.  Are
there any race conditions such that a message could be received on  
that
ANY_TAG that you did not intend to receive there?  Look especially  
hard

at non-blocking receives with ANY_TAG.


We have run the code through valgrind for a number of cases including
the one being used here.


Excellent.


The code is Fortran 90 and we are using the FORTRAN 77 interface so I
do not believe this is a problem.


Agreed; should not be an issue.


We are using Gigabit Ethernet.


Ok, good.


I could look at LAM again to see if it would work. The code needs to
be in a specific working directory and we need some environment
variable set. This was not supported well in pre MPI 2. versions of
MPI. For
MPICH1 I actually launch a script for the slaves so that we have the
proper setup before running the executable. Note I had tried that  
with


OpenMPI and it had an internal error in orterun. This is not a  
problem


Really?  OMPI's mpirun does not depend on the executable being an MPI
application -- indeed, you can "mpirun -np 2 uptime" with no problem.
What problem did you run into here?


since the mpirun can setup everything we need. If you think it is
worth while I will download and try it.


 From what you describe, it sounds like order of messaging may be the
issue, not necessarily MPI handle types.  So let's hold off on that  
one
for the moment (although LAM should be pretty straightforward to  
try --

you should be able to mpirun scripts with no problems; perhaps you can
try it as a background effort when you have spare cycles / etc.), and
look at your slave code for receiving.



-Original Message-
From: Jeff Squyres [mailto:jsquy...@cisco.com]
Sent: Monday, January 29, 2007 7:54 PM
To: Open MPI Users
Subject: Re: [OMPI users] Scrambled communications using ssh starter
onmultiple nodes.

Without analyzing your source, it's hard to say.  I will say that  
OMPI



may send fragments out of order, but we do, of course, provide the
same message ordering guarantees that MPI mandates.  So let me ask a
few leading questions:

- Are you using any wildcards in your receives, such as  
MPI_ANY_SOURCE



or MPI_ANY_TAG?

- Have you run your code through a memory-checking debugger such as
valgrind?

- I don't know what Scali MPI uses, but MPICH and Intel MPI use
integers for MPI handles.  Have you tried LAM/MPI as well?  It, like
Open MPI, uses pointers for MPI handles.  I mention this because apps
that unintentionally have code that takes advantage of integer  
handles



can sometimes behave unpredictably when switching to a pointer-based
MPI implementation.

- What network interconnect are you using between the two hosts?



On Jan 25, 2007, at 4:22 PM, Fisher, Mark S wrote:


Recently I wanted to try OpenMPI for use with our CFD flow solver
WINDUS. The code uses a master/slave methodology were the master
handles I/O and issues tasks for the slaves to perform. The original
parallel implementation was done in 1993 using PVM and in 1999 we
added support for MPI.

When testing the code with Openmpi 1.1.2 it ran fine when running on
a



single machine. As soon as I ran on more than one machine I started
getting random errors right away (the attached tar ball has a good
and



bad output). It looked like either the messages were out of order or
were for the other slave process. In the run mode used there is no
slave to slave communication. In the file the code died near the
beginning of the communication between m

Re: [OMPI users] Scrambled communications using ssh starter on multiple nodes.

2007-01-30 Thread Fisher, Mark S
The code can be freely downloaded by US citizens (it is export
controlled) at http://zephyr.lerc.nasa.gov/wind/. I can also provide you
the test case which is very small. I am a developer of the code and can
help you dig through it if you decide to download it. On the above page
you will need to request the code, if you request it just mention my
name to help expedite the approval.

-Original Message-
From: Jeff Squyres [mailto:jsquy...@cisco.com] 
Sent: Tuesday, January 30, 2007 9:09 AM
To: Open MPI Users
Subject: Re: [OMPI users] Scrambled communications using
sshstarteronmultiple nodes.

Is there any way that you can share the code?

On Jan 30, 2007, at 9:57 AM, Fisher, Mark S wrote:

> The slaves send specific requests to the master and then waits for a 
> reply to that request. For instance it might send a request to read a 
> variable from the file. The master will read the variable and send it 
> back with the same tag in response. Thus there is never more than one 
> response at a time to a given slave. We do not use any broadcast 
> functions in the code.
>
> The fact that it run ok on one host but not more than one host seems 
> to indicate something else is the problem. The code has been used for 
> 13 years in parallel and runs with PVM and other MPI distros without 
> any problems. The communication patterns are very simple and only 
> require that message order be preserved.
>
> -Original Message-
> From: Jeff Squyres [mailto:jsquy...@cisco.com]
> Sent: Tuesday, January 30, 2007 8:44 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] Scrambled communications using ssh 
> starteronmultiple nodes.
>
> On Jan 30, 2007, at 9:35 AM, Fisher, Mark S wrote:
>
>> The master process uses both MPI_ANY_SOURCE and MPI_ANY_TAG while 
>> waiting for requests from slave processes. The slaves sometimes use 
>> MPI_ANY_TAG but the source is always specified.
>
> I think you said that you only had corruption issues on the slave, 
> right?  If so, the ANY_SOURCE/ANY_TAG on the master probably aren't 
> the issue.
>
> But if you're doing ANY_TAG on the slaves, you might want to double 
> check that that code is doing exactly what you think it's doing.  Are 
> there any race conditions such that a message could be received on 
> that ANY_TAG that you did not intend to receive there?  Look 
> especially hard at non-blocking receives with ANY_TAG.
>
>> We have run the code through valgrind for a number of cases including 
>> the one being used here.
>
> Excellent.
>
>> The code is Fortran 90 and we are using the FORTRAN 77 interface so I

>> do not believe this is a problem.
>
> Agreed; should not be an issue.
>
>> We are using Gigabit Ethernet.
>
> Ok, good.
>
>> I could look at LAM again to see if it would work. The code needs to 
>> be in a specific working directory and we need some environment 
>> variable set. This was not supported well in pre MPI 2. versions of 
>> MPI. For
>> MPICH1 I actually launch a script for the slaves so that we have the 
>> proper setup before running the executable. Note I had tried that 
>> with
>
>> OpenMPI and it had an internal error in orterun. This is not a 
>> problem
>
> Really?  OMPI's mpirun does not depend on the executable being an MPI 
> application -- indeed, you can "mpirun -np 2 uptime" with no problem.
> What problem did you run into here?
>
>> since the mpirun can setup everything we need. If you think it is 
>> worth while I will download and try it.
>
>  From what you describe, it sounds like order of messaging may be the 
> issue, not necessarily MPI handle types.  So let's hold off on that 
> one for the moment (although LAM should be pretty straightforward to 
> try -- you should be able to mpirun scripts with no problems; perhaps 
> you can try it as a background effort when you have spare cycles / 
> etc.), and look at your slave code for receiving.
>
>
>> -Original Message-
>> From: Jeff Squyres [mailto:jsquy...@cisco.com]
>> Sent: Monday, January 29, 2007 7:54 PM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] Scrambled communications using ssh starter 
>> onmultiple nodes.
>>
>> Without analyzing your source, it's hard to say.  I will say that 
>> OMPI
>
>> may send fragments out of order, but we do, of course, provide the 
>> same message ordering guarantees that MPI mandates.  So let me ask a 
>> few leading questions:
>>
>> - Are you using any wildcards in your receives, such as 
>> MPI_ANY_SOURCE
>
>> or MPI_ANY_TAG?
>>
>> - Have you run your code through a memory-checking debugger such as 
>> valgrind?
>>
>> - I don't know what Scali MPI uses, but MPICH and Intel MPI use 
>> integers for MPI handles.  Have you tried LAM/MPI as well?  It, like 
>> Open MPI, uses pointers for MPI handles.  I mention this because apps

>> that unintentionally have code that takes advantage of integer 
>> handles
>
>> can sometimes behave unpredictably when switching to a pointer-based 
>> MPI implementation.
>>
>> - What network interconnect 

Re: [OMPI users] mutex deadlock in btl tcp

2007-01-30 Thread George Bosilca

Jeremy,

You're right. Thanks for pointing it out. I'll make the change in the trunk.

  george.

On Jan 30, 2007, at 3:40 AM, Jeremy Buisson wrote:


Dear Open MPI users list,

From time to time, I experience a mutex deadlock in Open-MPI 1.1.2.  
The stack
trace is available at the end of the mail. The deadlock seems to be  
caused by

lines 118 & 119 of the ompi/mca/btl/tcp/btl_tcp.c file, in function
mca_btl_tcp_add_procs:
OBJ_RELEASE(tcp_endpoint);
OPAL_THREAD_UNLOCK(&tcp_proc->proc_lock);
(of course, I did not check whether line numbers have changed since  
1.1.2.)
Indeed, releasing tcp_endpoint causes a call to  
mca_btl_tcp_proc_remove that
attempts to acquire the mutex tcp_proc->proc_lock, which is already  
held by the

thread (OBJ_THREAD_LOCK(&tcp_proc->proc_lock) at line 103 of the
ompi/mca/btl/tcp/btl_tcp.c file). Switching the two lines above (ie  
releasing
the mutex before destructing tcp_endpoint) seems to be sufficient  
to fix the
deadlock. Maybe should the changes done in the  
mca_btl_tcp_proc_insert function

be reverted rather than releasing the mutex before tcp_endpoint?
As far as I looked, the problem seems to still appear in the trunk  
revision 13359.


Second point. Is there any reason why MPI_Comm_spawn is restricted  
to execute
the new process(es) only on hosts listed in either the --host  
option or in the

hostfile? Or did I miss something?

Best regards,
Jeremy

-- 


stack trace as dumped by open-mpi (gdb version follows):
opal_mutex_lock(): Resource deadlock avoided
Signal:6 info.si_errno:0(Success) si_code:-6()
[0] func:/home1/jbuisson/soft/openmpi-1.1.2/lib/libopal.so.0  
[0x8addeb]

[1] func:/lib/tls/libpthread.so.0 [0x176e40]
[2] func:/lib/tls/libc.so.6(abort+0x1d5) [0xa294e5]
[3] func:/home1/jbuisson/soft/openmpi-1.1.2/lib/openmpi/mca_btl_tcp.so
[0x65f8a3]
[4]
func:/home1/jbuisson/soft/openmpi-1.1.2/lib/openmpi/mca_btl_tcp.so 
(mca_btl_tcp_proc_remove+0x2a)

[0x65fff0]
[5] func:/home1/jbuisson/soft/openmpi-1.1.2/lib/openmpi/mca_btl_tcp.so
[0x65cb24]
[6] func:/home1/jbuisson/soft/openmpi-1.1.2/lib/openmpi/mca_btl_tcp.so
[0x659465]
[7]
func:/home1/jbuisson/soft/openmpi-1.1.2/lib/openmpi/mca_btl_tcp.so 
(mca_btl_tcp_add_procs+0x10f)

[0x65927b]
[8]
func:/home1/jbuisson/soft/openmpi-1.1.2/lib/openmpi/mca_bml_r2.so 
(mca_bml_r2_add_procs+0x1bb)

[0x628023]
[9]
func:/home1/jbuisson/soft/openmpi-1.1.2/lib/openmpi/mca_pml_ob1.so 
(mca_pml_ob1_add_procs+0xd6)

[0x61650b]
[10]
func:/home1/jbuisson/soft/openmpi-1.1.2/lib/libmpi.so.0 
(ompi_comm_get_rport+0x1f8)

[0xb82303]
[11]
func:/home1/jbuisson/soft/openmpi-1.1.2/lib/libmpi.so.0 
(ompi_comm_connect_accept+0xbb)

[0xb81b43]
[12]
func:/home1/jbuisson/soft/openmpi-1.1.2/lib/libmpi.so.0 
(PMPI_Comm_spawn+0x3de)

[0xbb671a]
[13]
func:/home1/jbuisson/target/bin/mpi-spawner(__gxx_personality_v0 
+0x3d2)

[0x804bb8e]
[14] func:/home1/jbuisson/target/bin/mpi-spawner [0x804bdff]
[15] func:/home1/jbuisson/target/bin/mpi-spawner [0x804bfd4]
[16] func:/lib/tls/libc.so.6(__libc_start_main+0xda) [0xa1578a]
[17]
func:/home1/jbuisson/target/bin/mpi-spawner(__gxx_personality_v0+0x75)
[0x804b831]
*** End of error message ***


Same stack, dumped by gdb:
#0  0x00176357 in __pause_nocancel () from /lib/tls/libpthread.so.0
#1  0x008ade9b in opal_show_stackframe (signo=6, info=0xbfff9290,
p=0xbfff9310) at stacktrace.c:306
#2  <signal handler called>
#3  0x00a27cdf in raise () from /lib/tls/libc.so.6
#4  0x00a294e5 in abort () from /lib/tls/libc.so.6
#5  0x0065f8a3 in opal_mutex_lock (m=0x8ff8250) at
../../../../opal/threads/mutex_unix.h:104
#6  0x0065fff0 in mca_btl_tcp_proc_remove (btl_proc=0x8ff8220,
btl_endpoint=0x900eba0) at btl_tcp_proc.c:296
#7  0x0065cb24 in mca_btl_tcp_endpoint_destruct  
(endpoint=0x900eba0) at

btl_tcp_endpoint.c:99
#8  0x00659465 in opal_obj_run_destructors (object=0x900eba0) at
../../../../opal/class/opal_object.h:405
#9  0x0065927b in mca_btl_tcp_add_procs (btl=0x8e57c30, nprocs=1,
ompi_procs=0x8ff7ac8, peers=0x8ff7ad8, reachable=0xbfff98e4) at
btl_tcp.c:118
#10 0x00628023 in mca_bml_r2_add_procs (nprocs=1, procs=0x8ff7ac8,
bml_endpoints=0x8ff60b8, reachable=0xbfff98e4) at bml_r2.c:231
#11 0x0061650b in mca_pml_ob1_add_procs (procs=0xbfff9930,  
nprocs=1) at

pml_ob1.c:133
#12 0x00b82303 in ompi_comm_get_rport (port=0x0, send_first=0,
proc=0x8e51c70, tag=2000) at communicator/comm_dyn.c:305
#13 0x00b81b43 in ompi_comm_connect_accept (comm=0x8ff8ce0, root=0,
port=0x0, send_first=0, newcomm=0xbfff9a38, tag=2000) at
communicator/comm_dyn.c:85
#14 0x00bb671a in PMPI_Comm_spawn (command=0x8ff88f0
"/home1/jbuisson/target/bin/sample-npb-ft-pp", argv=0xbfff9b40,
maxprocs=1, info=0x8ff73e0, root=0,
comm=0x8ff8ce0, intercomm=0xbfff9aa4, array_of_errcodes=0x0) at
pcomm_spawn.c:110





[OMPI users] no MPI_2COMPLEX and MPI_2DOUBLE_COMPLEX

2007-01-30 Thread Bert Wesarg
Hello,

I see the extern definitions in mpi.h for ompi_mpi_2cplex and
ompi_mpi_2dblcplex, but no #define for MPI_2COMPLEX and MPI_2DOUBLE_COMPLEX.
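
If those defines followed the pattern of the other predefined datatypes in that header, they would presumably look something like the following (an assumption about the missing lines, not the actual Open MPI mpi.h):

    /* Assumed form, mirroring how the other predefined types are mapped: */
    #define MPI_2COMPLEX        (&ompi_mpi_2cplex)
    #define MPI_2DOUBLE_COMPLEX (&ompi_mpi_2dblcplex)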

Greetings
Bert Wesarg


Re: [OMPI users] ompi_info segmentation fault

2007-01-30 Thread Avishay Traeger
Jeff,
Upgrading to 1.1.3 solved both issues - thank you very much!

Avishay

On Mon, 2007-01-29 at 20:59 -0500, Jeff Squyres wrote:
> I'm quite sure that we have since fixed the command line parsing  
> problem, and I *think* we fixed the mmap problem.
> 
> Is there any way that you can upgrade to v1.1.3?
> 
> 
> On Jan 29, 2007, at 3:24 PM, Avishay Traeger wrote:
> 
> > Hello,
> >
> > I have just installed Open MPI 1.1 on a 64-bit FC6 machine using yum.
> > The packages that were installed are:
> > openmpi-devel-1.1-7.fc6
> > openmpi-libs-1.1-7.fc6
> > openmpi-1.1-7.fc6
> >
> > I tried running ompi_info, but it results in a segmentation fault.
> > Running strace shows this at the end:
> >
> > mmap(NULL, 4294967296, PROT_READ|PROT_WRITE, MAP_PRIVATE| 
> > MAP_ANONYMOUS,
> > -1, 0) = -1 ENOMEM (Cannot allocate memory)
> > --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> > +++ killed by SIGSEGV +++
> >
> > The full output of ompi_info is:
> > # ompi_info
> > Open MPI: 1.1
> >Open MPI SVN revision: r10477
> > Open RTE: 1.1
> >Open RTE SVN revision: r10477
> > OPAL: 1.1
> >OPAL SVN revision: r10477
> >   Prefix: /usr
> >  Configured architecture: x86_64-redhat-linux-gnu
> >Configured by: brewbuilder
> >Configured on: Fri Oct 13 14:34:07 EDT 2006
> >   Configure host: hs20-bc1-7.build.redhat.com
> > Built by: brewbuilder
> > Built on: Fri Oct 13 14:44:39 EDT 2006
> >   Built host: hs20-bc1-7.build.redhat.com
> >   C bindings: yes
> > C++ bindings: yes
> >   Fortran77 bindings: yes (single underscore)
> >   Fortran90 bindings: yes
> >  Fortran90 bindings size: small
> >   C compiler: gcc
> >  C compiler absolute: /usr/bin/gcc
> > C++ compiler: g++
> >C++ compiler absolute: /usr/bin/g++
> >   Fortran77 compiler: gfortran
> >   Fortran77 compiler abs: /usr/bin/gfortran
> >   Fortran90 compiler: gfortran
> >   Fortran90 compiler abs: /usr/bin/gfortran
> >  C profiling: yes
> >C++ profiling: yes
> >  Fortran77 profiling: yes
> >  Fortran90 profiling: yes
> >   C++ exceptions: no
> >   Thread support: posix (mpi: no, progress: no)
> >   Internal debug support: no
> >  MPI parameter check: runtime
> > Memory profiling support: no
> > Memory debugging support: no
> >  libltdl support: yes
> > Segmentation fault
> >
> > It seems that at this point in the program, it tries to map 4GB of
> > memory, which results in ENOMEM.  I'm guessing that the return  
> > value of
> > mmap isn't checked, which results in this segmentation fault.
> >
> > Also, I tried running "mpirun", and the output was:
> > # mpirun
> > *** buffer overflow detected ***: mpirun terminated
> > === Backtrace: =
> > /lib64/libc.so.6(__chk_fail+0x2f)[0x3f59ce0dff]
> > /lib64/libc.so.6[0x3f59ce065b]
> > /lib64/libc.so.6(__snprintf_chk+0x7b)[0x3f59ce052b]
> > /usr/lib64/openmpi/libopal.so.0(opal_cmd_line_get_usage_msg
> > +0x20a)[0x304901963a]
> > mpirun[0x403c7c]
> > mpirun(orterun+0xa4)[0x40260c]
> > mpirun(main+0x1b)[0x402563]
> > /lib64/libc.so.6(__libc_start_main+0xf4)[0x3f59c1da44]
> > mpirun[0x4024b9]
> >
> > It also included a "Memory map", which I left out.
> >
> > Any suggestions?
> >
> > Thanks in advance,
> > Avishay
> >
> 
> 



Re: [OMPI users] ompi_info segmentation fault

2007-01-30 Thread Jeff Squyres
Please note that due to a mixup in the 1.1.3 release, we just  
released v1.1.4.  :-(


See http://www.open-mpi.org/community/lists/announce/2007/01/0010.php  
for the official announcement.


The short version is that the wrong tarball was posted to the OMPI  
web site for the 1.1.3 release (doh!).  So we released today as  
v1.1.4 what should have been released a few days ago as v1.1.3.




On Jan 30, 2007, at 2:03 PM, Avishay Traeger wrote:


Jeff,
Upgrading to 1.1.3 solved both issues - thank you very much!

Avishay

On Mon, 2007-01-29 at 20:59 -0500, Jeff Squyres wrote:

I'm quite sure that we have since fixed the command line parsing
problem, and I *think* we fixed the mmap problem.

Is there any way that you can upgrade to v1.1.3?


On Jan 29, 2007, at 3:24 PM, Avishay Traeger wrote:


Hello,

I have just installed Open MPI 1.1 on a 64-bit FC6 machine using  
yum.

The packages that were installed are:
openmpi-devel-1.1-7.fc6
openmpi-libs-1.1-7.fc6
openmpi-1.1-7.fc6

I tried running ompi_info, but it results in a segmentation fault.
Running strace shows this at the end:

mmap(NULL, 4294967296, PROT_READ|PROT_WRITE, MAP_PRIVATE|
MAP_ANONYMOUS,
-1, 0) = -1 ENOMEM (Cannot allocate memory)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

The full output of ompi_info is:
# ompi_info
Open MPI: 1.1
   Open MPI SVN revision: r10477
Open RTE: 1.1
   Open RTE SVN revision: r10477
OPAL: 1.1
   OPAL SVN revision: r10477
  Prefix: /usr
 Configured architecture: x86_64-redhat-linux-gnu
   Configured by: brewbuilder
   Configured on: Fri Oct 13 14:34:07 EDT 2006
  Configure host: hs20-bc1-7.build.redhat.com
Built by: brewbuilder
Built on: Fri Oct 13 14:44:39 EDT 2006
  Built host: hs20-bc1-7.build.redhat.com
  C bindings: yes
C++ bindings: yes
  Fortran77 bindings: yes (single underscore)
  Fortran90 bindings: yes
 Fortran90 bindings size: small
  C compiler: gcc
 C compiler absolute: /usr/bin/gcc
C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
  Fortran77 compiler: gfortran
  Fortran77 compiler abs: /usr/bin/gfortran
  Fortran90 compiler: gfortran
  Fortran90 compiler abs: /usr/bin/gfortran
 C profiling: yes
   C++ profiling: yes
 Fortran77 profiling: yes
 Fortran90 profiling: yes
  C++ exceptions: no
  Thread support: posix (mpi: no, progress: no)
  Internal debug support: no
 MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
 libltdl support: yes
Segmentation fault

It seems that at this point in the program, it tries to map 4GB of
memory, which results in ENOMEM.  I'm guessing that the return
value of
mmap isn't checked, which results in this segmentation fault.

Also, I tried running "mpirun", and the output was:
# mpirun
*** buffer overflow detected ***: mpirun terminated
=== Backtrace: =
/lib64/libc.so.6(__chk_fail+0x2f)[0x3f59ce0dff]
/lib64/libc.so.6[0x3f59ce065b]
/lib64/libc.so.6(__snprintf_chk+0x7b)[0x3f59ce052b]
/usr/lib64/openmpi/libopal.so.0(opal_cmd_line_get_usage_msg
+0x20a)[0x304901963a]
mpirun[0x403c7c]
mpirun(orterun+0xa4)[0x40260c]
mpirun(main+0x1b)[0x402563]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x3f59c1da44]
mpirun[0x4024b9]

It also included a "Memory map", which I left out.

Any suggestions?

Thanks in advance,
Avishay









--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems



[OMPI users] MPI_Type_create_subarray fails!

2007-01-30 Thread Ivan de Jesus Deras Tabora

Hi,
Recently I installed Open MPI 1.1.4 using the source RPM on Fedora Core
6.  Then I tried to run some benchmarks from NASA.  The first I tried was
an I/O benchmark. It compiles, but when I run it, it generates the
following error:

[abc:25584] *** An error occurred in MPI_Type_create_subarray
[abc:25583] *** on communicator MPI_COMM_WORLD
[abc:25583] *** MPI_ERR_TYPE: invalid datatype
[abc:25583] *** MPI_ERRORS_ARE_FATAL (goodbye)

Then I found all the references to MPI_Type_create_subarray and
created a little program just to test that part of the code. The code I
created is:

#include "mpi.h"

int main(int argc, char *argv[])
{
   MPI_Datatype subarray, type;
   int array_size[] = {9};
   int array_subsize[] = {3};
   int array_start[] = {1};
   int i, err;

   MPI_Init(&argc, &argv);

   /* Create a new type */
   MPI_Type_contiguous(5, MPI_INT, &type);
   MPI_Type_commit(&type);

   /* Create a subarray datatype */
   MPI_Type_create_subarray(1, array_size, array_subsize,
array_start, MPI_ORDER_C, type, &subarray);
   MPI_Type_commit(&subarray);

   MPI_Finalize();
   return 0;
}

After running this little program using mpirun, it raises the same error.
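
(For completeness, the test was presumably built and launched along these lines; the file name is a placeholder:)

    mpicc subarray_test.c -o subarray_test
    mpirun -np 1 ./subarray_test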

Thanks in advance,
Ivan