Re: [OMPI users] MPI_Bcast issue
The only MPI calls I am using are these (grep-ed from my code):

MPI_Abort(MPI_COMM_WORLD, 1);
MPI_Barrier(MPI_COMM_WORLD);
MPI_Bcast(&bufarray[0].hdr, sizeof(BD_CHDR), MPI_CHAR, 0, MPI_COMM_WORLD);
MPI_Comm_rank(MPI_COMM_WORLD,&myid);
MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
MPI_Finalize();
MPI_Init(&argc, &argv);
MPI_Irecv(
MPI_Isend(
MPI_Recv(buff, BUFSIZE, MPI_CHAR, 0, TAG, MPI_COMM_WORLD, &stat);
MPI_Send(buff, BUFSIZE, MPI_CHAR, 0, TAG, MPI_COMM_WORLD);
MPI_Test(&request, &complete, &status);
MPI_Wait(&request, &status);

The big wait happens on receipt of a bcast call that would otherwise work. It's a bit mysterious really... I presume that bcast is implemented with multicast calls but does it use any actual broadcast calls at all? I know I'm scraping the edges here looking for something but I just can't get my head around why it should fail where it has.

--- On Mon, 9/8/10, Ralph Castain wrote: From: Ralph Castain Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users" Received: Monday, 9 August, 2010, 1:32 PM

Hi Randolph

Unless your code is doing a connect/accept between the copies, there is no way they can cross-communicate. As you note, mpirun instances are completely isolated from each other - no process in one instance can possibly receive information from a process in another instance because it lacks all knowledge of it -unless- they wire up into a greater communicator by performing connect/accept calls between them. I suspect you are inadvertently doing just that - perhaps by doing connect/accept in a tree-like manner, not realizing that the end result is one giant communicator that now links together all the N servers. Otherwise, there is no possible way an MPI_Bcast in one mpirun can collide or otherwise communicate with an MPI_Bcast between processes started by another mpirun.

On Aug 8, 2010, at 7:13 PM, Randolph Pullen wrote:

Thanks, although “An intercommunicator cannot be used for collective communication.”, i.e. bcast calls. I can see how the MPI_Group_xx calls can be used to produce a useful group and then communicator; - thanks again but this is really the side issue to my main question about MPI_Bcast.

I seem to have duplicate concurrent processes interfering with each other. This would appear to be a breach of the MPI safety dictum, i.e. MPI_COMM_WORLD is supposed to only include the processes started by a single mpirun command and isolate these processes from other similar groups of processes safely. So, it would appear to be a bug. If so this has significant implications for environments such as mine, where it may often occur that the same program is run by different users simultaneously. It is really this issue that is concerning me; I can rewrite the code but if it can crash when 2 copies run at the same time, I have a much bigger problem.

My suspicion is that within the MPI_Bcast handshaking, a synchronising broadcast call may be colliding across the environments. My only evidence is that an otherwise working program waits on broadcast reception forever when two or more copies are run at [exactly] the same time. Has anyone else seen similar behavior in concurrently running programs that perform lots of broadcasts perhaps?

Randolph

--- On Sun, 8/8/10, David Zhang wrote: From: David Zhang Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users" Received: Sunday, 8 August, 2010, 12:34 PM

In particular, intercommunicators

On 8/7/10, Aurélien Bouteiller wrote: > You should consider reading about communicators in MPI.
> > Aurelien > -- > Aurelien Bouteiller, Ph.D. > Innovative Computing Laboratory, The University of Tennessee. > > Envoyé de mon iPad > > Le Aug 7, 2010 à 1:05, Randolph Pullen a > écrit : > >> I seem to be having a problem with MPI_Bcast. >> My massive I/O intensive data movement program must broadcast from n to n >> nodes. My problem starts because I require 2 processes per node, a sender >> and a receiver and I have implemented these using MPI processes rather >> than tackle the complexities of threads on MPI. >> >> Consequently, broadcast and calls like alltoall are not completely >> helpful. The dataset is huge and each node must end up with a complete >> copy built by the large number of contributing broadcasts from the sending >> nodes. Network efficiency and run time are paramount. >> >> As I don’t want to needlessly broadcast all this data to the sending nodes >> and I have a perfectly good MPI program that distributes globally from a >> single node (1 to N), I took the unusual decision to start N copies of >> this program by spawning the MPI system from the PVM system in an effort >> to get my N to N concurrent transfers. >> >> It seems that the broadcasts running on concurrent MPI environments >> collide and cause all but the first process to hang waiting for their >> broadcasts. This theory seems to be confirmed by introducing a sleep of >> n-1 seconds before the first MPI_Bcast call
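For readers following the thread, the connect/accept wiring Ralph refers to looks roughly like the sketch below. The server/client roles and the way the port name is handed from one job to the other are illustrative assumptions, not anything taken from Randolph's code; the point is only that without calls like these, two separately launched mpirun jobs cannot share a communicator.

/* Rough sketch of MPI-2 connect/accept between two separately launched
 * mpirun jobs. Without something like this, their MPI_COMM_WORLDs are
 * fully isolated from each other. Roles and port handling are illustrative. */
#include <mpi.h>
#include <cstdio>
#include <cstring>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char port[MPI_MAX_PORT_NAME] = {0};
    MPI_Comm inter;                       /* inter-communicator spanning both jobs */

    if (argc > 1 && strcmp(argv[1], "server") == 0) {
        if (rank == 0) {
            MPI_Open_port(MPI_INFO_NULL, port);
            printf("port: %s\n", port);   /* hand this string to the other job */
        }
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
    } else if (argc > 1) {
        /* client job: the port string printed by the server job is passed in */
        strncpy(port, argv[1], MPI_MAX_PORT_NAME - 1);
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
    } else {
        MPI_Finalize();
        return 1;
    }

    /* From here on, operations on 'inter' (or on a merged communicator from
     * MPI_Intercomm_merge) really do span both mpirun instances. */
    MPI_Comm_disconnect(&inter);
    MPI_Finalize();
    return 0;
}

Only after such a connect/accept (or an explicit MPI_Comm_spawn) do the two jobs have any communicator in common on which collectives could interact.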
Re: [OMPI users] Fortran code generation error on 1.5 rc5
Hi Damien, I'll check this. Thanks for reporting it. Shiqing On 2010-8-8 10:16 PM, Damien wrote: Hi all, There's a code generation bug in the CMake/Visual Studio build of rc 5 on VS 2008. A release build, with static libs, F77 and F90 support gives an error at line 91 in mpif-config.h: parameter (MPI_STATUS_SIZE=) This obviously makes the compiler unhappy. In older trunk builds this was parameter (MPI_STATUS_SIZE=5) Damien ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users -- -- Shiqing Fan http://www.hlrs.de/people/fan High Performance Computing Tel.: +49 711 685 87234 Center Stuttgart (HLRS) Fax.: +49 711 685 65832 Address: Allmandring 30 email: f...@hlrs.de 70569 Stuttgart
Re: [OMPI users] minor glitch in 1.5-rc5 Windows build - has workaround
Hi Damien, It is the user's responsibility to keep the CMake and Visual Studio build types consistent; a build type chosen in CMake cannot automatically follow a change made later inside Visual Studio - that is simply the nature of CMake-generated projects. Shiqing On 2010-8-6 10:33 PM, Damien wrote: Hi all, There's a small hiccup in building a Windows version of 1.5-rc5. When you configure in the CMake GUI, you can ask for a Debug or Release project before you hit Generate. If you ask for a Debug project, you can still change it to Release in Visual Studio, and it will build successfully. BUT: the Install project will fail, because it tries to install libopen-pald.pdb (possibly others too, I didn't check). It's a minor thing, only nuisance value. If you set a Release project in CMake, everything works fine. Damien ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users -- -- Shiqing Fan http://www.hlrs.de/people/fan High Performance Computing Tel.: +49 711 685 87234 Center Stuttgart (HLRS) Fax.: +49 711 685 65832 Address: Allmandring 30 email: f...@hlrs.de 70569 Stuttgart
Re: [OMPI users] Bug in POWERPC32.asm?
Thanks for reporting this Matthew. Fixed in r23576 ( https://svn.open-mpi.org/trac/ompi/changeset/23576) Regards --Nysal On Fri, Aug 6, 2010 at 10:38 PM, Matthew Clark wrote: > I was looking in my copy of openmpi-1.4.1 opal/asm/base/POWERPC32.asm > and saw the following: > > START_FUNC(opal_sys_timer_get_cycles) >LSYM(15) >mftbu r0 >mftb r11 >mftbu r2 >cmpw cr7,r2,r0 >bne+ cr7,REFLSYM(14) >li r4,0 >li r9,0 >or r3,r2,r9 >or r4,r4,r11 >blr > END_FUNC(opal_sys_timer_get_cycles) > > I'll readily admit at my lack of ppc assembly smartness, but shouldn't > the loop back at bne+ be to REFLSYM(15) instead of (14)? > > Matt > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
Re: [OMPI users] MPI_Bcast issue
On 8/8/2010 8:13 PM, Randolph Pullen wrote: > Thanks, although “An intercommunicator cannot be used for collective > communication.” i.e , bcast calls., yes it can. MPI-1 did not allow for collective operations on intercommunicators, but the MPI-2 specification did introduce that notion. Thanks Edgar > I can see how the MPI_Group_xx > calls can be used to produce a useful group and then communicator; - > thanks again but this is really the side issue to my main question > about MPI_Bcast. > > I seem to have duplicate concurrent processes interfering with each > other. This would appear to be a breach of the MPI safety dictum, ie > MPI_COMM_WORD is supposed to only include the processes started by a > single mpirun command and isolate these processes from other similar > groups of processes safely. > > So, it would appear to be a bug. If so this has significant > implications for environments such as mine, where it may often occur > that the same program is run by different users simultaneously. > > It is really this issue that it concerning me, I can rewrite the code > but if it can crash when 2 copies run at the same time, I have a much > bigger problem. > > My suspicion is that a within the MPI_Bcast handshaking, a > syncronising broadcast call may be colliding across the environments. > My only evidence is an otherwise working program waits on broadcast > reception forever when two or more copies are run at [exactly] the > same time. > > Has anyone else seen similar behavior in concurrently running > programs that perform lots of broadcasts perhaps? > > Randolph > > > --- On Sun, 8/8/10, David Zhang wrote: > > From: David Zhang Subject: Re: [OMPI users] > MPI_Bcast issue To: "Open MPI Users" Received: > Sunday, 8 August, 2010, 12:34 PM > > In particular, intercommunicators > > On 8/7/10, Aurélien Bouteiller wrote: >> You should consider reading about communicators in MPI. >> >> Aurelien -- Aurelien Bouteiller, Ph.D. Innovative Computing >> Laboratory, The University of Tennessee. >> >> Envoyé de mon iPad >> >> Le Aug 7, 2010 à 1:05, Randolph Pullen >> a écrit : >> >>> I seem to be having a problem with MPI_Bcast. My massive I/O >>> intensive data movement program must broadcast from n to n nodes. >>> My problem starts because I require 2 processes per node, a >>> sender and a receiver and I have implemented these using MPI >>> processes rather than tackle the complexities of threads on MPI. >>> >>> Consequently, broadcast and calls like alltoall are not >>> completely helpful. The dataset is huge and each node must end >>> up with a complete copy built by the large number of contributing >>> broadcasts from the sending nodes. Network efficiency and run >>> time are paramount. >>> >>> As I don’t want to needlessly broadcast all this data to the >>> sending nodes and I have a perfectly good MPI program that >>> distributes globally from a single node (1 to N), I took the >>> unusual decision to start N copies of this program by spawning >>> the MPI system from the PVM system in an effort to get my N to N >>> concurrent transfers. >>> >>> It seems that the broadcasts running on concurrent MPI >>> environments collide and cause all but the first process to hang >>> waiting for their broadcasts. This theory seems to be confirmed >>> by introducing a sleep of n-1 seconds before the first MPI_Bcast >>> call on each node, which results in the code working perfectly. 
>>> (total run time 55 seconds, 3 nodes, standard TCP stack) >>> >>> My guess is that unlike PVM, OpenMPI implements broadcasts with >>> broadcasts rather than multicasts. Can someone confirm this? Is >>> this a bug? >>> >>> Is there any multicast or N to N broadcast where sender processes >>> can avoid participating when they don’t need to? >>> >>> Thanks in advance Randolph >>> >>> >>> >>> ___ users mailing >>> list us...@open-mpi.org >>> http://www.open-mpi.org/mailman/listinfo.cgi/users >> > > > > ___ users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
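As a side note for readers, the MPI-2 inter-communicator broadcast Edgar mentions has slightly unusual root semantics. The sketch below is a generic illustration only; the communicator 'inter' and the helper arguments are assumed to come from elsewhere (e.g. MPI_Comm_spawn or MPI_Comm_connect), they are not part of Randolph's program.

/* Hedged sketch of MPI-2 inter-communicator broadcast semantics. */
#include <mpi.h>

void bcast_across_groups(MPI_Comm inter, int i_am_in_root_group, int my_rank_in_group)
{
    int payload = 42;

    if (i_am_in_root_group) {
        if (my_rank_in_group == 0)
            /* The actual root of the broadcast passes MPI_ROOT ...            */
            MPI_Bcast(&payload, 1, MPI_INT, MPI_ROOT, inter);
        else
            /* ... its peers in the same (local) group pass MPI_PROC_NULL ...  */
            MPI_Bcast(&payload, 1, MPI_INT, MPI_PROC_NULL, inter);
    } else {
        /* ... and every process in the remote group names the root's rank
         * within the root's own group (0 here) and receives the data.         */
        MPI_Bcast(&payload, 1, MPI_INT, 0, inter);
    }
}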
Re: [OMPI users] MPI_Bcast issue
I did not take the time to try to fully understand your approach so this may sound like a dumb question; Do you have an MPI_Bcast ROOT process in every MPI_COMM_WORLD and does every non-ROOT MPI_Bcast call correctly identify the rank of ROOT in its MPI_COMM_WORLD ? An MPI_Bcast call when there is not root task in the communicator or when the root task rank is given incorrectly will hang. Dick Treumann - MPI Team IBM Systems & Technology Group Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 Tele (845) 433-7846 Fax (845) 433-8363 From: Randolph Pullen To: us...@open-mpi.org Date: 08/07/2010 01:23 AM Subject: [OMPI users] MPI_Bcast issue Sent by: users-boun...@open-mpi.org I seem to be having a problem with MPI_Bcast. My massive I/O intensive data movement program must broadcast from n to n nodes. My problem starts because I require 2 processes per node, a sender and a receiver and I have implemented these using MPI processes rather than tackle the complexities of threads on MPI. Consequently, broadcast and calls like alltoall are not completely helpful. The dataset is huge and each node must end up with a complete copy built by the large number of contributing broadcasts from the sending nodes. Network efficiency and run time are paramount. As I don’t want to needlessly broadcast all this data to the sending nodes and I have a perfectly good MPI program that distributes globally from a single node (1 to N), I took the unusual decision to start N copies of this program by spawning the MPI system from the PVM system in an effort to get my N to N concurrent transfers. It seems that the broadcasts running on concurrent MPI environments collide and cause all but the first process to hang waiting for their broadcasts. This theory seems to be confirmed by introducing a sleep of n-1 seconds before the first MPI_Bcast call on each node, which results in the code working perfectly. (total run time 55 seconds, 3 nodes, standard TCP stack) My guess is that unlike PVM, OpenMPI implements broadcasts with broadcasts rather than multicasts. Can someone confirm this? Is this a bug? Is there any multicast or N to N broadcast where sender processes can avoid participating when they don’t need to? Thanks in advance Randolph ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users
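For completeness, the call pattern Dick is asking about looks like the following minimal sketch (not Randolph's actual code): every rank, root and non-root alike, must pass the same valid root rank to MPI_Bcast, otherwise the collective never completes.

/* Minimal sketch of a correct MPI_Bcast call pattern: every rank names the
 * same, existing root, and root and non-root ranks make the matching call. */
#include <mpi.h>
#include <cstdio>
#include <cstring>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int root = 0;               /* same value on every rank, 0 <= root < size */
    char hdr[64] = {0};
    if (rank == root)
        strcpy(hdr, "header from root");

    /* All ranks participate; non-root ranks receive into hdr. */
    MPI_Bcast(hdr, sizeof(hdr), MPI_CHAR, root, MPI_COMM_WORLD);

    printf("rank %d of %d got: %s\n", rank, size, hdr);
    MPI_Finalize();
    return 0;
}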
Re: [OMPI users] MPI_Bcast issue
Sorry - I missed the statement that all works when you add sleeps. That probably rules out any possible error in the way MPI_Bcast was used. Dick Treumann - MPI Team IBM Systems & Technology Group Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 Tele (845) 433-7846 Fax (845) 433-8363 From: Randolph Pullen To: us...@open-mpi.org Date: 08/07/2010 01:23 AM Subject: [OMPI users] MPI_Bcast issue Sent by: users-boun...@open-mpi.org I seem to be having a problem with MPI_Bcast. My massive I/O intensive data movement program must broadcast from n to n nodes. My problem starts because I require 2 processes per node, a sender and a receiver and I have implemented these using MPI processes rather than tackle the complexities of threads on MPI. Consequently, broadcast and calls like alltoall are not completely helpful. The dataset is huge and each node must end up with a complete copy built by the large number of contributing broadcasts from the sending nodes. Network efficiency and run time are paramount. As I don’t want to needlessly broadcast all this data to the sending nodes and I have a perfectly good MPI program that distributes globally from a single node (1 to N), I took the unusual decision to start N copies of this program by spawning the MPI system from the PVM system in an effort to get my N to N concurrent transfers. It seems that the broadcasts running on concurrent MPI environments collide and cause all but the first process to hang waiting for their broadcasts. This theory seems to be confirmed by introducing a sleep of n-1 seconds before the first MPI_Bcast call on each node, which results in the code working perfectly. (total run time 55 seconds, 3 nodes, standard TCP stack) My guess is that unlike PVM, OpenMPI implements broadcasts with broadcasts rather than multicasts. Can someone confirm this? Is this a bug? Is there any multicast or N to N broadcast where sender processes can avoid participating when they don’t need to? Thanks in advance Randolph ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] deadlock in openmpi 1.5rc5
In your first mail, you mentioned that you are testing the new knem support. Can you try disabling knem and see if that fixes the problem? (i.e., run with --mca btl_sm_use_knem 0") If it fixes the issue, that might mean we have a knem-based bug. On Aug 6, 2010, at 1:42 PM, John Hsu wrote: > Hi, > > sorry for the confusion, that was indeed the trunk version of things I was > running. > > Here's the same problem using > > http://www.open-mpi.org/software/ompi/v1.5/downloads/openmpi-1.5rc5.tar.bz2 > > command-line: > > ../openmpi_devel/bin/mpirun -hostfile hostfiles/hostfile.wgsgX -npernode 11 > ./bin/mpi_test > > back trace on sender: > > (gdb) bt > #0 0x7fa003bcacf3 in epoll_wait () from /lib/libc.so.6 > #1 0x7fa004f43a4b in epoll_dispatch () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #2 0x7fa004f4b5fa in opal_event_base_loop () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #3 0x7fa004f1ce69 in opal_progress () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #4 0x7f9ffe69be95 in mca_pml_ob1_recv () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so > #5 0x7fa004ebb35c in PMPI_Recv () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #6 0x0040ae48 in MPI::Comm::Recv (this=0x612800, buf=0x7fff8f5cbb50, > count=1, datatype=..., source=29, > tag=100, status=...) > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:36 > #7 0x00409a57 in main (argc=1, argv=0x7fff8f5cbd78) > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/mpi_test/src/mpi_test.cpp:30 > (gdb) > > back trace on receiver: > > (gdb) bt > #0 0x7fcce1ba5cf3 in epoll_wait () from /lib/libc.so.6 > #1 0x7fcce2f1ea4b in epoll_dispatch () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #2 0x7fcce2f265fa in opal_event_base_loop () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #3 0x7fcce2ef7e69 in opal_progress () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #4 0x7fccdc677b1d in mca_pml_ob1_send () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so > #5 0x7fcce2e9874f in PMPI_Send () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #6 0x0040adda in MPI::Comm::Send (this=0x612800, buf=0x7fff3f18ad20, > count=1, datatype=..., dest=0, tag=100) > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:29 > #7 0x00409b72 in main (argc=1, argv=0x7fff3f18af48) > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/mpi_test/src/mpi_test.cpp:38 > (gdb) > > and attached is my mpi_test file for reference. > > thanks, > John > > > On Fri, Aug 6, 2010 at 6:24 AM, Ralph Castain wrote: > You clearly have an issue with version confusion. The file cited in your > warning: > > > [wgsg0:29074] Warning -- mutex was double locked from errmgr_hnp.c:772 > > does not exist in 1.5rc5. 
It only exists in the developer's trunk at this > time. Check to ensure you have the right paths set, blow away the install > area (in case you have multiple versions installed on top of each other), etc. > > > > On Aug 5, 2010, at 5:16 PM, John Hsu wrote: > > > Hi All, > > I am new to openmpi and have encountered an issue using pre-release 1.5rc5, > > for a simple mpi code (see attached). In this test, nodes 1 to n sends out > > a random number to node 0, node 0 sums all numbers received. > > > > This code works fine on 1 machine with any number of nodes, and on 3 > > machines running 10 nodes per machine, but when we try to run 11 nodes per > > machine this warning appears: > > > > [wgsg0:29074] Warning -- mutex was double locked from errmgr_hnp.c:772 > > > > And node 0 (master summing node) hangs on receiving plus another random > > node hangs on sending indefinitely. Below are the back traces: > > > > (gdb) bt > > #0 0x7f0c5f109cd3 in epoll_wait () from /lib/libc.so.6 > > #1 0x7f0c6052db53 in epoll_dispatch (base=0x2310bf0, arg=0x22f91f0, > > tv=0x7fff90f623e0) at epoll.c:215 > > #2 0x7f0c6053ae58 in opal_event_base_loop (base=0x2310bf0, flags=2) at > > event.c:838 > > #3 0x7f0c6053ac2
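The attached mpi_test.cpp is not reproduced in the archive; a rough reconstruction of the test John describes (every non-zero rank sends one random number to rank 0, which receives and sums them, with the tag 100 seen in the backtraces) might look like the sketch below. It is an illustrative reconstruction, not the actual attachment.

/* Illustrative reconstruction of the described test, not the poster's code. */
#include <mpi.h>
#include <cstdio>
#include <cstdlib>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    const int tag = 100;                      /* tag seen in the backtraces */

    if (rank == 0) {
        long sum = 0;
        for (int src = 1; src < size; ++src) {
            int value = 0;
            MPI_Recv(&value, 1, MPI_INT, src, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            sum += value;
        }
        printf("sum = %ld\n", sum);
    } else {
        srand(rank);
        int value = rand() % 100;             /* one random number per sender */
        MPI_Send(&value, 1, MPI_INT, 0, tag, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}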
Re: [OMPI users] MPI_Bcast issue
Personally, I've been having trouble following the explanations of the problem. Perhaps it'd be helpful if you gave us an example of how to reproduce the problem. E.g., short sample code and how you run the example to produce the problem. The shorter the example, the greater the odds of resolution. From: Randolph Pullen To: us...@open-mpi.org Date: 08/07/2010 01:23 AM Subject: [OMPI users] MPI_Bcast issue Sent by: users-boun...@open-mpi.org I seem to be having a problem with MPI_Bcast. My massive I/O intensive data movement program must broadcast from n to n nodes. My problem starts because I require 2 processes per node, a sender and a receiver and I have implemented these using MPI processes rather than tackle the complexities of threads on MPI. Consequently, broadcast and calls like alltoall are not completely helpful. The dataset is huge and each node must end up with a complete copy built by the large number of contributing broadcasts from the sending nodes. Network efficiency and run time are paramount. As I don’t want to needlessly broadcast all this data to the sending nodes and I have a perfectly good MPI program that distributes globally from a single node (1 to N), I took the unusual decision to start N copies of this program by spawning the MPI system from the PVM system in an effort to get my N to N concurrent transfers. It seems that the broadcasts running on concurrent MPI environments collide and cause all but the first process to hang waiting for their broadcasts. This theory seems to be confirmed by introducing a sleep of n-1 seconds before the first MPI_Bcast call on each node, which results in the code working perfectly. (total run time 55 seconds, 3 nodes, standard TCP stack) My guess is that unlike PVM, OpenMPI implements broadcasts with broadcasts rather than multicasts. Can someone confirm this? Is this a bug? Is there any multicast or N to N broadcast where sender processes can avoid participating when they don’t need to?
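The kind of short, self-contained reproducer being asked for might look like the sketch below: a plain MPI_Bcast loop that two users would launch at the same time with separate mpirun commands, to see whether the second job ever hangs in the broadcast. The payload size and iteration count are arbitrary choices for illustration, not values taken from Randolph's program.

/* Illustrative reproducer skeleton: a tight MPI_Bcast loop. Launch two
 * copies concurrently under separate mpirun commands to test for interference. */
#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    std::vector<char> buf(1 << 20, 'x');      /* 1 MB payload, arbitrary */
    for (int iter = 0; iter < 1000; ++iter) {
        MPI_Bcast(&buf[0], (int)buf.size(), MPI_CHAR, 0, MPI_COMM_WORLD);
        if (rank == 0 && iter % 100 == 0)
            printf("iteration %d done\n", iter);
    }

    MPI_Finalize();
    return 0;
}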
Re: [OMPI users] openib issues
Hi, Could someone have a look on these two different error messages ? I'd like to know the reason(s) why they were displayed and their actual meaning. Thanks, Eloi On Monday 19 July 2010 16:38:57 Eloi Gaudry wrote: > Hi, > > I've been working on a random segmentation fault that seems to occur during > a collective communication when using the openib btl (see [OMPI users] > [openib] segfault when using openib btl). > > During my tests, I've come across different issues reported by > OpenMPI-1.4.2: > > 1/ > [[12770,1],0][btl_openib_component.c:3227:handle_wc] from bn0103 to: bn0122 > error polling LP CQ with status LOCAL LENGTH ERROR status number 1 for > wr_id 560618664 opcode 1 vendor error 105 qp_idx 3 > > 2/ > [[992,1],6][btl_openib_component.c:3227:handle_wc] from pbn04 to: pbn05 > error polling LP CQ with status REMOTE ACCESS ERROR status number 10 for > wr_id 162858496 opcode 1 vendor error 136 qp_idx > 0[[992,1],5][btl_openib_component.c:3227:handle_wc] from pbn05 to: pbn04 > error polling HP CQ with status WORK REQUEST FLUSHED ERROR status number 5 > for wr_id 485900928 opcode 0 vendor error 249 qp_idx 0 > > -- > The OpenFabrics stack has reported a network error event. Open MPI will > try to continue, but your job may end up failing. > > Local host:p'" > MPI process PID: 20743 > Error number: 3 (IBV_EVENT_QP_ACCESS_ERR) > > This error may indicate connectivity problems within the fabric; please > contact your system administrator. > -- > > I'd like to know what these two errors mean and where they come from. > > Thanks for your help, > Eloi -- Eloi Gaudry Free Field Technologies Company Website: http://www.fft.be Company Phone: +32 10 487 959
Re: [OMPI users] MPI_Bcast issue
No idea what is going on here. No MPI call is implemented as a multicast - it all flows over the MPI pt-2-pt system via one of the various algorithms. Best guess I can offer is that there is a race condition in your program that you are tripping when other procs that share the node change the timing. How did you configure OMPI when you built it? On Aug 8, 2010, at 11:02 PM, Randolph Pullen wrote: > The only MPI calls I am using are these (grep-ed from my code): > > MPI_Abort(MPI_COMM_WORLD, 1); > MPI_Barrier(MPI_COMM_WORLD); > MPI_Bcast(&bufarray[0].hdr, sizeof(BD_CHDR), MPI_CHAR, 0, MPI_COMM_WORLD); > MPI_Comm_rank(MPI_COMM_WORLD,&myid); > MPI_Comm_size(MPI_COMM_WORLD,&numprocs); > MPI_Finalize(); > MPI_Init(&argc, &argv); > MPI_Irecv( > MPI_Isend( > MPI_Recv(buff, BUFSIZE, MPI_CHAR, 0, TAG, MPI_COMM_WORLD, &stat); > MPI_Send(buff, BUFSIZE, MPI_CHAR, 0, TAG, MPI_COMM_WORLD); > MPI_Test(&request, &complete, &status); > MPI_Wait(&request, &status); > > The big wait happens on receipt of a bcast call that would otherwise work. > Its a bit mysterious really... > > I presume that bcast is implemented with multicast calls but does it use any > actual broadcast calls at all? > I know I'm scraping the edges here looking for something but I just cant get > my head around why it should fail where it has. > > --- On Mon, 9/8/10, Ralph Castain wrote: > > From: Ralph Castain > Subject: Re: [OMPI users] MPI_Bcast issue > To: "Open MPI Users" > Received: Monday, 9 August, 2010, 1:32 PM > > Hi Randolph > > Unless your code is doing a connect/accept between the copies, there is no > way they can cross-communicate. As you note, mpirun instances are completely > isolated from each other - no process in one instance can possibly receive > information from a process in another instance because it lacks all knowledge > of it -unless- they wireup into a greater communicator by performing > connect/accept calls between them. > > I suspect you are inadvertently doing just that - perhaps by doing > connect/accept in a tree-like manner, not realizing that the end result is > one giant communicator that now links together all the N servers. > > Otherwise, there is no possible way an MPI_Bcast in one mpirun can collide or > otherwise communicate with an MPI_Bcast between processes started by another > mpirun. > > > > On Aug 8, 2010, at 7:13 PM, Randolph Pullen wrote: > >> Thanks, although “An intercommunicator cannot be used for collective >> communication.” i.e , bcast calls., I can see how the MPI_Group_xx calls >> can be used to produce a useful group and then communicator; - thanks again >> but this is really the side issue to my main question about MPI_Bcast. >> >> I seem to have duplicate concurrent processes interfering with each other. >> This would appear to be a breach of the MPI safety dictum, ie MPI_COMM_WORD >> is supposed to only include the processes started by a single mpirun command >> and isolate these processes from other similar groups of processes safely. >> >> So, it would appear to be a bug. If so this has significant implications >> for environments such as mine, where it may often occur that the same >> program is run by different users simultaneously. >> >> It is really this issue that it concerning me, I can rewrite the code but if >> it can crash when 2 copies run at the same time, I have a much bigger >> problem. >> >> My suspicion is that a within the MPI_Bcast handshaking, a syncronising >> broadcast call may be colliding across the environments. 
My only evidence >> is an otherwise working program waits on broadcast reception forever when >> two or more copies are run at [exactly] the same time. >> >> Has anyone else seen similar behavior in concurrently running programs that >> perform lots of broadcasts perhaps? >> >> Randolph >> >> >> --- On Sun, 8/8/10, David Zhang wrote: >> >> From: David Zhang >> Subject: Re: [OMPI users] MPI_Bcast issue >> To: "Open MPI Users" >> Received: Sunday, 8 August, 2010, 12:34 PM >> >> In particular, intercommunicators >> >> On 8/7/10, Aurélien Bouteiller wrote: >> > You should consider reading about communicators in MPI. >> > >> > Aurelien >> > -- >> > Aurelien Bouteiller, Ph.D. >> > Innovative Computing Laboratory, The University of Tennessee. >> > >> > Envoyé de mon iPad >> > >> > Le Aug 7, 2010 à 1:05, Randolph Pullen a >> > écrit : >> > >> >> I seem to be having a problem with MPI_Bcast. >> >> My massive I/O intensive data movement program must broadcast from n to n >> >> nodes. My problem starts because I require 2 processes per node, a sender >> >> and a receiver and I have implemented these using MPI processes rather >> >> than tackle the complexities of threads on MPI. >> >> >> >> Consequently, broadcast and calls like alltoall are not completely >> >> helpful. The dataset is huge and each node must end up with a complete >> >> copy built by the large number of contributing broadcasts
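To make Ralph's point concrete, a broadcast can be written entirely in terms of point-to-point messages. The hand-rolled binomial tree below is a teaching sketch of that idea, not Open MPI's actual tuned collective code; the internal tag is an arbitrary choice.

/* A binomial-tree broadcast built only from MPI_Send/MPI_Recv, illustrating
 * that "broadcast" is really a pattern of point-to-point messages. */
#include <mpi.h>

void binomial_bcast(void* buf, int count, MPI_Datatype type, int root, MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    const int tag = 999;                       /* arbitrary internal tag */
    int rel = (rank - root + size) % size;     /* rank relative to the root */

    /* Receive once from the parent in the tree (the root receives nothing). */
    int mask = 1;
    while (mask < size) {
        if (rel & mask) {
            int src = rank - mask;
            if (src < 0) src += size;
            MPI_Recv(buf, count, type, src, tag, comm, MPI_STATUS_IGNORE);
            break;
        }
        mask <<= 1;
    }

    /* Then forward the data to children at decreasing distances. */
    mask >>= 1;
    while (mask > 0) {
        if (rel + mask < size) {
            int dst = rank + mask;
            if (dst >= size) dst -= size;
            MPI_Send(buf, count, type, dst, tag, comm);
        }
        mask >>= 1;
    }
}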
[OMPI users] MPI Template Datatype?
Hi, I have to send some vectors from node to node, and the vectors are built using a template. The datatypes used in the template will be long, int, double, and char. How can I send those vectors, since I wouldn't know what MPI datatype I have to specify in MPI_Send and MPI_Recv? Is there any way to do this? -- Alexandru Blidaru University of Waterloo - Electrical Engineering '15 University email: asbli...@uwaterloo.ca Twitter handle: @G_raph Blog: http://alexblidaru.wordpress.com/
Re: [OMPI users] MPI_Allreduce on local machine
On Jul 28, 2010, at 12:21 PM, Åke Sandgren wrote: > > Jeff: Is this correct? > > This is wrong, it should be 8 and alignement should be 8 even for intel. > And i also see exactly the same thing. Good catch! I just fixed this in https://svn.open-mpi.org/trac/ompi/changeset/23580 -- it looks like a copy-n-paste error in displaying the Fortran sizes/alignments in ompi_info. It probably happened when ompi_info was converted from C++ to C. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] MPI_Allreduce on local machine
On Jul 28, 2010, at 5:07 PM, Gus Correa wrote: > Still, the alignment under Intel may or may not be right. > And this may or may not explain the errors that Hugo has got. > > FYI, the ompi_info from my OpenMPI 1.3.2 and 1.2.8 > report exactly the same as OpenMPI 1.4.2, namely > Fort dbl prec size: 4 and > Fort dbl prec align: 4, > except that *if the Intel Fortran compiler (ifort) was used* > I get 1 byte alignment: > Fort dbl prec align: 1 > > So, this issue has been around for a while, > and involves both the size and the alignment (in Intel) > of double precision. Yes, it's quite problematic to try to determine the alignment of Fortran types -- compilers can do different things and there's no reliable way (that I know of, at least) to absolutely get the "native" alignment. That being said, we didn't previously find any correctness issues with using an alignment of 1. > We have a number of pieces of code here where grep shows > MPI_DOUBLE_PRECISION. > Not sure how much of it has actually been active, as there are always > lots of cpp directives to select active code. > > In particular I found this interesting snippet: > > if (MPI_DOUBLE_PRECISION==20 .and. MPI_REAL8==18) then > ! LAM MPI defined MPI_REAL8 differently from MPI_DOUBLE_PRECISION > ! and LAM MPI's allreduce does not accept on MPI_REAL8 > MPIreal_t= MPI_DOUBLE_PRECISION > else > MPIreal_t= MPI_REAL8 > endif This kind of thing shouldn't be an issue with Open MPI, right? FWIW, OMPI uses different numbers for MPI_DOUBLE_PRECISION and MPI_REAL8 than LAM. They're distinct MPI datatypes because they *could* be different. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
[OMPI users] Checkpointing mpi4py program
Hi I have integrated mpi4py with openmpi 1.4.2 that was built with BLCR 0.8.2. When I run ompi-checkpoint on the program written using mpi4py, I see that program doesn't resume sometimes after successful checkpoint creation. This doesn't occur always meaning the program resumes after successful checkpoint creation most of the time and completes successfully. Has anyone tested the checkpoint/restart functionality with mpi4py programs? Are there any best practices that I should keep in mind while checkpointing mpi4py programs? Thanks for your time - Ananda
Re: [OMPI users] MPI Template Datatype?
Hello Alexandru, On Mon, Aug 9, 2010 at 6:05 PM, Alexandru Blidaru wrote: > I have to send some vectors from node to node, and the vecotrs are built > using a template. The datatypes used in the template will be long, int, > double, and char. How may I send those vectors since I wouldn't know what > MPI datatype i have to specify in MPI_Send and MPI Recv. Is there any way to > do this? > I'm not sure I understand what your question is about: are you asking what MPI datatypes you should use to send C types "long", "int", etc., or are you trying to send a more complex C type ("vector")? Can you send some code demonstrating the problem you are trying to solve? Besides, your wording suggests that you are trying to send a C++ std::vector over MPI: have you already had a look at Boost.MPI? It has out-of-the-box support for STL containers. Cheers, Riccardo
Re: [OMPI users] MPI Template Datatype?
Hello Riccardo, I basically have to implement a 4D vector. An additional goal of my project is to support char, int, float and double datatypes in the vector. I figured that the only way to do this is through a template. Up to this point I was only supporting doubles in my vector, and I was sending each element individually from node to node. Since MPI_Send and MPI_Recv require the programmer to specify which datatype to use, and since I would only use doubles in the initial version of my project, using MPI_Send and MPI_Recv was easy. However if I am to declare my 4D vector like this std::vector , there will be no way for me to know which datatype to specify in the MPI_Send and MPI_Recv commands. No I haven't looked at Boost.MPI . I did a quick Ctrl-F of Boost.MPI in the MPI 2.2 doc that i found here: http://www.mpi-forum.org/docs/docs.html , but i was unable to find it. Could you point me to some resources about it? It would be a lot easier to use that rather than send every element 1 by 1. Thank you very much for your help. Alex On Mon, Aug 9, 2010 at 4:09 PM, Riccardo Murri wrote: > Hello Alexandru, > > On Mon, Aug 9, 2010 at 6:05 PM, Alexandru Blidaru > wrote: > > I have to send some vectors from node to node, and the vecotrs are built > > using a template. The datatypes used in the template will be long, int, > > double, and char. How may I send those vectors since I wouldn't know what > > MPI datatype i have to specify in MPI_Send and MPI Recv. Is there any way > to > > do this? > > > > I'm not sure I understand what your question is about: are you asking > what MPI datatypes you should use to send C types "long", "int", etc., > or are you trying to send a more complex C type ("vector")? > Can you send some code demonstrating the problem you are trying to solve? > > Besides, your wording suggests that you are trying to send a C++ > std::vector over MPI: have you already had a look at Boost.MPI? It > has out-of-the-box support for STL containers. > > Cheers, > Riccardo > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users > -- Alexandru Blidaru University of Waterloo - Electrical Engineering '15 University email: asbli...@uwaterloo.ca Twitter handle: @G_raph Blog: http://alexblidaru.wordpress.com/
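One common C++ answer to "the call site does not know which MPI datatype matches T" is a small traits template specialised for each supported element type. The sketch below is illustrative only: the name mpi_type_of and the assumption that both sides agree on the vector length are inventions for this example, not part of MPI or of Alexandru's project.

/* Traits sketch mapping a C++ element type to the matching MPI datatype. */
#include <mpi.h>
#include <vector>

template <typename T> struct mpi_type_of;   /* no generic definition on purpose */
template <> struct mpi_type_of<char>   { static MPI_Datatype get() { return MPI_CHAR;   } };
template <> struct mpi_type_of<int>    { static MPI_Datatype get() { return MPI_INT;    } };
template <> struct mpi_type_of<long>   { static MPI_Datatype get() { return MPI_LONG;   } };
template <> struct mpi_type_of<float>  { static MPI_Datatype get() { return MPI_FLOAT;  } };
template <> struct mpi_type_of<double> { static MPI_Datatype get() { return MPI_DOUBLE; } };

/* Send/receive a whole std::vector<T> without naming the datatype at the call site. */
template <typename T>
void send_vector(const std::vector<T>& v, int dest, int tag, MPI_Comm comm)
{
    MPI_Send(const_cast<T*>(&v[0]), (int)v.size(), mpi_type_of<T>::get(), dest, tag, comm);
}

template <typename T>
void recv_vector(std::vector<T>& v, int src, int tag, MPI_Comm comm)
{
    /* Assumes both sides agree on the length (4 for the poster's 4D vector). */
    MPI_Recv(&v[0], (int)v.size(), mpi_type_of<T>::get(), src, tag, comm, MPI_STATUS_IGNORE);
}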
Re: [OMPI users] Checkpointing mpi4py program
I have not tried to checkpoint an mpi4py application, so I cannot say for sure if it works or not. You might be hitting something with the Python runtime interacting in an odd way with either Open MPI or BLCR. Can you attach a debugger and get a backtrace on a stuck checkpoint? That might show us where things are held up. -- Josh On Aug 9, 2010, at 4:04 PM, wrote: > Hi > > I have integrated mpi4py with openmpi 1.4.2 that was built with BLCR 0.8.2. > When I run ompi-checkpoint on the program written using mpi4py, I see that > program doesn’t resume sometimes after successful checkpoint creation. This > doesn’t occur always meaning the program resumes after successful checkpoint > creation most of the time and completes successfully. Has anyone tested the > checkpoint/restart functionality with mpi4py programs? Are there any best > practices that I should keep in mind while checkpointing mpi4py programs? > > Thanks for your time > - Ananda > Please do not print this email unless it is absolutely necessary. > > The information contained in this electronic message and any attachments to > this message are intended for the exclusive use of the addressee(s) and may > contain proprietary, confidential or privileged information. If you are not > the intended recipient, you should not disseminate, distribute or copy this > e-mail. Please notify the sender immediately and destroy all copies of this > message and any attachments. > > WARNING: Computer viruses can be transmitted via email. The recipient should > check this email and any attachments for the presence of viruses. The company > accepts no liability for any damage caused by any virus transmitted by this > email. > > www.wipro.com > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] deadlock in openmpi 1.5rc5
problem "fixed" by adding the --mca btl_sm_use_knem 0 option (with -npernode 11), so I proceeded to bump up -npernode to 12: $ ../openmpi_devel/bin/mpirun -hostfile hostfiles/hostfile.wgsgX -npernode 12 --mca btl_sm_use_knem 0 ./bin/mpi_test and the same error occurs, (gdb) bt #0 0x7fcca6ae5cf3 in epoll_wait () from /lib/libc.so.6 #1 0x7fcca7e5ea4b in epoll_dispatch () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 #2 0x7fcca7e665fa in opal_event_base_loop () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 #3 0x7fcca7e37e69 in opal_progress () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 #4 0x7fcca15b6e95 in mca_pml_ob1_recv () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so #5 0x7fcca7dd635c in PMPI_Recv () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 #6 0x0040ae48 in MPI::Comm::Recv (this=0x612800, buf=0x7fff2a0d7e00, count=1, datatype=..., source=23, tag=100, status=...) at /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:36 #7 0x00409a57 in main (argc=1, argv=0x7fff2a0d8028) at /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/mpi_test/src/mpi_test.cpp:30 (gdb) (gdb) bt #0 0x7f5dc31d2cf3 in epoll_wait () from /lib/libc.so.6 #1 0x7f5dc454ba4b in epoll_dispatch () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 #2 0x7f5dc45535fa in opal_event_base_loop () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 #3 0x7f5dc4524e69 in opal_progress () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 #4 0x7f5dbdca4b1d in mca_pml_ob1_send () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so #5 0x7f5dc44c574f in PMPI_Send () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 #6 0x0040adda in MPI::Comm::Send (this=0x612800, buf=0x7fff6e0c0790, count=1, datatype=..., dest=0, tag=100) at /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:29 #7 0x00409b72 in main (argc=1, argv=0x7fff6e0c09b8) at /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/mpi_test/src/mpi_test.cpp:38 (gdb) On Mon, Aug 9, 2010 at 6:39 AM, Jeff Squyres wrote: > In your first mail, you mentioned that you are testing the new knem > support. > > Can you try disabling knem and see if that fixes the problem? (i.e., run > with --mca btl_sm_use_knem 0") If it fixes the issue, that might mean we > have a knem-based bug. > > > > On Aug 6, 2010, at 1:42 PM, John Hsu wrote: > > > Hi, > > > > sorry for the confusion, that was indeed the trunk version of things I > was running. 
> > > > Here's the same problem using > > > > > http://www.open-mpi.org/software/ompi/v1.5/downloads/openmpi-1.5rc5.tar.bz2 > > > > command-line: > > > > ../openmpi_devel/bin/mpirun -hostfile hostfiles/hostfile.wgsgX -npernode > 11 ./bin/mpi_test > > > > back trace on sender: > > > > (gdb) bt > > #0 0x7fa003bcacf3 in epoll_wait () from /lib/libc.so.6 > > #1 0x7fa004f43a4b in epoll_dispatch () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #2 0x7fa004f4b5fa in opal_event_base_loop () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #3 0x7fa004f1ce69 in opal_progress () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #4 0x7f9ffe69be95 in mca_pml_ob1_recv () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so > > #5 0x7fa004ebb35c in PMPI_Recv () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #6 0x0040ae48 in MPI::Comm::Recv (this=0x612800, > buf=0x7fff8f5cbb50, count=1, datatype=..., source=29, > > tag=100, status=...) > > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:36 > > #7 0x00409a57 in main (argc=1, argv=0x7fff8f5cbd78) > > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/s
Re: [OMPI users] MPI Template Datatype?
Hi Alexandru, you can read all about Boost.MPI at: http://www.boost.org/doc/libs/1_43_0/doc/html/mpi.html

On Mon, Aug 9, 2010 at 10:27 PM, Alexandru Blidaru wrote: > I basically have to implement a 4D vector. An additional goal of my project > is to support char, int, float and double datatypes in the vector.

If your "vector" is fixed-size (i.e., all vectors are comprised of 4 elements), then you can likely dispose of std::vector and use C-style arrays with templated send/receive calls (that would be just interfaces to MPI_Send/MPI_Recv):

// BEWARE: untested code!!!
template <typename T>
int send(T* vector, int dest, int tag, MPI_Comm comm) {
  throw std::logic_error("called generic MyVector::send");
}

template <typename T>
int recv(T* vector, int source, int tag, MPI_Comm comm) {
  throw std::logic_error("called generic MyVector::recv");
}

and then you specialize the template for the types you actually use:

template <>
int send(int* vector, int dest, int tag, MPI_Comm comm) {
  return MPI_Send(vector, 4, MPI_INT, dest, tag, comm);
}

template <>
int recv(int* vector, int src, int tag, MPI_Comm comm) {
  MPI_Status status;
  return MPI_Recv(vector, 4, MPI_INT, src, tag, comm, &status);
}

// etc.

However, let me warn you that it would likely take more time and effort to write all the template specializations and get them working than just use Boost.MPI. Best regards, Riccardo
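Since the thread keeps pointing at Boost.MPI, here is a minimal sketch of what sending a std::vector looks like with it, based on the documented boost::mpi::communicator send/recv interface; treat it as a sketch rather than a drop-in solution for the poster's program.

/* Minimal Boost.MPI sketch: std::vector<double> is serialised automatically,
 * so no MPI datatype needs to be named at the call site. */
#include <boost/mpi.hpp>
#include <boost/serialization/vector.hpp>
#include <vector>
#include <iostream>

namespace mpi = boost::mpi;

int main(int argc, char* argv[])
{
    mpi::environment env(argc, argv);
    mpi::communicator world;

    if (world.rank() == 0) {
        std::vector<double> v(4, 3.14);     /* the "4D vector" */
        world.send(1, 0, v);                /* (dest, tag, value) */
    } else if (world.rank() == 1) {
        std::vector<double> v;
        world.recv(0, 0, v);                /* length and contents arrive together */
        std::cout << "received " << v.size() << " elements\n";
    }
    return 0;
}

The program needs to be linked against the boost_mpi and boost_serialization libraries in addition to the MPI library itself.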
Re: [OMPI users] deadlock in openmpi 1.5rc5
I've opened a ticket about this -- if it's an actual problem, it's a 1.5 blocker: https://svn.open-mpi.org/trac/ompi/ticket/2530 What version of knem and Linux are you using? On Aug 9, 2010, at 4:50 PM, John Hsu wrote: > problem "fixed" by adding the --mca btl_sm_use_knem 0 option (with -npernode > 11), so I proceeded to bump up -npernode to 12: > > $ ../openmpi_devel/bin/mpirun -hostfile hostfiles/hostfile.wgsgX -npernode 12 > --mca btl_sm_use_knem 0 ./bin/mpi_test > > and the same error occurs, > > (gdb) bt > #0 0x7fcca6ae5cf3 in epoll_wait () from /lib/libc.so.6 > #1 0x7fcca7e5ea4b in epoll_dispatch () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #2 0x7fcca7e665fa in opal_event_base_loop () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #3 0x7fcca7e37e69 in opal_progress () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #4 0x7fcca15b6e95 in mca_pml_ob1_recv () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so > #5 0x7fcca7dd635c in PMPI_Recv () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #6 0x0040ae48 in MPI::Comm::Recv (this=0x612800, buf=0x7fff2a0d7e00, > count=1, datatype=..., source=23, tag=100, status=...) > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:36 > #7 0x00409a57 in main (argc=1, argv=0x7fff2a0d8028) > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/mpi_test/src/mpi_test.cpp:30 > (gdb) > > > (gdb) bt > #0 0x7f5dc31d2cf3 in epoll_wait () from /lib/libc.so.6 > #1 0x7f5dc454ba4b in epoll_dispatch () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #2 0x7f5dc45535fa in opal_event_base_loop () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #3 0x7f5dc4524e69 in opal_progress () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #4 0x7f5dbdca4b1d in mca_pml_ob1_send () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so > #5 0x7f5dc44c574f in PMPI_Send () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #6 0x0040adda in MPI::Comm::Send (this=0x612800, buf=0x7fff6e0c0790, > count=1, datatype=..., dest=0, tag=100) > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:29 > #7 0x00409b72 in main (argc=1, argv=0x7fff6e0c09b8) > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/mpi_test/src/mpi_test.cpp:38 > (gdb) > > > > > On Mon, Aug 9, 2010 at 6:39 AM, Jeff Squyres wrote: > In your first mail, you mentioned that you are testing the new knem support. > > Can you try disabling knem and see if that fixes the problem? (i.e., run > with --mca btl_sm_use_knem 0") If it fixes the issue, that might mean we > have a knem-based bug. > > > > On Aug 6, 2010, at 1:42 PM, John Hsu wrote: > > > Hi, > > > > sorry for the confusion, that was indeed the trunk version of things I was > > running. 
> > > > Here's the same problem using > > > > http://www.open-mpi.org/software/ompi/v1.5/downloads/openmpi-1.5rc5.tar.bz2 > > > > command-line: > > > > ../openmpi_devel/bin/mpirun -hostfile hostfiles/hostfile.wgsgX -npernode 11 > > ./bin/mpi_test > > > > back trace on sender: > > > > (gdb) bt > > #0 0x7fa003bcacf3 in epoll_wait () from /lib/libc.so.6 > > #1 0x7fa004f43a4b in epoll_dispatch () > >from > > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #2 0x7fa004f4b5fa in opal_event_base_loop () > >from > > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #3 0x7fa004f1ce69 in opal_progress () > >from > > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #4 0x7f9ffe69be95 in mca_pml_ob1_recv () > >from > > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so > > #5 0x7fa004ebb35c in PMPI_Recv () > >from > > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #6 0x0040ae48 in M
Re: [OMPI users] deadlock in openmpi 1.5rc5
I've replied in the ticket. https://svn.open-mpi.org/trac/ompi/ticket/2530#comment:2 thanks! John On Mon, Aug 9, 2010 at 2:42 PM, Jeff Squyres wrote: > I've opened a ticket about this -- if it's an actual problem, it's a 1.5 > blocker: > >https://svn.open-mpi.org/trac/ompi/ticket/2530 > > What version of knem and Linux are you using? > > > > On Aug 9, 2010, at 4:50 PM, John Hsu wrote: > > > problem "fixed" by adding the --mca btl_sm_use_knem 0 option (with > -npernode 11), so I proceeded to bump up -npernode to 12: > > > > $ ../openmpi_devel/bin/mpirun -hostfile hostfiles/hostfile.wgsgX > -npernode 12 --mca btl_sm_use_knem 0 ./bin/mpi_test > > > > and the same error occurs, > > > > (gdb) bt > > #0 0x7fcca6ae5cf3 in epoll_wait () from /lib/libc.so.6 > > #1 0x7fcca7e5ea4b in epoll_dispatch () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #2 0x7fcca7e665fa in opal_event_base_loop () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #3 0x7fcca7e37e69 in opal_progress () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #4 0x7fcca15b6e95 in mca_pml_ob1_recv () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so > > #5 0x7fcca7dd635c in PMPI_Recv () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #6 0x0040ae48 in MPI::Comm::Recv (this=0x612800, > buf=0x7fff2a0d7e00, > > count=1, datatype=..., source=23, tag=100, status=...) > > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:36 > > #7 0x00409a57 in main (argc=1, argv=0x7fff2a0d8028) > > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/mpi_test/src/mpi_test.cpp:30 > > (gdb) > > > > > > (gdb) bt > > #0 0x7f5dc31d2cf3 in epoll_wait () from /lib/libc.so.6 > > #1 0x7f5dc454ba4b in epoll_dispatch () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #2 0x7f5dc45535fa in opal_event_base_loop () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #3 0x7f5dc4524e69 in opal_progress () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #4 0x7f5dbdca4b1d in mca_pml_ob1_send () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so > > #5 0x7f5dc44c574f in PMPI_Send () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #6 0x0040adda in MPI::Comm::Send (this=0x612800, > buf=0x7fff6e0c0790, > > count=1, datatype=..., dest=0, tag=100) > > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:29 > > #7 0x00409b72 in main (argc=1, argv=0x7fff6e0c09b8) > > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/mpi_test/src/mpi_test.cpp:38 > > (gdb) > > > > > > > > > > On Mon, Aug 9, 2010 at 6:39 AM, Jeff Squyres wrote: > > In your first mail, you mentioned that you are testing the new knem > support. > > > > Can you try disabling knem and see if that fixes the problem? 
(i.e., run > with --mca btl_sm_use_knem 0") If it fixes the issue, that might mean we > have a knem-based bug. > > > > > > > > On Aug 6, 2010, at 1:42 PM, John Hsu wrote: > > > > > Hi, > > > > > > sorry for the confusion, that was indeed the trunk version of things I > was running. > > > > > > Here's the same problem using > > > > > > > http://www.open-mpi.org/software/ompi/v1.5/downloads/openmpi-1.5rc5.tar.bz2 > > > > > > command-line: > > > > > > ../openmpi_devel/bin/mpirun -hostfile hostfiles/hostfile.wgsgX > -npernode 11 ./bin/mpi_test > > > > > > back trace on sender: > > > > > > (gdb) bt > > > #0 0x7fa003bcacf3 in epoll_wait () from /lib/libc.so.6 > > > #1 0x7fa004f43a4b in epoll_dispatch () > > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > > #2 0x7fa004f4b5fa in opal_event_base_loop () > > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > > #3 0x7fa004f1ce69 in opal_progress () > > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > > #4 0x7f9ffe69be95 in mca_pml_ob1_recv () > > >from > /wg/stor5/wgsim/hsu/project
Re: [OMPI users] MPI_Bcast issue
The install was completly vanilla - no extras a plain .configure command line (on FC10 x8x_64 linux) Are you saying that all broadcast calls are actually implemented as serial point to point calls? --- On Tue, 10/8/10, Ralph Castain wrote: From: Ralph Castain Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users" Received: Tuesday, 10 August, 2010, 12:33 AM No idea what is going on here. No MPI call is implemented as a multicast - it all flows over the MPI pt-2-pt system via one of the various algorithms. Best guess I can offer is that there is a race condition in your program that you are tripping when other procs that share the node change the timing. How did you configure OMPI when you built it? On Aug 8, 2010, at 11:02 PM, Randolph Pullen wrote: The only MPI calls I am using are these (grep-ed from my code): MPI_Abort(MPI_COMM_WORLD, 1); MPI_Barrier(MPI_COMM_WORLD); MPI_Bcast(&bufarray[0].hdr, sizeof(BD_CHDR), MPI_CHAR, 0, MPI_COMM_WORLD); MPI_Comm_rank(MPI_COMM_WORLD,&myid); MPI_Comm_size(MPI_COMM_WORLD,&numprocs); MPI_Finalize(); MPI_Init(&argc, &argv); MPI_Irecv( MPI_Isend( MPI_Recv(buff, BUFSIZE, MPI_CHAR, 0, TAG, MPI_COMM_WORLD, &stat); MPI_Send(buff, BUFSIZE, MPI_CHAR, 0, TAG, MPI_COMM_WORLD); MPI_Test(&request, &complete, &status); MPI_Wait(&request, &status); The big wait happens on receipt of a bcast call that would otherwise work. Its a bit mysterious really... I presume that bcast is implemented with multicast calls but does it use any actual broadcast calls at all? I know I'm scraping the edges here looking for something but I just cant get my head around why it should fail where it has. --- On Mon, 9/8/10, Ralph Castain wrote: From: Ralph Castain Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users" Received: Monday, 9 August, 2010, 1:32 PM Hi Randolph Unless your code is doing a connect/accept between the copies, there is no way they can cross-communicate. As you note, mpirun instances are completely isolated from each other - no process in one instance can possibly receive information from a process in another instance because it lacks all knowledge of it -unless- they wireup into a greater communicator by performing connect/accept calls between them. I suspect you are inadvertently doing just that - perhaps by doing connect/accept in a tree-like manner, not realizing that the end result is one giant communicator that now links together all the N servers. Otherwise, there is no possible way an MPI_Bcast in one mpirun can collide or otherwise communicate with an MPI_Bcast between processes started by another mpirun. On Aug 8, 2010, at 7:13 PM, Randolph Pullen wrote: Thanks, although “An intercommunicator cannot be used for collective communication.” i.e , bcast calls., I can see how the MPI_Group_xx calls can be used to produce a useful group and then communicator; - thanks again but this is really the side issue to my main question about MPI_Bcast. I seem to have duplicate concurrent processes interfering with each other. This would appear to be a breach of the MPI safety dictum, ie MPI_COMM_WORD is supposed to only include the processes started by a single mpirun command and isolate these processes from other similar groups of processes safely. So, it would appear to be a bug. If so this has significant implications for environments such as mine, where it may often occur that the same program is run by different users simultaneously. 
It is really this issue that it concerning me, I can rewrite the code but if it can crash when 2 copies run at the same time, I have a much bigger problem. My suspicion is that a within the MPI_Bcast handshaking, a syncronising broadcast call may be colliding across the environments. My only evidence is an otherwise working program waits on broadcast reception forever when two or more copies are run at [exactly] the same time. Has anyone else seen similar behavior in concurrently running programs that perform lots of broadcasts perhaps? Randolph --- On Sun, 8/8/10, David Zhang wrote: From: David Zhang Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users" Received: Sunday, 8 August, 2010, 12:34 PM In particular, intercommunicators On 8/7/10, Aurélien Bouteiller wrote: > You should consider reading about communicators in MPI. > > Aurelien > -- > Aurelien Bouteiller, Ph.D. > Innovative Computing Laboratory, The University of Tennessee. > > Envoyé de mon iPad > > Le Aug 7, 2010 à 1:05, Randolph Pullen a > écrit : > >> I seem to be having a problem with MPI_Bcast. >> My massive I/O intensive data movement program must broadcast from n to n >> nodes. My problem starts because I require 2 processes per node, a sender >> and a receiver and I have implemented these using MPI processes rather >> than tackle the complexities of threads on MPI. >> >> Consequently, broadcast and calls like alltoall are not completely >> helpful. The dataset is huge and each node