Re: [OMPI users] MPI_Bcast issue
The only MPI calls I am using are these (grep-ed from my code):

MPI_Abort(MPI_COMM_WORLD, 1);
MPI_Barrier(MPI_COMM_WORLD);
MPI_Bcast(&bufarray[0].hdr, sizeof(BD_CHDR), MPI_CHAR, 0, MPI_COMM_WORLD);
MPI_Comm_rank(MPI_COMM_WORLD,&myid);
MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
MPI_Finalize();
MPI_Init(&argc, &argv);
MPI_Irecv(
MPI_Isend(
MPI_Recv(buff, BUFSIZE, MPI_CHAR, 0, TAG, MPI_COMM_WORLD, &stat);
MPI_Send(buff, BUFSIZE, MPI_CHAR, 0, TAG, MPI_COMM_WORLD);
MPI_Test(&request, &complete, &status);
MPI_Wait(&request, &status);

The big wait happens on receipt of a bcast call that would otherwise work. It's a bit mysterious really... I presume that bcast is implemented with multicast calls but does it use any actual broadcast calls at all? I know I'm scraping the edges here looking for something but I just can't get my head around why it should fail where it has.

--- On Mon, 9/8/10, Ralph Castain wrote: From: Ralph Castain Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users" Received: Monday, 9 August, 2010, 1:32 PM

Hi Randolph

Unless your code is doing a connect/accept between the copies, there is no way they can cross-communicate. As you note, mpirun instances are completely isolated from each other - no process in one instance can possibly receive information from a process in another instance because it lacks all knowledge of it -unless- they wire up into a greater communicator by performing connect/accept calls between them. I suspect you are inadvertently doing just that - perhaps by doing connect/accept in a tree-like manner, not realizing that the end result is one giant communicator that now links together all the N servers. Otherwise, there is no possible way an MPI_Bcast in one mpirun can collide or otherwise communicate with an MPI_Bcast between processes started by another mpirun.

On Aug 8, 2010, at 7:13 PM, Randolph Pullen wrote:

Thanks, although “An intercommunicator cannot be used for collective communication.”, i.e. bcast calls. I can see how the MPI_Group_xx calls can be used to produce a useful group and then communicator; - thanks again but this is really the side issue to my main question about MPI_Bcast.

I seem to have duplicate concurrent processes interfering with each other. This would appear to be a breach of the MPI safety dictum, i.e. MPI_COMM_WORLD is supposed to only include the processes started by a single mpirun command and isolate these processes from other similar groups of processes safely. So, it would appear to be a bug. If so this has significant implications for environments such as mine, where it may often occur that the same program is run by different users simultaneously. It is really this issue that is concerning me; I can rewrite the code but if it can crash when 2 copies run at the same time, I have a much bigger problem.

My suspicion is that within the MPI_Bcast handshaking, a synchronising broadcast call may be colliding across the environments. My only evidence is that an otherwise working program waits on broadcast reception forever when two or more copies are run at [exactly] the same time. Has anyone else seen similar behavior in concurrently running programs that perform lots of broadcasts perhaps?

Randolph

--- On Sun, 8/8/10, David Zhang wrote: From: David Zhang Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users" Received: Sunday, 8 August, 2010, 12:34 PM

In particular, intercommunicators

On 8/7/10, Aurélien Bouteiller wrote: > You should consider reading about communicators in MPI.
> > Aurelien > -- > Aurelien Bouteiller, Ph.D. > Innovative Computing Laboratory, The University of Tennessee. > > Envoyé de mon iPad > > Le Aug 7, 2010 à 1:05, Randolph Pullen a > écrit : > >> I seem to be having a problem with MPI_Bcast. >> My massive I/O intensive data movement program must broadcast from n to n >> nodes. My problem starts because I require 2 processes per node, a sender >> and a receiver and I have implemented these using MPI processes rather >> than tackle the complexities of threads on MPI. >> >> Consequently, broadcast and calls like alltoall are not completely >> helpful. The dataset is huge and each node must end up with a complete >> copy built by the large number of contributing broadcasts from the sending >> nodes. Network efficiency and run time are paramount. >> >> As I don’t want to needlessly broadcast all this data to the sending nodes >> and I have a perfectly good MPI program that distributes globally from a >> single node (1 to N), I took the unusual decision to start N copies of >> this program by spawning the MPI system from the PVM system in an effort >> to get my N to N concurrent transfers. >> >> It seems that the broadcasts running on concurrent MPI environments >> collide and cause all but the first process to hang waiting for their >> broadcasts. This theory seems to be confirmed by introducing a sleep of >> n-1 seconds before the first MPI_Bcast call
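For readers following the thread, the connect/accept wiring Ralph refers to looks roughly like the sketch below. The server/client roles and the way the port name is handed from one job to the other are illustrative assumptions, not anything taken from Randolph's code; the point is only that without calls like these, two separately launched mpirun jobs cannot share a communicator.

/* Rough sketch of MPI-2 connect/accept between two separately launched
 * mpirun jobs. Without something like this, their MPI_COMM_WORLDs are
 * fully isolated from each other. Roles and port handling are illustrative. */
#include <mpi.h>
#include <cstdio>
#include <cstring>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char port[MPI_MAX_PORT_NAME] = {0};
    MPI_Comm inter;                       /* inter-communicator spanning both jobs */

    if (argc > 1 && strcmp(argv[1], "server") == 0) {
        if (rank == 0) {
            MPI_Open_port(MPI_INFO_NULL, port);
            printf("port: %s\n", port);   /* hand this string to the other job */
        }
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
    } else if (argc > 1) {
        /* client job: the port string printed by the server job is passed in */
        strncpy(port, argv[1], MPI_MAX_PORT_NAME - 1);
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
    } else {
        MPI_Finalize();
        return 1;
    }

    /* From here on, operations on 'inter' (or on a merged communicator from
     * MPI_Intercomm_merge) really do span both mpirun instances. */
    MPI_Comm_disconnect(&inter);
    MPI_Finalize();
    return 0;
}

Only after such a connect/accept (or an explicit MPI_Comm_spawn) do the two jobs have any communicator in common on which collectives could interact.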
Re: [OMPI users] Fortran code generation error on 1.5 rc5
Hi Damien, I'll check this. Thanks for reporting it. Shiqing On 2010-8-8 10:16 PM, Damien wrote: Hi all, There's a code generation bug in the CMake/Visual Studio build of rc 5 on VS 2008. A release build, with static libs, F77 and F90 support gives an error at line 91 in mpif-config.h: parameter (MPI_STATUS_SIZE=) This obviously makes the compiler unhappy. In older trunk builds this was parameter (MPI_STATUS_SIZE=5) Damien ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users -- -- Shiqing Fan http://www.hlrs.de/people/fan High Performance Computing Tel.: +49 711 685 87234 Center Stuttgart (HLRS) Fax.: +49 711 685 65832 Address: Allmandring 30 email: f...@hlrs.de 70569 Stuttgart
Re: [OMPI users] minor glitch in 1.5-rc5 Windows build - has workaround
Hi Damien, It is the user's responsibility to keep the CMake and Visual Studio build types consistent; a build type chosen in CMake cannot automatically follow a change made later inside Visual Studio - that is simply the nature of CMake-generated projects. Shiqing On 2010-8-6 10:33 PM, Damien wrote: Hi all, There's a small hiccup in building a Windows version of 1.5-rc5. When you configure in the CMake GUI, you can ask for a Debug or Release project before you hit Generate. If you ask for a Debug project, you can still change it to Release in Visual Studio, and it will build successfully. BUT: the Install project will fail, because it tries to install libopen-pald.pdb (possibly others too, I didn't check). It's a minor thing, only nuisance value. If you set a Release project in CMake, everything works fine. Damien ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users -- -- Shiqing Fan http://www.hlrs.de/people/fan High Performance Computing Tel.: +49 711 685 87234 Center Stuttgart (HLRS) Fax.: +49 711 685 65832 Address: Allmandring 30 email: f...@hlrs.de 70569 Stuttgart
Re: [OMPI users] Bug in POWERPC32.asm?
Thanks for reporting this Matthew. Fixed in r23576 ( https://svn.open-mpi.org/trac/ompi/changeset/23576) Regards --Nysal On Fri, Aug 6, 2010 at 10:38 PM, Matthew Clark wrote: > I was looking in my copy of openmpi-1.4.1 opal/asm/base/POWERPC32.asm > and saw the following: > > START_FUNC(opal_sys_timer_get_cycles) >LSYM(15) >mftbu r0 >mftb r11 >mftbu r2 >cmpw cr7,r2,r0 >bne+ cr7,REFLSYM(14) >li r4,0 >li r9,0 >or r3,r2,r9 >or r4,r4,r11 >blr > END_FUNC(opal_sys_timer_get_cycles) > > I'll readily admit at my lack of ppc assembly smartness, but shouldn't > the loop back at bne+ be to REFLSYM(15) instead of (14)? > > Matt > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
Re: [OMPI users] MPI_Bcast issue
On 8/8/2010 8:13 PM, Randolph Pullen wrote: > Thanks, although “An intercommunicator cannot be used for collective > communication.” i.e , bcast calls., yes it can. MPI-1 did not allow for collective operations on intercommunicators, but the MPI-2 specification did introduce that notion. Thanks Edgar > I can see how the MPI_Group_xx > calls can be used to produce a useful group and then communicator; - > thanks again but this is really the side issue to my main question > about MPI_Bcast. > > I seem to have duplicate concurrent processes interfering with each > other. This would appear to be a breach of the MPI safety dictum, ie > MPI_COMM_WORD is supposed to only include the processes started by a > single mpirun command and isolate these processes from other similar > groups of processes safely. > > So, it would appear to be a bug. If so this has significant > implications for environments such as mine, where it may often occur > that the same program is run by different users simultaneously. > > It is really this issue that it concerning me, I can rewrite the code > but if it can crash when 2 copies run at the same time, I have a much > bigger problem. > > My suspicion is that a within the MPI_Bcast handshaking, a > syncronising broadcast call may be colliding across the environments. > My only evidence is an otherwise working program waits on broadcast > reception forever when two or more copies are run at [exactly] the > same time. > > Has anyone else seen similar behavior in concurrently running > programs that perform lots of broadcasts perhaps? > > Randolph > > > --- On Sun, 8/8/10, David Zhang wrote: > > From: David Zhang Subject: Re: [OMPI users] > MPI_Bcast issue To: "Open MPI Users" Received: > Sunday, 8 August, 2010, 12:34 PM > > In particular, intercommunicators > > On 8/7/10, Aurélien Bouteiller wrote: >> You should consider reading about communicators in MPI. >> >> Aurelien -- Aurelien Bouteiller, Ph.D. Innovative Computing >> Laboratory, The University of Tennessee. >> >> Envoyé de mon iPad >> >> Le Aug 7, 2010 à 1:05, Randolph Pullen >> a écrit : >> >>> I seem to be having a problem with MPI_Bcast. My massive I/O >>> intensive data movement program must broadcast from n to n nodes. >>> My problem starts because I require 2 processes per node, a >>> sender and a receiver and I have implemented these using MPI >>> processes rather than tackle the complexities of threads on MPI. >>> >>> Consequently, broadcast and calls like alltoall are not >>> completely helpful. The dataset is huge and each node must end >>> up with a complete copy built by the large number of contributing >>> broadcasts from the sending nodes. Network efficiency and run >>> time are paramount. >>> >>> As I don’t want to needlessly broadcast all this data to the >>> sending nodes and I have a perfectly good MPI program that >>> distributes globally from a single node (1 to N), I took the >>> unusual decision to start N copies of this program by spawning >>> the MPI system from the PVM system in an effort to get my N to N >>> concurrent transfers. >>> >>> It seems that the broadcasts running on concurrent MPI >>> environments collide and cause all but the first process to hang >>> waiting for their broadcasts. This theory seems to be confirmed >>> by introducing a sleep of n-1 seconds before the first MPI_Bcast >>> call on each node, which results in the code working perfectly. 
>>> (total run time 55 seconds, 3 nodes, standard TCP stack) >>> >>> My guess is that unlike PVM, OpenMPI implements broadcasts with >>> broadcasts rather than multicasts. Can someone confirm this? Is >>> this a bug? >>> >>> Is there any multicast or N to N broadcast where sender processes >>> can avoid participating when they don’t need to? >>> >>> Thanks in advance Randolph >>> >>> >>> >>> ___ users mailing >>> list us...@open-mpi.org >>> http://www.open-mpi.org/mailman/listinfo.cgi/users >> > > > > ___ users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
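As a side note for readers, the MPI-2 inter-communicator broadcast Edgar mentions has slightly unusual root semantics. The sketch below is a generic illustration only; the communicator 'inter' and the helper arguments are assumed to come from elsewhere (e.g. MPI_Comm_spawn or MPI_Comm_connect), they are not part of Randolph's program.

/* Hedged sketch of MPI-2 inter-communicator broadcast semantics. */
#include <mpi.h>

void bcast_across_groups(MPI_Comm inter, int i_am_in_root_group, int my_rank_in_group)
{
    int payload = 42;

    if (i_am_in_root_group) {
        if (my_rank_in_group == 0)
            /* The actual root of the broadcast passes MPI_ROOT ...            */
            MPI_Bcast(&payload, 1, MPI_INT, MPI_ROOT, inter);
        else
            /* ... its peers in the same (local) group pass MPI_PROC_NULL ...  */
            MPI_Bcast(&payload, 1, MPI_INT, MPI_PROC_NULL, inter);
    } else {
        /* ... and every process in the remote group names the root's rank
         * within the root's own group (0 here) and receives the data.         */
        MPI_Bcast(&payload, 1, MPI_INT, 0, inter);
    }
}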
Re: [OMPI users] MPI_Bcast issue
I did not take the time to try to fully understand your approach so this may sound like a dumb question; Do you have an MPI_Bcast ROOT process in every MPI_COMM_WORLD and does every non-ROOT MPI_Bcast call correctly identify the rank of ROOT in its MPI_COMM_WORLD ? An MPI_Bcast call when there is not root task in the communicator or when the root task rank is given incorrectly will hang. Dick Treumann - MPI Team IBM Systems & Technology Group Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 Tele (845) 433-7846 Fax (845) 433-8363 From: Randolph Pullen To: us...@open-mpi.org Date: 08/07/2010 01:23 AM Subject: [OMPI users] MPI_Bcast issue Sent by: users-boun...@open-mpi.org I seem to be having a problem with MPI_Bcast. My massive I/O intensive data movement program must broadcast from n to n nodes. My problem starts because I require 2 processes per node, a sender and a receiver and I have implemented these using MPI processes rather than tackle the complexities of threads on MPI. Consequently, broadcast and calls like alltoall are not completely helpful. The dataset is huge and each node must end up with a complete copy built by the large number of contributing broadcasts from the sending nodes. Network efficiency and run time are paramount. As I don’t want to needlessly broadcast all this data to the sending nodes and I have a perfectly good MPI program that distributes globally from a single node (1 to N), I took the unusual decision to start N copies of this program by spawning the MPI system from the PVM system in an effort to get my N to N concurrent transfers. It seems that the broadcasts running on concurrent MPI environments collide and cause all but the first process to hang waiting for their broadcasts. This theory seems to be confirmed by introducing a sleep of n-1 seconds before the first MPI_Bcast call on each node, which results in the code working perfectly. (total run time 55 seconds, 3 nodes, standard TCP stack) My guess is that unlike PVM, OpenMPI implements broadcasts with broadcasts rather than multicasts. Can someone confirm this? Is this a bug? Is there any multicast or N to N broadcast where sender processes can avoid participating when they don’t need to? Thanks in advance Randolph ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users
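For completeness, the call pattern Dick is asking about looks like the following minimal sketch (not Randolph's actual code): every rank, root and non-root alike, must pass the same valid root rank to MPI_Bcast, otherwise the collective never completes.

/* Minimal sketch of a correct MPI_Bcast call pattern: every rank names the
 * same, existing root, and root and non-root ranks make the matching call. */
#include <mpi.h>
#include <cstdio>
#include <cstring>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int root = 0;               /* same value on every rank, 0 <= root < size */
    char hdr[64] = {0};
    if (rank == root)
        strcpy(hdr, "header from root");

    /* All ranks participate; non-root ranks receive into hdr. */
    MPI_Bcast(hdr, sizeof(hdr), MPI_CHAR, root, MPI_COMM_WORLD);

    printf("rank %d of %d got: %s\n", rank, size, hdr);
    MPI_Finalize();
    return 0;
}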
Re: [OMPI users] MPI_Bcast issue
Sorry - I missed the statement that all works when you add sleeps. That probably rules out any possible error in the way MPI_Bcast was used. Dick Treumann - MPI Team IBM Systems & Technology Group Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 Tele (845) 433-7846 Fax (845) 433-8363 From: Randolph Pullen To: us...@open-mpi.org Date: 08/07/2010 01:23 AM Subject: [OMPI users] MPI_Bcast issue Sent by: users-boun...@open-mpi.org I seem to be having a problem with MPI_Bcast. My massive I/O intensive data movement program must broadcast from n to n nodes. My problem starts because I require 2 processes per node, a sender and a receiver and I have implemented these using MPI processes rather than tackle the complexities of threads on MPI. Consequently, broadcast and calls like alltoall are not completely helpful. The dataset is huge and each node must end up with a complete copy built by the large number of contributing broadcasts from the sending nodes. Network efficiency and run time are paramount. As I don’t want to needlessly broadcast all this data to the sending nodes and I have a perfectly good MPI program that distributes globally from a single node (1 to N), I took the unusual decision to start N copies of this program by spawning the MPI system from the PVM system in an effort to get my N to N concurrent transfers. It seems that the broadcasts running on concurrent MPI environments collide and cause all but the first process to hang waiting for their broadcasts. This theory seems to be confirmed by introducing a sleep of n-1 seconds before the first MPI_Bcast call on each node, which results in the code working perfectly. (total run time 55 seconds, 3 nodes, standard TCP stack) My guess is that unlike PVM, OpenMPI implements broadcasts with broadcasts rather than multicasts. Can someone confirm this? Is this a bug? Is there any multicast or N to N broadcast where sender processes can avoid participating when they don’t need to? Thanks in advance Randolph ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] deadlock in openmpi 1.5rc5
In your first mail, you mentioned that you are testing the new knem support. Can you try disabling knem and see if that fixes the problem? (i.e., run with --mca btl_sm_use_knem 0") If it fixes the issue, that might mean we have a knem-based bug. On Aug 6, 2010, at 1:42 PM, John Hsu wrote: > Hi, > > sorry for the confusion, that was indeed the trunk version of things I was > running. > > Here's the same problem using > > http://www.open-mpi.org/software/ompi/v1.5/downloads/openmpi-1.5rc5.tar.bz2 > > command-line: > > ../openmpi_devel/bin/mpirun -hostfile hostfiles/hostfile.wgsgX -npernode 11 > ./bin/mpi_test > > back trace on sender: > > (gdb) bt > #0 0x7fa003bcacf3 in epoll_wait () from /lib/libc.so.6 > #1 0x7fa004f43a4b in epoll_dispatch () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #2 0x7fa004f4b5fa in opal_event_base_loop () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #3 0x7fa004f1ce69 in opal_progress () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #4 0x7f9ffe69be95 in mca_pml_ob1_recv () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so > #5 0x7fa004ebb35c in PMPI_Recv () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #6 0x0040ae48 in MPI::Comm::Recv (this=0x612800, buf=0x7fff8f5cbb50, > count=1, datatype=..., source=29, > tag=100, status=...) > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:36 > #7 0x00409a57 in main (argc=1, argv=0x7fff8f5cbd78) > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/mpi_test/src/mpi_test.cpp:30 > (gdb) > > back trace on receiver: > > (gdb) bt > #0 0x7fcce1ba5cf3 in epoll_wait () from /lib/libc.so.6 > #1 0x7fcce2f1ea4b in epoll_dispatch () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #2 0x7fcce2f265fa in opal_event_base_loop () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #3 0x7fcce2ef7e69 in opal_progress () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #4 0x7fccdc677b1d in mca_pml_ob1_send () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so > #5 0x7fcce2e9874f in PMPI_Send () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #6 0x0040adda in MPI::Comm::Send (this=0x612800, buf=0x7fff3f18ad20, > count=1, datatype=..., dest=0, tag=100) > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:29 > #7 0x00409b72 in main (argc=1, argv=0x7fff3f18af48) > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/mpi_test/src/mpi_test.cpp:38 > (gdb) > > and attached is my mpi_test file for reference. > > thanks, > John > > > On Fri, Aug 6, 2010 at 6:24 AM, Ralph Castain wrote: > You clearly have an issue with version confusion. The file cited in your > warning: > > > [wgsg0:29074] Warning -- mutex was double locked from errmgr_hnp.c:772 > > does not exist in 1.5rc5. 
It only exists in the developer's trunk at this > time. Check to ensure you have the right paths set, blow away the install > area (in case you have multiple versions installed on top of each other), etc. > > > > On Aug 5, 2010, at 5:16 PM, John Hsu wrote: > > > Hi All, > > I am new to openmpi and have encountered an issue using pre-release 1.5rc5, > > for a simple mpi code (see attached). In this test, nodes 1 to n sends out > > a random number to node 0, node 0 sums all numbers received. > > > > This code works fine on 1 machine with any number of nodes, and on 3 > > machines running 10 nodes per machine, but when we try to run 11 nodes per > > machine this warning appears: > > > > [wgsg0:29074] Warning -- mutex was double locked from errmgr_hnp.c:772 > > > > And node 0 (master summing node) hangs on receiving plus another random > > node hangs on sending indefinitely. Below are the back traces: > > > > (gdb) bt > > #0 0x7f0c5f109cd3 in epoll_wait () from /lib/libc.so.6 > > #1 0x7f0c6052db53 in epoll_dispatch (base=0x2310bf0, arg=0x22f91f0, > > tv=0x7fff90f623e0) at epoll.c:215 > > #2 0x7f0c6053ae58 in opal_event_base_loop (base=0x2310bf0, flags=2) at > > event.c:838 > > #3 0x7f0c6053ac2
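The attached mpi_test.cpp is not reproduced in the archive; a rough reconstruction of the test John describes (every non-zero rank sends one random number to rank 0, which receives and sums them, with the tag 100 seen in the backtraces) might look like the sketch below. It is an illustrative reconstruction, not the actual attachment.

/* Illustrative reconstruction of the described test, not the poster's code. */
#include <mpi.h>
#include <cstdio>
#include <cstdlib>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    const int tag = 100;                      /* tag seen in the backtraces */

    if (rank == 0) {
        long sum = 0;
        for (int src = 1; src < size; ++src) {
            int value = 0;
            MPI_Recv(&value, 1, MPI_INT, src, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            sum += value;
        }
        printf("sum = %ld\n", sum);
    } else {
        srand(rank);
        int value = rand() % 100;             /* one random number per sender */
        MPI_Send(&value, 1, MPI_INT, 0, tag, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}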
Re: [OMPI users] MPI_Bcast issue
Personally, I've been having trouble following the explanations of the problem. Perhaps it'd be helpful if you gave us an example of how to reproduce the problem. E.g., short sample code and how you run the example to produce the problem. The shorter the example, the greater the odds of resolution. From: Randolph Pullen To: us...@open-mpi.org Date: 08/07/2010 01:23 AM Subject: [OMPI users] MPI_Bcast issue Sent by: users-boun...@open-mpi.org I seem to be having a problem with MPI_Bcast. My massive I/O intensive data movement program must broadcast from n to n nodes. My problem starts because I require 2 processes per node, a sender and a receiver and I have implemented these using MPI processes rather than tackle the complexities of threads on MPI. Consequently, broadcast and calls like alltoall are not completely helpful. The dataset is huge and each node must end up with a complete copy built by the large number of contributing broadcasts from the sending nodes. Network efficiency and run time are paramount. As I don’t want to needlessly broadcast all this data to the sending nodes and I have a perfectly good MPI program that distributes globally from a single node (1 to N), I took the unusual decision to start N copies of this program by spawning the MPI system from the PVM system in an effort to get my N to N concurrent transfers. It seems that the broadcasts running on concurrent MPI environments collide and cause all but the first process to hang waiting for their broadcasts. This theory seems to be confirmed by introducing a sleep of n-1 seconds before the first MPI_Bcast call on each node, which results in the code working perfectly. (total run time 55 seconds, 3 nodes, standard TCP stack) My guess is that unlike PVM, OpenMPI implements broadcasts with broadcasts rather than multicasts. Can someone confirm this? Is this a bug? Is there any multicast or N to N broadcast where sender processes can avoid participating when they don’t need to?
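The kind of short, self-contained reproducer being asked for might look like the sketch below: a plain MPI_Bcast loop that two users would launch at the same time with separate mpirun commands, to see whether the second job ever hangs in the broadcast. The payload size and iteration count are arbitrary choices for illustration, not values taken from Randolph's program.

/* Illustrative reproducer skeleton: a tight MPI_Bcast loop. Launch two
 * copies concurrently under separate mpirun commands to test for interference. */
#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    std::vector<char> buf(1 << 20, 'x');      /* 1 MB payload, arbitrary */
    for (int iter = 0; iter < 1000; ++iter) {
        MPI_Bcast(&buf[0], (int)buf.size(), MPI_CHAR, 0, MPI_COMM_WORLD);
        if (rank == 0 && iter % 100 == 0)
            printf("iteration %d done\n", iter);
    }

    MPI_Finalize();
    return 0;
}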
Re: [OMPI users] openib issues
Hi, Could someone have a look on these two different error messages ? I'd like to know the reason(s) why they were displayed and their actual meaning. Thanks, Eloi On Monday 19 July 2010 16:38:57 Eloi Gaudry wrote: > Hi, > > I've been working on a random segmentation fault that seems to occur during > a collective communication when using the openib btl (see [OMPI users] > [openib] segfault when using openib btl). > > During my tests, I've come across different issues reported by > OpenMPI-1.4.2: > > 1/ > [[12770,1],0][btl_openib_component.c:3227:handle_wc] from bn0103 to: bn0122 > error polling LP CQ with status LOCAL LENGTH ERROR status number 1 for > wr_id 560618664 opcode 1 vendor error 105 qp_idx 3 > > 2/ > [[992,1],6][btl_openib_component.c:3227:handle_wc] from pbn04 to: pbn05 > error polling LP CQ with status REMOTE ACCESS ERROR status number 10 for > wr_id 162858496 opcode 1 vendor error 136 qp_idx > 0[[992,1],5][btl_openib_component.c:3227:handle_wc] from pbn05 to: pbn04 > error polling HP CQ with status WORK REQUEST FLUSHED ERROR status number 5 > for wr_id 485900928 opcode 0 vendor error 249 qp_idx 0 > > -- > The OpenFabrics stack has reported a network error event. Open MPI will > try to continue, but your job may end up failing. > > Local host:p'" > MPI process PID: 20743 > Error number: 3 (IBV_EVENT_QP_ACCESS_ERR) > > This error may indicate connectivity problems within the fabric; please > contact your system administrator. > -- > > I'd like to know what these two errors mean and where they come from. > > Thanks for your help, > Eloi -- Eloi Gaudry Free Field Technologies Company Website: http://www.fft.be Company Phone: +32 10 487 959
Re: [OMPI users] MPI_Bcast issue
No idea what is going on here. No MPI call is implemented as a multicast - it all flows over the MPI pt-2-pt system via one of the various algorithms. Best guess I can offer is that there is a race condition in your program that you are tripping when other procs that share the node change the timing. How did you configure OMPI when you built it? On Aug 8, 2010, at 11:02 PM, Randolph Pullen wrote: > The only MPI calls I am using are these (grep-ed from my code): > > MPI_Abort(MPI_COMM_WORLD, 1); > MPI_Barrier(MPI_COMM_WORLD); > MPI_Bcast(&bufarray[0].hdr, sizeof(BD_CHDR), MPI_CHAR, 0, MPI_COMM_WORLD); > MPI_Comm_rank(MPI_COMM_WORLD,&myid); > MPI_Comm_size(MPI_COMM_WORLD,&numprocs); > MPI_Finalize(); > MPI_Init(&argc, &argv); > MPI_Irecv( > MPI_Isend( > MPI_Recv(buff, BUFSIZE, MPI_CHAR, 0, TAG, MPI_COMM_WORLD, &stat); > MPI_Send(buff, BUFSIZE, MPI_CHAR, 0, TAG, MPI_COMM_WORLD); > MPI_Test(&request, &complete, &status); > MPI_Wait(&request, &status); > > The big wait happens on receipt of a bcast call that would otherwise work. > Its a bit mysterious really... > > I presume that bcast is implemented with multicast calls but does it use any > actual broadcast calls at all? > I know I'm scraping the edges here looking for something but I just cant get > my head around why it should fail where it has. > > --- On Mon, 9/8/10, Ralph Castain wrote: > > From: Ralph Castain > Subject: Re: [OMPI users] MPI_Bcast issue > To: "Open MPI Users" > Received: Monday, 9 August, 2010, 1:32 PM > > Hi Randolph > > Unless your code is doing a connect/accept between the copies, there is no > way they can cross-communicate. As you note, mpirun instances are completely > isolated from each other - no process in one instance can possibly receive > information from a process in another instance because it lacks all knowledge > of it -unless- they wireup into a greater communicator by performing > connect/accept calls between them. > > I suspect you are inadvertently doing just that - perhaps by doing > connect/accept in a tree-like manner, not realizing that the end result is > one giant communicator that now links together all the N servers. > > Otherwise, there is no possible way an MPI_Bcast in one mpirun can collide or > otherwise communicate with an MPI_Bcast between processes started by another > mpirun. > > > > On Aug 8, 2010, at 7:13 PM, Randolph Pullen wrote: > >> Thanks, although “An intercommunicator cannot be used for collective >> communication.” i.e , bcast calls., I can see how the MPI_Group_xx calls >> can be used to produce a useful group and then communicator; - thanks again >> but this is really the side issue to my main question about MPI_Bcast. >> >> I seem to have duplicate concurrent processes interfering with each other. >> This would appear to be a breach of the MPI safety dictum, ie MPI_COMM_WORD >> is supposed to only include the processes started by a single mpirun command >> and isolate these processes from other similar groups of processes safely. >> >> So, it would appear to be a bug. If so this has significant implications >> for environments such as mine, where it may often occur that the same >> program is run by different users simultaneously. >> >> It is really this issue that it concerning me, I can rewrite the code but if >> it can crash when 2 copies run at the same time, I have a much bigger >> problem. >> >> My suspicion is that a within the MPI_Bcast handshaking, a syncronising >> broadcast call may be colliding across the environments. 
My only evidence >> is an otherwise working program waits on broadcast reception forever when >> two or more copies are run at [exactly] the same time. >> >> Has anyone else seen similar behavior in concurrently running programs that >> perform lots of broadcasts perhaps? >> >> Randolph >> >> >> --- On Sun, 8/8/10, David Zhang wrote: >> >> From: David Zhang >> Subject: Re: [OMPI users] MPI_Bcast issue >> To: "Open MPI Users" >> Received: Sunday, 8 August, 2010, 12:34 PM >> >> In particular, intercommunicators >> >> On 8/7/10, Aurélien Bouteiller wrote: >> > You should consider reading about communicators in MPI. >> > >> > Aurelien >> > -- >> > Aurelien Bouteiller, Ph.D. >> > Innovative Computing Laboratory, The University of Tennessee. >> > >> > Envoyé de mon iPad >> > >> > Le Aug 7, 2010 à 1:05, Randolph Pullen a >> > écrit : >> > >> >> I seem to be having a problem with MPI_Bcast. >> >> My massive I/O intensive data movement program must broadcast from n to n >> >> nodes. My problem starts because I require 2 processes per node, a sender >> >> and a receiver and I have implemented these using MPI processes rather >> >> than tackle the complexities of threads on MPI. >> >> >> >> Consequently, broadcast and calls like alltoall are not completely >> >> helpful. The dataset is huge and each node must end up with a complete >> >> copy built by the large number of contributing broadcasts
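To make Ralph's point concrete, a broadcast can be written entirely in terms of point-to-point messages. The hand-rolled binomial tree below is a teaching sketch of that idea, not Open MPI's actual tuned collective code; the internal tag is an arbitrary choice.

/* A binomial-tree broadcast built only from MPI_Send/MPI_Recv, illustrating
 * that "broadcast" is really a pattern of point-to-point messages. */
#include <mpi.h>

void binomial_bcast(void* buf, int count, MPI_Datatype type, int root, MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    const int tag = 999;                       /* arbitrary internal tag */
    int rel = (rank - root + size) % size;     /* rank relative to the root */

    /* Receive once from the parent in the tree (the root receives nothing). */
    int mask = 1;
    while (mask < size) {
        if (rel & mask) {
            int src = rank - mask;
            if (src < 0) src += size;
            MPI_Recv(buf, count, type, src, tag, comm, MPI_STATUS_IGNORE);
            break;
        }
        mask <<= 1;
    }

    /* Then forward the data to children at decreasing distances. */
    mask >>= 1;
    while (mask > 0) {
        if (rel + mask < size) {
            int dst = rank + mask;
            if (dst >= size) dst -= size;
            MPI_Send(buf, count, type, dst, tag, comm);
        }
        mask >>= 1;
    }
}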
[OMPI users] MPI Template Datatype?
Hi, I have to send some vectors from node to node, and the vectors are built using a template. The datatypes used in the template will be long, int, double, and char. How can I send those vectors, since I wouldn't know what MPI datatype I have to specify in MPI_Send and MPI_Recv? Is there any way to do this? -- Alexandru Blidaru University of Waterloo - Electrical Engineering '15 University email: asbli...@uwaterloo.ca Twitter handle: @G_raph Blog: http://alexblidaru.wordpress.com/
Re: [OMPI users] MPI_Allreduce on local machine
On Jul 28, 2010, at 12:21 PM, Åke Sandgren wrote: > > Jeff: Is this correct? > > This is wrong, it should be 8 and alignement should be 8 even for intel. > And i also see exactly the same thing. Good catch! I just fixed this in https://svn.open-mpi.org/trac/ompi/changeset/23580 -- it looks like a copy-n-paste error in displaying the Fortran sizes/alignments in ompi_info. It probably happened when ompi_info was converted from C++ to C. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] MPI_Allreduce on local machine
On Jul 28, 2010, at 5:07 PM, Gus Correa wrote: > Still, the alignment under Intel may or may not be right. > And this may or may not explain the errors that Hugo has got. > > FYI, the ompi_info from my OpenMPI 1.3.2 and 1.2.8 > report exactly the same as OpenMPI 1.4.2, namely > Fort dbl prec size: 4 and > Fort dbl prec align: 4, > except that *if the Intel Fortran compiler (ifort) was used* > I get 1 byte alignment: > Fort dbl prec align: 1 > > So, this issue has been around for a while, > and involves both the size and the alignment (in Intel) > of double precision. Yes, it's quite problematic to try to determine the alignment of Fortran types -- compilers can do different things and there's no reliable way (that I know of, at least) to absolutely get the "native" alignment. That being said, we didn't previously find any correctness issues with using an alignment of 1. > We have a number of pieces of code here where grep shows > MPI_DOUBLE_PRECISION. > Not sure how much of it has actually been active, as there are always > lots of cpp directives to select active code. > > In particular I found this interesting snippet: > > if (MPI_DOUBLE_PRECISION==20 .and. MPI_REAL8==18) then > ! LAM MPI defined MPI_REAL8 differently from MPI_DOUBLE_PRECISION > ! and LAM MPI's allreduce does not accept on MPI_REAL8 > MPIreal_t= MPI_DOUBLE_PRECISION > else > MPIreal_t= MPI_REAL8 > endif This kind of thing shouldn't be an issue with Open MPI, right? FWIW, OMPI uses different numbers for MPI_DOUBLE_PRECISION and MPI_REAL8 than LAM. They're distinct MPI datatypes because they *could* be different. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
[OMPI users] Checkpointing mpi4py program
Hi I have integrated mpi4py with openmpi 1.4.2 that was built with BLCR 0.8.2. When I run ompi-checkpoint on the program written using mpi4py, I see that program doesn't resume sometimes after successful checkpoint creation. This doesn't occur always meaning the program resumes after successful checkpoint creation most of the time and completes successfully. Has anyone tested the checkpoint/restart functionality with mpi4py programs? Are there any best practices that I should keep in mind while checkpointing mpi4py programs? Thanks for your time - Ananda
Re: [OMPI users] MPI Template Datatype?
Hello Alexandru, On Mon, Aug 9, 2010 at 6:05 PM, Alexandru Blidaru wrote: > I have to send some vectors from node to node, and the vecotrs are built > using a template. The datatypes used in the template will be long, int, > double, and char. How may I send those vectors since I wouldn't know what > MPI datatype i have to specify in MPI_Send and MPI Recv. Is there any way to > do this? > I'm not sure I understand what your question is about: are you asking what MPI datatypes you should use to send C types "long", "int", etc., or are you trying to send a more complex C type ("vector")? Can you send some code demonstrating the problem you are trying to solve? Besides, your wording suggests that you are trying to send a C++ std::vector over MPI: have you already had a look at Boost.MPI? It has out-of-the-box support for STL containers. Cheers, Riccardo
Re: [OMPI users] MPI Template Datatype?
Hello Riccardo, I basically have to implement a 4D vector. An additional goal of my project is to support char, int, float and double datatypes in the vector. I figured that the only way to do this is through a template. Up to this point I was only supporting doubles in my vector, and I was sending each element individually from node to node. Since MPI_Send and MPI_Recv require the programmer to specify which datatype to use, and since I would only use doubles in the initial version of my project, using MPI_Send and MPI_Recv was easy. However if I am to declare my 4D vector like this std::vector , there will be no way for me to know which datatype to specify in the MPI_Send and MPI_Recv commands. No I haven't looked at Boost.MPI . I did a quick Ctrl-F of Boost.MPI in the MPI 2.2 doc that i found here: http://www.mpi-forum.org/docs/docs.html , but i was unable to find it. Could you point me to some resources about it? It would be a lot easier to use that rather than send every element 1 by 1. Thank you very much for your help. Alex On Mon, Aug 9, 2010 at 4:09 PM, Riccardo Murri wrote: > Hello Alexandru, > > On Mon, Aug 9, 2010 at 6:05 PM, Alexandru Blidaru > wrote: > > I have to send some vectors from node to node, and the vecotrs are built > > using a template. The datatypes used in the template will be long, int, > > double, and char. How may I send those vectors since I wouldn't know what > > MPI datatype i have to specify in MPI_Send and MPI Recv. Is there any way > to > > do this? > > > > I'm not sure I understand what your question is about: are you asking > what MPI datatypes you should use to send C types "long", "int", etc., > or are you trying to send a more complex C type ("vector")? > Can you send some code demonstrating the problem you are trying to solve? > > Besides, your wording suggests that you are trying to send a C++ > std::vector over MPI: have you already had a look at Boost.MPI? It > has out-of-the-box support for STL containers. > > Cheers, > Riccardo > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users > -- Alexandru Blidaru University of Waterloo - Electrical Engineering '15 University email: asbli...@uwaterloo.ca Twitter handle: @G_raph Blog: http://alexblidaru.wordpress.com/
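One common C++ answer to "the call site does not know which MPI datatype matches T" is a small traits template specialised for each supported element type. The sketch below is illustrative only: the name mpi_type_of and the assumption that both sides agree on the vector length are inventions for this example, not part of MPI or of Alexandru's project.

/* Traits sketch mapping a C++ element type to the matching MPI datatype. */
#include <mpi.h>
#include <vector>

template <typename T> struct mpi_type_of;   /* no generic definition on purpose */
template <> struct mpi_type_of<char>   { static MPI_Datatype get() { return MPI_CHAR;   } };
template <> struct mpi_type_of<int>    { static MPI_Datatype get() { return MPI_INT;    } };
template <> struct mpi_type_of<long>   { static MPI_Datatype get() { return MPI_LONG;   } };
template <> struct mpi_type_of<float>  { static MPI_Datatype get() { return MPI_FLOAT;  } };
template <> struct mpi_type_of<double> { static MPI_Datatype get() { return MPI_DOUBLE; } };

/* Send/receive a whole std::vector<T> without naming the datatype at the call site. */
template <typename T>
void send_vector(const std::vector<T>& v, int dest, int tag, MPI_Comm comm)
{
    MPI_Send(const_cast<T*>(&v[0]), (int)v.size(), mpi_type_of<T>::get(), dest, tag, comm);
}

template <typename T>
void recv_vector(std::vector<T>& v, int src, int tag, MPI_Comm comm)
{
    /* Assumes both sides agree on the length (4 for the poster's 4D vector). */
    MPI_Recv(&v[0], (int)v.size(), mpi_type_of<T>::get(), src, tag, comm, MPI_STATUS_IGNORE);
}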
Re: [OMPI users] Checkpointing mpi4py program
I have not tried to checkpoint an mpi4py application, so I cannot say for sure if it works or not. You might be hitting something with the Python runtime interacting in an odd way with either Open MPI or BLCR. Can you attach a debugger and get a backtrace on a stuck checkpoint? That might show us where things are held up. -- Josh On Aug 9, 2010, at 4:04 PM, wrote: > Hi > > I have integrated mpi4py with openmpi 1.4.2 that was built with BLCR 0.8.2. > When I run ompi-checkpoint on the program written using mpi4py, I see that > program doesn’t resume sometimes after successful checkpoint creation. This > doesn’t occur always meaning the program resumes after successful checkpoint > creation most of the time and completes successfully. Has anyone tested the > checkpoint/restart functionality with mpi4py programs? Are there any best > practices that I should keep in mind while checkpointing mpi4py programs? > > Thanks for your time > - Ananda > Please do not print this email unless it is absolutely necessary. > > The information contained in this electronic message and any attachments to > this message are intended for the exclusive use of the addressee(s) and may > contain proprietary, confidential or privileged information. If you are not > the intended recipient, you should not disseminate, distribute or copy this > e-mail. Please notify the sender immediately and destroy all copies of this > message and any attachments. > > WARNING: Computer viruses can be transmitted via email. The recipient should > check this email and any attachments for the presence of viruses. The company > accepts no liability for any damage caused by any virus transmitted by this > email. > > www.wipro.com > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] deadlock in openmpi 1.5rc5
problem "fixed" by adding the --mca btl_sm_use_knem 0 option (with -npernode 11), so I proceeded to bump up -npernode to 12: $ ../openmpi_devel/bin/mpirun -hostfile hostfiles/hostfile.wgsgX -npernode 12 --mca btl_sm_use_knem 0 ./bin/mpi_test and the same error occurs, (gdb) bt #0 0x7fcca6ae5cf3 in epoll_wait () from /lib/libc.so.6 #1 0x7fcca7e5ea4b in epoll_dispatch () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 #2 0x7fcca7e665fa in opal_event_base_loop () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 #3 0x7fcca7e37e69 in opal_progress () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 #4 0x7fcca15b6e95 in mca_pml_ob1_recv () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so #5 0x7fcca7dd635c in PMPI_Recv () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 #6 0x0040ae48 in MPI::Comm::Recv (this=0x612800, buf=0x7fff2a0d7e00, count=1, datatype=..., source=23, tag=100, status=...) at /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:36 #7 0x00409a57 in main (argc=1, argv=0x7fff2a0d8028) at /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/mpi_test/src/mpi_test.cpp:30 (gdb) (gdb) bt #0 0x7f5dc31d2cf3 in epoll_wait () from /lib/libc.so.6 #1 0x7f5dc454ba4b in epoll_dispatch () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 #2 0x7f5dc45535fa in opal_event_base_loop () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 #3 0x7f5dc4524e69 in opal_progress () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 #4 0x7f5dbdca4b1d in mca_pml_ob1_send () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so #5 0x7f5dc44c574f in PMPI_Send () from /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 #6 0x0040adda in MPI::Comm::Send (this=0x612800, buf=0x7fff6e0c0790, count=1, datatype=..., dest=0, tag=100) at /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:29 #7 0x00409b72 in main (argc=1, argv=0x7fff6e0c09b8) at /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/mpi_test/src/mpi_test.cpp:38 (gdb) On Mon, Aug 9, 2010 at 6:39 AM, Jeff Squyres wrote: > In your first mail, you mentioned that you are testing the new knem > support. > > Can you try disabling knem and see if that fixes the problem? (i.e., run > with --mca btl_sm_use_knem 0") If it fixes the issue, that might mean we > have a knem-based bug. > > > > On Aug 6, 2010, at 1:42 PM, John Hsu wrote: > > > Hi, > > > > sorry for the confusion, that was indeed the trunk version of things I > was running. 
> > > > Here's the same problem using > > > > > http://www.open-mpi.org/software/ompi/v1.5/downloads/openmpi-1.5rc5.tar.bz2 > > > > command-line: > > > > ../openmpi_devel/bin/mpirun -hostfile hostfiles/hostfile.wgsgX -npernode > 11 ./bin/mpi_test > > > > back trace on sender: > > > > (gdb) bt > > #0 0x7fa003bcacf3 in epoll_wait () from /lib/libc.so.6 > > #1 0x7fa004f43a4b in epoll_dispatch () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #2 0x7fa004f4b5fa in opal_event_base_loop () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #3 0x7fa004f1ce69 in opal_progress () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #4 0x7f9ffe69be95 in mca_pml_ob1_recv () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so > > #5 0x7fa004ebb35c in PMPI_Recv () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #6 0x0040ae48 in MPI::Comm::Recv (this=0x612800, > buf=0x7fff8f5cbb50, count=1, datatype=..., source=29, > > tag=100, status=...) > > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:36 > > #7 0x00409a57 in main (argc=1, argv=0x7fff8f5cbd78) > > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/s
Re: [OMPI users] MPI Template Datatype?
Hi Alexandru, you can read all about Boost.MPI at: http://www.boost.org/doc/libs/1_43_0/doc/html/mpi.html

On Mon, Aug 9, 2010 at 10:27 PM, Alexandru Blidaru wrote: > I basically have to implement a 4D vector. An additional goal of my project > is to support char, int, float and double datatypes in the vector.

If your "vector" is fixed-size (i.e., all vectors are comprised of 4 elements), then you can likely dispose of std::vector and use C-style arrays with templated send/receive calls (that would be just interfaces to MPI_Send/MPI_Recv):

// BEWARE: untested code!!!
template <typename T>
int send(T* vector, int dest, int tag, MPI_Comm comm) {
  throw std::logic_error("called generic MyVector::send");
}

template <typename T>
int recv(T* vector, int source, int tag, MPI_Comm comm) {
  throw std::logic_error("called generic MyVector::recv");
}

and then you specialize the template for the types you actually use:

template <>
int send(int* vector, int dest, int tag, MPI_Comm comm) {
  return MPI_Send(vector, 4, MPI_INT, dest, tag, comm);
}

template <>
int recv(int* vector, int src, int tag, MPI_Comm comm) {
  MPI_Status status;
  return MPI_Recv(vector, 4, MPI_INT, src, tag, comm, &status);
}

// etc.

However, let me warn you that it would likely take more time and effort to write all the template specializations and get them working than just use Boost.MPI. Best regards, Riccardo
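Since the thread keeps pointing at Boost.MPI, here is a minimal sketch of what sending a std::vector looks like with it, based on the documented boost::mpi::communicator send/recv interface; treat it as a sketch rather than a drop-in solution for the poster's program.

/* Minimal Boost.MPI sketch: std::vector<double> is serialised automatically,
 * so no MPI datatype needs to be named at the call site. */
#include <boost/mpi.hpp>
#include <boost/serialization/vector.hpp>
#include <vector>
#include <iostream>

namespace mpi = boost::mpi;

int main(int argc, char* argv[])
{
    mpi::environment env(argc, argv);
    mpi::communicator world;

    if (world.rank() == 0) {
        std::vector<double> v(4, 3.14);     /* the "4D vector" */
        world.send(1, 0, v);                /* (dest, tag, value) */
    } else if (world.rank() == 1) {
        std::vector<double> v;
        world.recv(0, 0, v);                /* length and contents arrive together */
        std::cout << "received " << v.size() << " elements\n";
    }
    return 0;
}

The program needs to be linked against the boost_mpi and boost_serialization libraries in addition to the MPI library itself.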
Re: [OMPI users] deadlock in openmpi 1.5rc5
I've opened a ticket about this -- if it's an actual problem, it's a 1.5 blocker: https://svn.open-mpi.org/trac/ompi/ticket/2530 What version of knem and Linux are you using? On Aug 9, 2010, at 4:50 PM, John Hsu wrote: > problem "fixed" by adding the --mca btl_sm_use_knem 0 option (with -npernode > 11), so I proceeded to bump up -npernode to 12: > > $ ../openmpi_devel/bin/mpirun -hostfile hostfiles/hostfile.wgsgX -npernode 12 > --mca btl_sm_use_knem 0 ./bin/mpi_test > > and the same error occurs, > > (gdb) bt > #0 0x7fcca6ae5cf3 in epoll_wait () from /lib/libc.so.6 > #1 0x7fcca7e5ea4b in epoll_dispatch () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #2 0x7fcca7e665fa in opal_event_base_loop () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #3 0x7fcca7e37e69 in opal_progress () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #4 0x7fcca15b6e95 in mca_pml_ob1_recv () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so > #5 0x7fcca7dd635c in PMPI_Recv () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #6 0x0040ae48 in MPI::Comm::Recv (this=0x612800, buf=0x7fff2a0d7e00, > count=1, datatype=..., source=23, tag=100, status=...) > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:36 > #7 0x00409a57 in main (argc=1, argv=0x7fff2a0d8028) > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/mpi_test/src/mpi_test.cpp:30 > (gdb) > > > (gdb) bt > #0 0x7f5dc31d2cf3 in epoll_wait () from /lib/libc.so.6 > #1 0x7f5dc454ba4b in epoll_dispatch () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #2 0x7f5dc45535fa in opal_event_base_loop () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #3 0x7f5dc4524e69 in opal_progress () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #4 0x7f5dbdca4b1d in mca_pml_ob1_send () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so > #5 0x7f5dc44c574f in PMPI_Send () >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > #6 0x0040adda in MPI::Comm::Send (this=0x612800, buf=0x7fff6e0c0790, > count=1, datatype=..., dest=0, tag=100) > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:29 > #7 0x00409b72 in main (argc=1, argv=0x7fff6e0c09b8) > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/mpi_test/src/mpi_test.cpp:38 > (gdb) > > > > > On Mon, Aug 9, 2010 at 6:39 AM, Jeff Squyres wrote: > In your first mail, you mentioned that you are testing the new knem support. > > Can you try disabling knem and see if that fixes the problem? (i.e., run > with --mca btl_sm_use_knem 0") If it fixes the issue, that might mean we > have a knem-based bug. > > > > On Aug 6, 2010, at 1:42 PM, John Hsu wrote: > > > Hi, > > > > sorry for the confusion, that was indeed the trunk version of things I was > > running. 
> > > > Here's the same problem using > > > > http://www.open-mpi.org/software/ompi/v1.5/downloads/openmpi-1.5rc5.tar.bz2 > > > > command-line: > > > > ../openmpi_devel/bin/mpirun -hostfile hostfiles/hostfile.wgsgX -npernode 11 > > ./bin/mpi_test > > > > back trace on sender: > > > > (gdb) bt > > #0 0x7fa003bcacf3 in epoll_wait () from /lib/libc.so.6 > > #1 0x7fa004f43a4b in epoll_dispatch () > >from > > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #2 0x7fa004f4b5fa in opal_event_base_loop () > >from > > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #3 0x7fa004f1ce69 in opal_progress () > >from > > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #4 0x7f9ffe69be95 in mca_pml_ob1_recv () > >from > > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so > > #5 0x7fa004ebb35c in PMPI_Recv () > >from > > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #6 0x0040ae48 in M
Re: [OMPI users] deadlock in openmpi 1.5rc5
I've replied in the ticket. https://svn.open-mpi.org/trac/ompi/ticket/2530#comment:2 thanks! John On Mon, Aug 9, 2010 at 2:42 PM, Jeff Squyres wrote: > I've opened a ticket about this -- if it's an actual problem, it's a 1.5 > blocker: > >https://svn.open-mpi.org/trac/ompi/ticket/2530 > > What version of knem and Linux are you using? > > > > On Aug 9, 2010, at 4:50 PM, John Hsu wrote: > > > problem "fixed" by adding the --mca btl_sm_use_knem 0 option (with > -npernode 11), so I proceeded to bump up -npernode to 12: > > > > $ ../openmpi_devel/bin/mpirun -hostfile hostfiles/hostfile.wgsgX > -npernode 12 --mca btl_sm_use_knem 0 ./bin/mpi_test > > > > and the same error occurs, > > > > (gdb) bt > > #0 0x7fcca6ae5cf3 in epoll_wait () from /lib/libc.so.6 > > #1 0x7fcca7e5ea4b in epoll_dispatch () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #2 0x7fcca7e665fa in opal_event_base_loop () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #3 0x7fcca7e37e69 in opal_progress () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #4 0x7fcca15b6e95 in mca_pml_ob1_recv () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so > > #5 0x7fcca7dd635c in PMPI_Recv () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #6 0x0040ae48 in MPI::Comm::Recv (this=0x612800, > buf=0x7fff2a0d7e00, > > count=1, datatype=..., source=23, tag=100, status=...) > > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:36 > > #7 0x00409a57 in main (argc=1, argv=0x7fff2a0d8028) > > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/mpi_test/src/mpi_test.cpp:30 > > (gdb) > > > > > > (gdb) bt > > #0 0x7f5dc31d2cf3 in epoll_wait () from /lib/libc.so.6 > > #1 0x7f5dc454ba4b in epoll_dispatch () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #2 0x7f5dc45535fa in opal_event_base_loop () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #3 0x7f5dc4524e69 in opal_progress () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #4 0x7f5dbdca4b1d in mca_pml_ob1_send () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/openmpi/mca_pml_ob1.so > > #5 0x7f5dc44c574f in PMPI_Send () > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > #6 0x0040adda in MPI::Comm::Send (this=0x612800, > buf=0x7fff6e0c0790, > > count=1, datatype=..., dest=0, tag=100) > > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/include/openmpi/ompi/mpi/cxx/comm_inln.h:29 > > #7 0x00409b72 in main (argc=1, argv=0x7fff6e0c09b8) > > at > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/mpi_test/src/mpi_test.cpp:38 > > (gdb) > > > > > > > > > > On Mon, Aug 9, 2010 at 6:39 AM, Jeff Squyres wrote: > > In your first mail, you mentioned that you are testing the new knem > support. > > > > Can you try disabling knem and see if that fixes the problem? 
(i.e., run > with --mca btl_sm_use_knem 0") If it fixes the issue, that might mean we > have a knem-based bug. > > > > > > > > On Aug 6, 2010, at 1:42 PM, John Hsu wrote: > > > > > Hi, > > > > > > sorry for the confusion, that was indeed the trunk version of things I > was running. > > > > > > Here's the same problem using > > > > > > > http://www.open-mpi.org/software/ompi/v1.5/downloads/openmpi-1.5rc5.tar.bz2 > > > > > > command-line: > > > > > > ../openmpi_devel/bin/mpirun -hostfile hostfiles/hostfile.wgsgX > -npernode 11 ./bin/mpi_test > > > > > > back trace on sender: > > > > > > (gdb) bt > > > #0 0x7fa003bcacf3 in epoll_wait () from /lib/libc.so.6 > > > #1 0x7fa004f43a4b in epoll_dispatch () > > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > > #2 0x7fa004f4b5fa in opal_event_base_loop () > > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > > #3 0x7fa004f1ce69 in opal_progress () > > >from > /wg/stor5/wgsim/hsu/projects/cturtle_mpi/wg-ros-pkg-unreleased/stacks/mpi/openmpi_devel/lib/libmpi.so.0 > > > #4 0x7f9ffe69be95 in mca_pml_ob1_recv () > > >from > /wg/stor5/wgsim/hsu/project
Re: [OMPI users] MPI_Bcast issue
The install was completly vanilla - no extras a plain .configure command line (on FC10 x8x_64 linux) Are you saying that all broadcast calls are actually implemented as serial point to point calls? --- On Tue, 10/8/10, Ralph Castain wrote: From: Ralph Castain Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users" Received: Tuesday, 10 August, 2010, 12:33 AM No idea what is going on here. No MPI call is implemented as a multicast - it all flows over the MPI pt-2-pt system via one of the various algorithms. Best guess I can offer is that there is a race condition in your program that you are tripping when other procs that share the node change the timing. How did you configure OMPI when you built it? On Aug 8, 2010, at 11:02 PM, Randolph Pullen wrote: The only MPI calls I am using are these (grep-ed from my code): MPI_Abort(MPI_COMM_WORLD, 1); MPI_Barrier(MPI_COMM_WORLD); MPI_Bcast(&bufarray[0].hdr, sizeof(BD_CHDR), MPI_CHAR, 0, MPI_COMM_WORLD); MPI_Comm_rank(MPI_COMM_WORLD,&myid); MPI_Comm_size(MPI_COMM_WORLD,&numprocs); MPI_Finalize(); MPI_Init(&argc, &argv); MPI_Irecv( MPI_Isend( MPI_Recv(buff, BUFSIZE, MPI_CHAR, 0, TAG, MPI_COMM_WORLD, &stat); MPI_Send(buff, BUFSIZE, MPI_CHAR, 0, TAG, MPI_COMM_WORLD); MPI_Test(&request, &complete, &status); MPI_Wait(&request, &status); The big wait happens on receipt of a bcast call that would otherwise work. Its a bit mysterious really... I presume that bcast is implemented with multicast calls but does it use any actual broadcast calls at all? I know I'm scraping the edges here looking for something but I just cant get my head around why it should fail where it has. --- On Mon, 9/8/10, Ralph Castain wrote: From: Ralph Castain Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users" Received: Monday, 9 August, 2010, 1:32 PM Hi Randolph Unless your code is doing a connect/accept between the copies, there is no way they can cross-communicate. As you note, mpirun instances are completely isolated from each other - no process in one instance can possibly receive information from a process in another instance because it lacks all knowledge of it -unless- they wireup into a greater communicator by performing connect/accept calls between them. I suspect you are inadvertently doing just that - perhaps by doing connect/accept in a tree-like manner, not realizing that the end result is one giant communicator that now links together all the N servers. Otherwise, there is no possible way an MPI_Bcast in one mpirun can collide or otherwise communicate with an MPI_Bcast between processes started by another mpirun. On Aug 8, 2010, at 7:13 PM, Randolph Pullen wrote: Thanks, although “An intercommunicator cannot be used for collective communication.” i.e , bcast calls., I can see how the MPI_Group_xx calls can be used to produce a useful group and then communicator; - thanks again but this is really the side issue to my main question about MPI_Bcast. I seem to have duplicate concurrent processes interfering with each other. This would appear to be a breach of the MPI safety dictum, ie MPI_COMM_WORD is supposed to only include the processes started by a single mpirun command and isolate these processes from other similar groups of processes safely. So, it would appear to be a bug. If so this has significant implications for environments such as mine, where it may often occur that the same program is run by different users simultaneously. 
It is really this issue that it concerning me, I can rewrite the code but if it can crash when 2 copies run at the same time, I have a much bigger problem. My suspicion is that a within the MPI_Bcast handshaking, a syncronising broadcast call may be colliding across the environments. My only evidence is an otherwise working program waits on broadcast reception forever when two or more copies are run at [exactly] the same time. Has anyone else seen similar behavior in concurrently running programs that perform lots of broadcasts perhaps? Randolph --- On Sun, 8/8/10, David Zhang wrote: From: David Zhang Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users" Received: Sunday, 8 August, 2010, 12:34 PM In particular, intercommunicators On 8/7/10, Aurélien Bouteiller wrote: > You should consider reading about communicators in MPI. > > Aurelien > -- > Aurelien Bouteiller, Ph.D. > Innovative Computing Laboratory, The University of Tennessee. > > Envoyé de mon iPad > > Le Aug 7, 2010 à 1:05, Randolph Pullen a > écrit : > >> I seem to be having a problem with MPI_Bcast. >> My massive I/O intensive data movement program must broadcast from n to n >> nodes. My problem starts because I require 2 processes per node, a sender >> and a receiver and I have implemented these using MPI processes rather >> than tackle the complexities of threads on MPI. >> >> Consequently, broadcast and calls like alltoall are not completely >> helpful. The dataset is huge and each node