[OMPI users] very long linking time with mixed-language libraries
Hello, I am using mpic++ to create a program that combines C++ and F90 libraries. The libraries are created with mpic++ and mpif90. OpenMPI-1.2 was built using gcc-4.1.1 (the output of ompi_info follows below). The final linking stage takes quite a long time compared to the creation of the libraries; I am wondering why, and whether there is a way to speed it up. Thanks for any inputs.

-- Valmor

->./ompi_info
Open MPI: 1.2
Open MPI SVN revision: r14027
Open RTE: 1.2
Open RTE SVN revision: r14027
OPAL: 1.2
OPAL SVN revision: r14027
Prefix: /usr/local/openmpi-1.2
Configured architecture: i686-pc-linux-gnu
Configured by: root
Configured on: Sun Mar 18 23:47:21 EDT 2007
Configure host: xeon0
Built by: root
Built on: Sun Mar 18 23:57:41 EDT 2007
Built host: xeon0
C bindings: yes
C++ bindings: yes
Fortran77 bindings: yes (all)
Fortran90 bindings: yes
Fortran90 bindings size: medium
C compiler: cc
C compiler absolute: /usr/bin/cc
C++ compiler: g++
C++ compiler absolute: /usr/bin/g++
Fortran77 compiler: gfortran
Fortran77 compiler abs: /usr/i686-pc-linux-gnu/gcc-bin/4.1.1/gfortran
Fortran90 compiler: gfortran
Fortran90 compiler abs: /usr/i686-pc-linux-gnu/gcc-bin/4.1.1/gfortran
C profiling: yes
C++ profiling: yes
Fortran77 profiling: yes
Fortran90 profiling: yes
C++ exceptions: no
Thread support: posix (mpi: no, progress: no)
Internal debug support: no
MPI parameter check: always
Memory profiling support: no
Memory debugging support: no
libltdl support: yes
Heterogeneous support: yes
mpirun default --prefix: no
MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.2)
MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2)
MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2)
MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2)
MCA timer: linux (MCA v1.0, API v1.0, Component v1.2)
MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.2)
MCA coll: self (MCA v1.0, API v1.0, Component v1.2)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.2)
MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2)
MCA io: romio (MCA v1.0, API v1.0, Component v1.2)
MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2)
MCA pml: cm (MCA v1.0, API v1.0, Component v1.2)
MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2)
MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2)
MCA rcache: rb (MCA v1.0, API v1.0, Component v1.2)
MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2)
MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2)
MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2)
MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
MCA topo: unity (MCA v1.0, API v1.0, Component v1.2)
MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2)
MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2)
MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2)
MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2)
[OMPI users] Memory leak in openmpi-1.2?
Hello,

We are testing openmpi version 1.2 on Debian etch with openib. Some of our users run scalapack/blacs jobs that run for a long time and use a lot of MPI_Comm functions. We have made a small C example that tests whether MPI can handle this situation (see attached file). When we run this program with argument 100 we see that it consumes more and more memory, e.g.: 100

Regards

-- * * * Bas van der Vlies e-mail: b...@sara.nl * * SARA - Academic Computing Services phone: +31 20 592 8012 * * Kruislaan 415 fax: +31 20 6683167 * * 1098 SJ Amsterdam * * *
Re: [OMPI users] Memory leak in openmpi-1.2?
Bas van der Vlies wrote:
Hello, We are testing openmpi version 1.2 on Debian etch with openib. Some of our users run scalapack/blacs jobs that run for a long time and use a lot of MPI_Comm functions. We have made a small C example that tests whether MPI can handle this situation (see attached file). When we run this program with argument 100 we see that it consumes more and more memory, e.g.: 100 Regards

Forgot to attach the file :-(

-- * * * Bas van der Vlies e-mail: b...@sara.nl * * SARA - Academic Computing Services phone: +31 20 592 8012 * * Kruislaan 415 fax: +31 20 6683167 * * 1098 SJ Amsterdam * * *

#include <stdio.h>   /* header names were stripped by the archive; */
#include <stdlib.h>  /* stdio.h, stdlib.h and mpi.h are assumed here */
#include <mpi.h>

int main(int argc, char *argv[])
{
    MPI_Comm comm;
    int size, rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int n;
    if (argc > 1)
        n = atoi(argv[1]);
    else
        n = 1;

    if (rank == 0) {
        printf("Running with %d processes\n", size);
        printf("will do %d dups and frees\n", n);
    }

    int i;
    for (i = 0; i < n; i++) {
        /* Loop body was truncated in the archive; reconstructed from the
           description above: repeatedly duplicate and free a communicator. */
        MPI_Comm_dup(MPI_COMM_WORLD, &comm);
        MPI_Comm_free(&comm);
    }

    MPI_Finalize();
    return 0;
}
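For reference, a sketch of how a test like the attached one might be compiled and run (the file and executable names are invented here, and the count is only an example of a large iteration count, not the value from the original mail):

mpicc comm_dup_test.c -o comm_dup_test
mpirun -np 2 ./comm_dup_test 100000

Watching the resident size of the MPI processes (with top, ps, or similar) while this runs is what shows the growth being reported.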
Re: [OMPI users] very long linking time with mixed-language libraries
I notice that you are using the "medium" sized F90 bindings. Do these FAQ entries help? http://www.open-mpi.org/faq/?category=mpi-apps#f90-mpi-slow-compiles http://www.open-mpi.org/faq/?category=building#f90-bindings-slow-compile On Mar 27, 2007, at 2:21 AM, de Almeida, Valmor F. wrote: Hello, I am using mpic++ to create a program that combines c++ and f90 libraries. The libraries are created with mpic++ and mpif90. OpenMPI-1.2 was built using gcc-4.1.1. (below follows the output of ompi_info. The final linking stage takes quite a long time compared to the creation of the libraries; I am wondering why and whether there is a way to speed up. Thanks for any inputs. -- Valmor ->./ompi_info Open MPI: 1.2 Open MPI SVN revision: r14027 Open RTE: 1.2 Open RTE SVN revision: r14027 OPAL: 1.2 OPAL SVN revision: r14027 Prefix: /usr/local/openmpi-1.2 Configured architecture: i686-pc-linux-gnu Configured by: root Configured on: Sun Mar 18 23:47:21 EDT 2007 Configure host: xeon0 Built by: root Built on: Sun Mar 18 23:57:41 EDT 2007 Built host: xeon0 C bindings: yes C++ bindings: yes Fortran77 bindings: yes (all) Fortran90 bindings: yes Fortran90 bindings size: medium C compiler: cc C compiler absolute: /usr/bin/cc C++ compiler: g++ C++ compiler absolute: /usr/bin/g++ Fortran77 compiler: gfortran Fortran77 compiler abs: /usr/i686-pc-linux-gnu/gcc-bin/4.1.1/ gfortran Fortran90 compiler: gfortran Fortran90 compiler abs: /usr/i686-pc-linux-gnu/gcc-bin/4.1.1/ gfortran C profiling: yes C++ profiling: yes Fortran77 profiling: yes Fortran90 profiling: yes C++ exceptions: no Thread support: posix (mpi: no, progress: no) Internal debug support: no MPI parameter check: always Memory profiling support: no Memory debugging support: no libltdl support: yes Heterogeneous support: yes mpirun default --prefix: no MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.2) MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2) MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2) MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2) MCA timer: linux (MCA v1.0, API v1.0, Component v1.2) MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0) MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0) MCA coll: basic (MCA v1.0, API v1.0, Component v1.2) MCA coll: self (MCA v1.0, API v1.0, Component v1.2) MCA coll: sm (MCA v1.0, API v1.0, Component v1.2) MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2) MCA io: romio (MCA v1.0, API v1.0, Component v1.2) MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2) MCA pml: cm (MCA v1.0, API v1.0, Component v1.2) MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2) MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2) MCA rcache: rb (MCA v1.0, API v1.0, Component v1.2) MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2) MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2) MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2) MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0) MCA topo: unity (MCA v1.0, API v1.0, Component v1.2) MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2) MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2) MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2) MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2) ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users -- Jeff Squyres Cisco Systems
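For readers hitting the same slow-compile problem: the bindings size is fixed when Open MPI itself is configured, so the change the first FAQ entry points at is a rebuild. A hedged sketch, assuming the 1.2-era configure option for the F90 bindings size (check ./configure --help for the exact spelling on your version; the prefix below just matches the ompi_info output above):

./configure --prefix=/usr/local/openmpi-1.2 --with-mpi-f90-size=small
make all install

A smaller Fortran 90 module means less generated interface code for the compiler to process when user F90 code is built against it.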
Re: [OMPI users] MPI processes swapping out
I tried the trunk version with "--mca btl tcp,self". Essentially system time changes to idle time, since empty polling is being replaced by blocking (right?). Page faults go to 0 though. It is interesting since you can see what is going on now, with distinct phases of user time and idle time (sleep mode, en masse). Before, vmstat showed processes going into sleep mode rather randomly, and distinct phases of mostly user time or mostly system time were not visible. I also tried mpi_yield_when_idle=0 with the trunk version. No effect on behavior. Todd On 3/23/07 7:15 PM, "George Bosilca" wrote: > So far the described behavior seems as normal as expected. As Open > MPI never goes in blocking mode, the processes will always spin > between active and sleep mode. More processes on the same node leads > to more time in the system mode (because of the empty polls). There > is a trick in the trunk version of Open MPI which will trigger the > blocking mode if and only if TCP is the only used device. Please try > add "--mca btl tcp,self" to your mpirun command line, and check the > output of vmstat. > >Thanks, > george. > > On Mar 23, 2007, at 3:32 PM, Heywood, Todd wrote: > >> Rolf, >> >>> Is it possible that everything is working just as it should? >> >> That's what I'm afraid of :-). But I did not expect to see such >> communication overhead due to blocking from mpiBLAST, which is very >> course-grained. I then tried HPL, which is computation-heavy, and >> found the >> same thing. Also, the system time seemed to correspond to the MPI >> processes >> cycling between run and sleep (as seen via top), and I thought that >> setting >> the mpi_yield_when_idle parameter to 0 would keep the processes from >> entering sleep state when blocking. But it doesn't. >> >> Todd >> >> >> >> On 3/23/07 2:06 PM, "Rolf Vandevaart" wrote: >> >>> >>> Todd: >>> >>> I assume the system time is being consumed by >>> the calls to send and receive data over the TCP sockets. >>> As the number of processes in the job increases, then more >>> time is spent waiting for data from one of the other processes. >>> >>> I did a little experiment on a single node to see the difference >>> in system time consumed when running over TCP vs when >>> running over shared memory. When running on a single >>> node and using the sm btl, I see almost 100% user time. >>> I assume this is because the sm btl handles sending and >>> receiving its data within a shared memory segment. >>> However, when I switch over to TCP, I see my system time >>> go up. Note that this is on Solaris. 
>>> RUNNING OVER SELF,SM mpirun -np 8 -mca btl self,sm hpcc.amd64
>>>
>>> PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/NLWP
>>> 3505 rolfv 100 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 75 182 0 hpcc.amd64/1
>>> 3503 rolfv 100 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0 69 116 0 hpcc.amd64/1
>>> 3499 rolfv 99 0.0 0.0 0.0 0.0 0.0 0.0 0.5 0 106 236 0 hpcc.amd64/1
>>> 3497 rolfv 99 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0 169 200 0 hpcc.amd64/1
>>> 3501 rolfv 98 0.0 0.0 0.0 0.0 0.0 0.0 1.9 0 127 158 0 hpcc.amd64/1
>>> 3507 rolfv 98 0.0 0.0 0.0 0.0 0.0 0.0 2.0 0 244 200 0 hpcc.amd64/1
>>> 3509 rolfv 98 0.0 0.0 0.0 0.0 0.0 0.0 2.0 0 282 212 0 hpcc.amd64/1
>>> 3495 rolfv 97 0.0 0.0 0.0 0.0 0.0 0.0 3.2 0 237 98 0 hpcc.amd64/1
>>>
>>> RUNNING OVER SELF,TCP mpirun -np 8 -mca btl self,tcp hpcc.amd64
>>>
>>> PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/NLWP
>>> 4316 rolfv 93 6.9 0.0 0.0 0.0 0.0 0.0 0.2 5 346 .6M 0 hpcc.amd64/1
>>> 4328 rolfv 91 8.4 0.0 0.0 0.0 0.0 0.0 0.4 3 59 .15 0 hpcc.amd64/1
>>> 4324 rolfv 98 1.1 0.0 0.0 0.0 0.0 0.0 0.7 2 270 .1M 0 hpcc.amd64/1
>>> 4320 rolfv 88 12 0.0 0.0 0.0 0.0 0.0 0.8 4 244 .15 0 hpcc.amd64/1
>>> 4322 rolfv 94 5.1 0.0 0.0 0.0 0.0 0.0 1.3 2 150 .2M 0 hpcc.amd64/1
>>> 4318 rolfv 92 6.7 0.0 0.0 0.0 0.0 0.0 1.4 5 236 .9M 0 hpcc.amd64/1
>>> 4326 rolfv 93 5.3 0.0 0.0 0.0 0.0 0.0 1.7 7 117 .2M 0 hpcc.amd64/1
>>> 4314 rolfv 91 6.6 0.0 0.0 0.0 0.0 1.3 0.9 19 150 .10 0 hpcc.amd64/1
>>>
>>> I also ran HPL over a larger cluster of 6 nodes, and noticed even higher system times.
>>>
>>> And lastly, I ran a simple MPI test over a cluster of 64 nodes, 2 procs per node, using Sun HPC ClusterTools 6, and saw about a 50/50 split between user and system time.
>>>
>>> PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/NLWP
>>> 11525 rolfv 55 44 0.1 0.0 0.0 0.0 0.1 0.4 76 960 .3M 0 maxtrunc_ct6/1
>>> 11526 rolfv 54 45 0.0 0.0 0.0 0.0 0.0 1.0 0 362 .4M 0 maxtrunc_ct6/1
>>>
>>> Is it possible that everything is working just as it should?
>>>
>>> Rolf
>>>
>>> Heywood, Todd wrote On 03/22/07 13:30,:
>>>
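For anyone following along, a sketch of how the two knobs discussed in this thread are passed on the command line (the process count and executable name are placeholders; mpi_yield_when_idle and the btl list are the parameters named above):

mpirun -np 64 --mca btl tcp,self --mca mpi_yield_when_idle 1 ./my_app
mpirun -np 64 --mca mpi_yield_when_idle 0 ./my_app

The same parameters can also be set through the environment, e.g. export OMPI_MCA_mpi_yield_when_idle=1, which avoids editing every job script.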
Re: [OMPI users] very long linking time with mixed-language libraries
> -Original Message- > From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On > Behalf Of Jeff Squyres > > I notice that you are using the "medium" sized F90 bindings. Do > these FAQ entries help? > > http://www.open-mpi.org/faq/?category=mpi-apps#f90-mpi-slow-compiles > http://www.open-mpi.org/faq/?category=building#f90-bindings-slow-compile >

My understanding is that this is a problem with building the mpi library and not with compiling a user's code or library. In fact compiling my f90 codes is quite fast as compared with the c++ code. The time-consuming step is linking them all with mpic++. My application is a mix of c++ and f90 parallel codes and the main program is written in c++. Therefore mpic++ is used as the last phase to compile main and create the application. This last linking phase is very slow and slows down debugging incredibly because I don't know about the unresolved symbols until I do this final linking. Thanks,

-- Valmor
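To make the setup concrete, here is a minimal sketch of the kind of build being described; the library and file names are invented for illustration and this is not the poster's actual build system. The wrappers' --showme option prints the underlying compile/link command, which can help show what the final, slow link step is actually being handed:

mpif90 -c kernels.f90 -o kernels.o
ar rcs libkernels_f90.a kernels.o

mpic++ -c model.cpp -o model.o
ar rcs libmodel_cxx.a model.o

mpic++ -c main.cpp -o main.o
mpic++ --showme main.o -L. -lmodel_cxx -lkernels_f90 -lgfortran -o app
mpic++ main.o -L. -lmodel_cxx -lkernels_f90 -lgfortran -o app

Because the C++ wrapper drives the final link, the Fortran runtime (here -lgfortran) has to be added explicitly for the gfortran-built objects to resolve.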
[OMPI users] Open-MPI 1.2 and GM
Having a user who requires some of the features of gfortran in 4.1.2, I recently began building a new image. The issue is that "-mca btl gm" fails while "-mca mtl gm" works. I have not yet done any benchmarking, as I was wondering if the move to mtl is part of the upgrade. Below are the packages I rebuilt.

Kernel 2.6.16.27 -> 2.6.20.1
Gcc 4.1.1 -> 4.1.2
GM Drivers 2.0.26 -> 2.0.26 (with patches for newer kernels)
OpenMPI 1.1.4 -> 1.2

The following works as expected:
/usr/local/ompi-gnu/bin/mpirun -np 4 -mca mtl gm --host node84,node83 ./xhpl

The following fails:
/usr/local/ompi-gnu/bin/mpirun -np 4 -mca btl gm --host node84,node83 ./xhpl

I've attached gzipped files as suggested on the "Getting Help" section of the website, and the output from the failed mpirun. Both nodes are known good Myrinet nodes, using FMA to map. Thanks in advance,

-- Justin Bronder Advanced Computing Research Lab University of Maine, Orono 20 Godfrey Dr Orono, ME 04473 www.clusters.umaine.edu

config.log.gz Description: Binary data
ompi_info.gz Description: Binary data

--
Process 0.1.2 is unable to reach 0.1.2 for MPI communication. If you specified the use of a BTL component, you may have forgotten a component (such as "self") in the list of usable components.
--
--
It looks like MPI_INIT failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during MPI_INIT; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):
PML add procs failed
--> Returned "Unreachable" (-12) instead of "Success" (0)
--
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
--
Process 0.1.1 is unable to reach 0.1.1 for MPI communication. If you specified the use of a BTL component, you may have forgotten a component (such as "self") in the list of usable components.
--
--
It looks like MPI_INIT failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during MPI_INIT; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):
PML add procs failed
--> Returned "Unreachable" (-12) instead of "Success" (0)
--
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
--
Process 0.1.0 is unable to reach 0.1.0 for MPI communication. If you specified the use of a BTL component, you may have forgotten a component (such as "self") in the list of usable components.
--
--
It looks like MPI_INIT failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during MPI_INIT; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):
PML add procs failed
--> Returned "Unreachable" (-12) instead of "Success" (0)
--
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
--
Process 0.1.3 is unable to reach 0.1.3 for MPI communication. If you specified the use of a BTL component, you may have forgotten a component (such as "self") in the list of usable components.
--
--
It looks like MPI_INIT failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during MPI_INIT; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):
PML
Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]
Hello Mr. Van der Vlies,

We are currently looking into this problem and will send out an email as soon as we identify and fix it.

Thank you,

> Subject: Re: [OMPI users] Memory leak in openmpi-1.2?
> Date: Tue, 27 Mar 2007 13:58:15 +0200
> From: Bas van der Vlies
> Reply-To: Open MPI Users
> To: Open MPI Users
> References: <460905a3.7080...@sara.nl>
>
> Bas van der Vlies wrote:
>> Hello,
>>
>> We are testing openmpi version 1.2 on Debian etch with openib. Some of our users run scalapack/blacs jobs that run for a long time and use a lot of MPI_Comm functions. We have made a small C example that tests whether MPI can handle this situation (see attached file). When we run this program with argument 100 we see that it consumes more and more memory, e.g.:
>> 100
>>
>> Regards
>>
> Forgot to attach the file :-(
>
> -- * * * Bas van der Vlies e-mail: b...@sara.nl * * SARA - Academic Computing Services phone: +31 20 592 8012 * * Kruislaan 415 fax: +31 20 6683167 * * 1098 SJ Amsterdam * * *

-- Mohamad Chaarawi Philip G. Hall Room 526 Department of Computer Science University of Houston Houston, TX 77204
Re: [OMPI users] Issues with Get/Put and IRecv
Well, mpich2 and mvapich2 are working smoothly for my app. mpich2 under gige is also giving ~2X the performance of openmpi during the working cases for openmpi. After the paper deadline, I'll attempt to package up a simple test case and send it to the list. Thanks!

-Mike

Mike Houston wrote:
Sadly, I've just hit this problem again, so I'll have to find another MPI implementation as I have a paper deadline quickly approaching. I'm using single threads now, but I had very similar issues when using multiple threads and issuing send/recv on one thread and waiting on a posted MPI_Recv on another. The issue seems to actually be with MPI_Gets. I can do heavy MPI_Put's and things seem okay. But as soon as I have a similar communication pattern with MPI_Get's, things get unstable.
-Mike

Brian Barrett wrote:
Mike - In Open MPI 1.2, one-sided is implemented over point-to-point, so I would expect it to be slower. This may or may not be addressed in a future version of Open MPI (I would guess so, but don't want to commit to it). Were you using multiple threads? If so, how? On the good news front, I think your call stack looked similar to what I was seeing, so hopefully I can make some progress on a real solution.
Brian

On Mar 20, 2007, at 8:54 PM, Mike Houston wrote:
Well, I've managed to get a working solution, but I'm not sure how I got there. I built a test case that looked like a nice simple version of what I was trying to do and it worked, so I moved the test code into my implementation and, lo and behold, it works. I must have been doing something a little funky in the original pass, likely causing a stack smash somewhere or trying to do a get/put out of bounds. If I have any more problems, I'll let y'all know. I've tested pretty heavy usage up to 128 MPI processes across 16 nodes and things seem to be behaving. I did notice that single-sided transfers seem to be a little slower than explicit send/recv, at least on GigE. Once I do some more testing, I'll bring things up on IB and see how things are going.
-Mike

Mike Houston wrote:
Brian Barrett wrote:
On Mar 20, 2007, at 3:15 PM, Mike Houston wrote:
If I only do gets/puts, things seem to be working correctly with version 1.2. However, if I have a posted Irecv on the target node and issue a MPI_Get against that target, MPI_Test on the posted IRecv causes a segfault: Anyone have suggestions? Sadly, I need to have IRecv's posted. I'll attempt to find a workaround, but it looks like the posted IRecv is getting all the data of the MPI_Get from the other node. It's like the message tagging is getting ignored. I've never tried posting two different IRecv's with different message tags either...

Hi Mike - I've spent some time this afternoon looking at the problem and have some ideas on what could be happening. I don't think it's a data mismatch (the data intended for the IRecv getting delivered to the Get), but more a problem with the call to MPI_Test perturbing the progress flow of the one-sided engine. I can see one or two places where it's possible this could happen, although I'm having trouble replicating the problem with any test case I can write. Is it possible for you to share the code causing the problem (or some small test case)? It would make me feel considerably better if I could really understand the conditions required to end up in a seg fault state.
Thanks, Brian

Well, I can give you a linux x86 binary if that would do it.
The code is huge as it's part of a much larger system, so there is no such thing as a simple case at the moment, and the code is in pieces and largely unrunnable now with all the hacking... I basically have one thread spinning on an MPI_Test on a posted IRecv while being used as the target of the MPI_Get. I'll see if I can hack together a simple version that breaks late tonight. I've just played with posting a send to that IRecv, issuing the MPI_Get, handshaking and then posting another IRecv, and the MPI_Test continues to eat it, but in a memcpy:

#0 0x001c068c in memcpy () from /lib/libc.so.6
#1 0x00e412d9 in ompi_convertor_pack (pConv=0x83c1198, iov=0xa0, out_size=0xaffc1fd8, max_data=0xaffc1fdc) at convertor.c:254
#2 0x00ea265d in ompi_osc_pt2pt_replyreq_send (module=0x856e668, replyreq=0x83c1180) at osc_pt2pt_data_move.c:411
#3 0x00ea0ebe in ompi_osc_pt2pt_component_fragment_cb (pt2pt_buffer=0x8573380) at osc_pt2pt_component.c:582
#4 0x00ea1389 in ompi_osc_pt2pt_progress () at osc_pt2pt_component.c:769
#5 0x00aa3019 in opal_progress () at runtime/opal_progress.c:288
#6 0x00ea59e5 in ompi_osc_pt2pt_passive_unlock (module=0x856e668, origin=1, count=1) at osc_pt2pt_sync.c:60
#7 0x00ea0cd2 in ompi_osc_pt2pt_component_fragment_cb (pt2pt_buffer=0x856f300) at osc_pt2pt_component.c:688
#8 0x00ea1389 in ompi_osc_pt2pt_progress () at
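For anyone who wants to poke at this before Mike's test case appears, here is a minimal, self-contained sketch of the pattern described in this thread: one rank spinning MPI_Test on a posted Irecv while acting as the target of a passive-target MPI_Get. It is assembled from the description above, not taken from Mike's code, and the tag, counts, and buffer sizes are arbitrary.

#include <stdio.h>
#include <mpi.h>

#define N   1024
#define TAG 42

int main(int argc, char *argv[])
{
    int rank, size, i;
    double winbuf[N], getbuf[N], msg[N];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2) {
        if (rank == 0) printf("needs at least 2 processes\n");
        MPI_Finalize();
        return 1;
    }

    for (i = 0; i < N; i++)
        winbuf[i] = getbuf[i] = msg[i] = (double) rank;

    /* Every rank exposes a window; rank 0 will be the target of the Get. */
    MPI_Win win;
    MPI_Win_create(winbuf, N * sizeof(double), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    if (rank == 0) {
        /* Post a receive and spin on MPI_Test, as in the description above,
           while rank 1 runs a passive-target Get against our window. */
        MPI_Request req;
        int flag = 0;
        MPI_Irecv(msg, N, MPI_DOUBLE, 1, TAG, MPI_COMM_WORLD, &req);
        while (!flag)
            MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
    } else if (rank == 1) {
        /* Passive-target epoch: lock rank 0's window, pull data, unlock. */
        MPI_Win_lock(MPI_LOCK_SHARED, 0, 0, win);
        MPI_Get(getbuf, N, MPI_DOUBLE, 0, 0, N, MPI_DOUBLE, win);
        MPI_Win_unlock(0, win);

        /* Now satisfy rank 0's pending receive so its MPI_Test loop ends. */
        MPI_Send(msg, N, MPI_DOUBLE, 0, TAG, MPI_COMM_WORLD);
    }

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}

Built with mpicc and run as two processes (mpirun -np 2 ./a.out), this is roughly the shape of program that should exercise the osc_pt2pt code path shown in the backtrace.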
Re: [OMPI users] Open-MPI 1.2 and GM
Justin,

There is no GM MTL. Therefore, the first mpirun allows the use of every available BTL, while the second one doesn't allow intra-node communications or self. The correct mpirun command line should be:

mpirun -np 4 --mca btl gm,self ...

george.

On Mar 27, 2007, at 12:18 PM, Justin Bronder wrote:
Having a user who requires some of the features of gfortran in 4.1.2, I recently began building a new image. The issue is that "-mca btl gm" fails while "-mca mtl gm" works. I have not yet done any benchmarking, as I was wondering if the move to mtl is part of the upgrade. Below are the packages I rebuilt. Kernel 2.6.16.27 -> 2.6.20.1 Gcc 4.1.1 -> 4.1.2 GM Drivers 2.0.26 -> 2.0.26 (with patches for newer kernels) OpenMPI 1.1.4 -> 1.2 The following works as expected: /usr/local/ompi-gnu/bin/mpirun -np 4 -mca mtl gm --host node84,node83 ./xhpl The following fails: /usr/local/ompi-gnu/bin/mpirun -np 4 -mca btl gm --host node84,node83 ./xhpl I've attached gzipped files as suggested on the "Getting Help" section of the website and the output from the failed mpirun. Both nodes are known good Myrinet nodes, using FMA to map. Thanks in advance, -- Justin Bronder Advanced Computing Research Lab University of Maine, Orono 20 Godfrey Dr Orono, ME 04473 www.clusters.umaine.edu
___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users

"Half of what I say is meaningless; but I say it so that the other half may reach you" Kahlil Gibran
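A quick way to confirm which BTL components a given installation actually contains (and so whether gm is present at all) is to ask ompi_info, the same tool whose output appears earlier in this digest:

ompi_info | grep btl

With Myrinet support built in, a gm line should show up alongside self, sm, and tcp. Applied to the failing command above, the fix George describes looks like:

mpirun -np 4 --mca btl gm,self --host node84,node83 ./xhpl

Also listing sm for the pairs of processes that share a node is an additional option; that addition is an assumption here, not part of George's reply.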
Re: [OMPI users] Open-MPI 1.2 and GM
Thanks for the response, I was hoping I'd just messed up something simple. Your advice took care of my issues. On 27/03/07 14:15 -0400, George Bosilca wrote: > Justin, > > There is no GM MTL. Therefore, the first mpirun allow the use of > every available BTL, while the second one don't allow intra-node > communications or self. The correct mpirun command line should be: > > mpirun -np 4 --mca btl gm,self ... > -- Justin Bronder Advanced Computing Research Lab University of Maine, Orono 20 Godfrey Dr Orono, ME 04473 www.clusters.umaine.edu