Hi,

Thanks Nysal for these details.

I also fixed my memory allocation issue using the environment variable
OMPI_MCA_memory_ptmalloc2_disable, which is much easier (at least in my
case) than compiling and installing a new openmpi package.

The difficulty is that information about this variable is hard to find
(it seems to be a secret variable!). I have read that it cannot be used
as a normal MCA parameter and cannot be set in a configuration file
( http://www.open-mpi.org/community/lists/users/2010/06/13208.php ).

When using this variable, I added the option
-x OMPI_MCA_memory_ptmalloc2_disable to my mpirun command line. Do I
really have to do that?
Is the environment variable (plus the -x option, if required) still the
only way to set this parameter to 1?

Regards,
Nicolas

2010/8/15 Nysal Jan <jny...@gmail.com>

> >What does it exactly imply to compile with this option ?
> Open MPI's internal malloc library (ptmalloc) will not be built/used. If
> you are using an RDMA capable interconnect such as Infiniband, you will not
> be able to use the "mpi_leave_pinned" feature. mpi_leave_pinned might
> improve performance for applications that reuse/repeatedly send from the
> same buffer. If you are not using such interconnects then there is no impact
> on performance. For more details see the FAQ entries (24-28) -
> http://www.open-mpi.org/faq/?category=openfabrics#large-message-leave-pinned
>
> --Nysal
>
> On Thu, Aug 12, 2010 at 6:30 PM, Nicolas Deladerriere <
> nicolas.deladerri...@gmail.com> wrote:
>
>> Building openmpi with the option "--without-memory-manager" fixes my problem.
>>
>> What does it exactly imply to compile with this option ?
>> I guess all malloc calls use functions from libc instead of the openmpi
>> ones, but does it have an effect on performance or something else ?
>>
>> Nicolas
>>
>> 2010/8/8 Nysal Jan <jny...@gmail.com>
>>
>>> What interconnect are you using? Infiniband? Use
>>> "--without-memory-manager" option while building ompi in order to disable
>>> ptmalloc.
>>>
>>> Regards
>>> --Nysal
>>>
>>> On Sun, Aug 8, 2010 at 7:49 PM, Nicolas Deladerriere <
>>> nicolas.deladerri...@gmail.com> wrote:
>>>
>>>> Yes, I am using a 24 GB machine with a 64-bit Linux OS.
>>>> If I compile without the wrapper, I do not get any problems.
>>>>
>>>> It seems that when I link with openmpi, my program uses a malloc
>>>> implemented by openmpi. Is it possible to switch it off in order to
>>>> use only malloc from libc ?
>>>>
>>>> Nicolas
>>>>
>>>> 2010/8/8 Terry Frankcombe <te...@chem.gu.se>
>>>>
>>>>> You're trying to do a 6GB allocate. Can your underlying system handle
>>>>> that? If you compile without the wrapper, does it work?
>>>>>
>>>>> I see your executable is using the OMPI memory stuff. IIRC there are
>>>>> switches to turn that off.
>>>>>
>>>>>
>>>>> On Fri, 2010-08-06 at 15:05 +0200, Nicolas Deladerriere wrote:
>>>>> > Hello,
>>>>> >
>>>>> > I am getting a SIGSEGV error when running a simple program compiled
>>>>> > and linked with openmpi.
>>>>> > I have reproduced the problem using really simple Fortran code. It
>>>>> > does not even use MPI, but just links with the MPI shared
>>>>> > libraries. (The problem does not appear when I do not link with the
>>>>> > MPI libraries.)
>>>>> > % cat allocate.F90
>>>>> > program test
>>>>> >   implicit none
>>>>> >   integer, dimension(:), allocatable :: z
>>>>> >   integer(kind=8) :: l
>>>>> >
>>>>> >   write(*,*) "l ?"
>>>>> >   read(*,*) l
>>>>> >
>>>>> >   ALLOCATE(z(l))
>>>>> >   z(1) = 111
>>>>> >   z(l) = 222
>>>>> >   DEALLOCATE(z)
>>>>> >
>>>>> > end program test
>>>>> >
>>>>> > I am using openmpi 1.4.2 and gfortran for my tests. Here is the
>>>>> > compilation:
>>>>> >
>>>>> > % ./openmpi-1.4.2/build/bin/mpif90 --showme -g -o testallocate
>>>>> > allocate.F90
>>>>> > gfortran -g -o testallocate allocate.F90
>>>>> > -I/s0/scr1/TOMOT_19311_HAL_/openmpi-1.4.2/build/include -pthread
>>>>> > -I/s0/scr1/TOMOT_19311_HAL_/openmpi-1.4.2/build/lib
>>>>> > -L/s0/scr1/TOMOT_19311_HAL_/openmpi-1.4.2/build/lib -lmpi_f90
>>>>> > -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl
>>>>> > -lutil -lm -ldl -pthread
>>>>> >
>>>>> > When I run that test with different lengths, I sometimes get a
>>>>> > "Segmentation fault" error. Here are two examples using two specific
>>>>> > values, but the error happens for many other values of length (I did
>>>>> > not manage to find which values of length give that error).
>>>>> >
>>>>> > % ./testallocate
>>>>> > l ?
>>>>> > 1600000000
>>>>> > Segmentation fault
>>>>> > % ./testallocate
>>>>> > l ?
>>>>> > 2000000000
>>>>> >
>>>>> > I used a debugger with a version of openmpi re-compiled with the debug
>>>>> > flag. I got the following error in function sYSMALLOc:
>>>>> >
>>>>> > Program received signal SIGSEGV, Segmentation fault.
>>>>> > 0x00002aaaab70b3b3 in sYSMALLOc (nb=6400000016, av=0x2aaaab930200)
>>>>> > at malloc.c:3239
>>>>> > 3239        set_head(remainder, remainder_size | PREV_INUSE);
>>>>> > Current language: auto; currently c
>>>>> > (gdb) bt
>>>>> > #0 0x00002aaaab70b3b3 in sYSMALLOc (nb=6400000016,
>>>>> > av=0x2aaaab930200) at malloc.c:3239
>>>>> > #1 0x00002aaaab70d0db in opal_memory_ptmalloc2_int_malloc
>>>>> > (av=0x2aaaab930200, bytes=6400000000) at malloc.c:4322
>>>>> > #2 0x00002aaaab70b773 in opal_memory_ptmalloc2_malloc
>>>>> > (bytes=6400000000) at malloc.c:3435
>>>>> > #3 0x00002aaaab70a665 in opal_memory_ptmalloc2_malloc_hook
>>>>> > (sz=6400000000, caller=0x2aaaabf8534d) at hooks.c:667
>>>>> > #4 0x00002aaaabf8534d in _gfortran_internal_free ()
>>>>> > from /usr/lib64/libgfortran.so.1
>>>>> > #5 0x0000000000400bcc in MAIN__ () at allocate.F90:11
>>>>> > #6 0x0000000000400c4e in main ()
>>>>> > (gdb) display
>>>>> > (gdb) list
>>>>> > 3234      if ((unsigned long)(size) >= (unsigned long)(nb +
>>>>> > MINSIZE)) {
>>>>> > 3235        remainder_size = size - nb;
>>>>> > 3236        remainder = chunk_at_offset(p, nb);
>>>>> > 3237        av->top = remainder;
>>>>> > 3238        set_head(p, nb | PREV_INUSE | (av != &main_arena ?
>>>>> > NON_MAIN_ARENA : 0));
>>>>> > 3239        set_head(remainder, remainder_size | PREV_INUSE);
>>>>> > 3240        check_malloced_chunk(av, p, nb);
>>>>> > 3241        return chunk2mem(p);
>>>>> > 3242      }
>>>>> > 3243
>>>>> >
>>>>> >
>>>>> > I also did the same test in C and I got the same problem.
>>>>> >
>>>>> > Does anyone have any idea that could help me understand what's going
>>>>> > on ?
>>>>> >
>>>>> > Regards
>>>>> > Nicolas
>>>>> >
>>>>> > _______________________________________________
>>>>> > users mailing list
>>>>> > us...@open-mpi.org
>>>>> > http://www.open-mpi.org/mailman/listinfo.cgi/users
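
For readers who find this thread later, here is a minimal sketch of the environment-variable workaround described in the newest message at the top. The variable name, the value 1, and the -x option all come from that message. The application name ./my_app and the process count of 2 are hypothetical placeholders, and the fallback echo is only there so the snippet does nothing harmful on a machine without mpirun. Whether -x is actually required is the open question in this thread, so it is included here only because that is what was reported to work.

```shell
# Disable Open MPI's ptmalloc2 memory hooks for processes started from this shell.
export OMPI_MCA_memory_ptmalloc2_disable=1

# Hypothetical application name and process count (assumptions, adjust as needed).
APP=./my_app
NP=2

# -x asks mpirun to export the named environment variable to the launched processes.
if command -v mpirun >/dev/null 2>&1 && [ -x "$APP" ]; then
    mpirun -x OMPI_MCA_memory_ptmalloc2_disable -np "$NP" "$APP"
else
    # Neither mpirun nor the placeholder application is available: just show the command.
    echo "would run: mpirun -x OMPI_MCA_memory_ptmalloc2_disable -np $NP $APP"
fi
```

The alternative discussed in the quoted messages is to rebuild Open MPI with "--without-memory-manager", which removes ptmalloc altogether instead of disabling it at run time.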