Hi,

Thanks Nysal for these details.

I also fixed my memory allocation issue using the environment variable
OMPI_MCA_memory_ptmalloc2_disable, which is much easier (at least in my
case) than compiling and installing a new Open MPI package.
The problem is that it is hard to find information about this variable
(it seems to be a secret variable!). I have read that it cannot be used
as a normal MCA parameter and cannot be set in a configuration file
( http://www.open-mpi.org/community/lists/users/2010/06/13208.php ).

When using this variable, I added the -x OMPI_MCA_memory_ptmalloc2_disable
option to my mpirun command line. Do I really have to do that?
Is the environment variable (plus the -x option, if required) still the
only way to set this parameter to 1?
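
A minimal sketch of the setup described above, assuming a POSIX shell. The
process count (4) and the executable name (./testallocate) are placeholders
chosen for illustration; only the variable name and the -x option come from
this thread.

```shell
#!/bin/sh
# Assumptions (placeholders, not my real command line):
#   4 processes, executable ./testallocate in the current directory.

# Set the variable in the environment before anything is launched.
OMPI_MCA_memory_ptmalloc2_disable=1
export OMPI_MCA_memory_ptmalloc2_disable

# -x asks mpirun to export the named variable to the launched processes.
cmd="mpirun -np 4 -x OMPI_MCA_memory_ptmalloc2_disable ./testallocate"

# Only launch when mpirun and the executable are actually available;
# otherwise just print the command that would be used.
if command -v mpirun >/dev/null 2>&1 && [ -x ./testallocate ]; then
    $cmd
else
    echo "would run: $cmd"
fi
```

Given without a value, -x forwards whatever value the variable already has
in the calling environment, which is why the export comes first.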

Regards,
Nicolas



2010/8/15 Nysal Jan <jny...@gmail.com>

> >What does it exactly imply to compile with this option ?
> Open MPI's internal malloc library (ptmalloc) will not be built/used. If
> you are using an RDMA capable interconnect such as Infiniband, you will not
> be able to use the "mpi_leave_pinned" feature. mpi_leave_pinned might
> improve performance for applications that reuse/repeatedly send from the
> same buffer. If you are not using such interconnects then there is no impact
> on performance. For more details see the FAQ entries (24-28) -
> http://www.open-mpi.org/faq/?category=openfabrics#large-message-leave-pinned
>
> --Nysal
>
>
>
> On Thu, Aug 12, 2010 at 6:30 PM, Nicolas Deladerriere <
> nicolas.deladerri...@gmail.com> wrote:
>
>> Building Open MPI with the "--without-memory-manager" option fixes my problem.
>>
>> What does it exactly imply to compile with this option ?
>> I guess all malloc calls then use the functions from libc instead of the
>> Open MPI ones, but does it have an effect on performance or anything else?
>>
>> Nicolas
>>
>> 2010/8/8 Nysal Jan <jny...@gmail.com>
>>
>>> What interconnect are you using? InfiniBand? Use the
>>> "--without-memory-manager" option while building ompi in order to disable
>>> ptmalloc.
>>>
>>> Regards
>>> --Nysal
>>>
>>>
>>> On Sun, Aug 8, 2010 at 7:49 PM, Nicolas Deladerriere <
>>> nicolas.deladerri...@gmail.com> wrote:
>>>
>>>> Yes, I am using a 24 GB machine running 64-bit Linux.
>>>> If I compile without the wrapper, I do not get any problems.
>>>>
>>>> It seems that when I link with Open MPI, my program uses a malloc
>>>> implemented by Open MPI. Is it possible to switch it off in order to
>>>> only use malloc from libc?
>>>>
>>>> Nicolas
>>>>
>>>> 2010/8/8 Terry Frankcombe <te...@chem.gu.se>
>>>>
>>>>> You're trying to do a 6GB allocate.  Can your underlying system handle
>>>>> that?  If you compile without the wrapper, does it work?
>>>>>
>>>>> I see your executable is using the OMPI memory stuff.  IIRC there are
>>>>> switches to turn that off.
>>>>>
>>>>>
>>>>> On Fri, 2010-08-06 at 15:05 +0200, Nicolas Deladerriere wrote:
>>>>> > Hello,
>>>>> >
>>>>> > I am getting a SIGSEGV error when running a simple program compiled
>>>>> > and linked with Open MPI.
>>>>> > I have reproduced the problem using really simple Fortran code. It
>>>>> > does not even use MPI; it is just linked with the MPI shared
>>>>> > libraries. (The problem does not appear when I do not link with the
>>>>> > MPI libraries.)
>>>>> >    % cat allocate.F90
>>>>> >    program test
>>>>> >    implicit none
>>>>> >        integer, dimension(:), allocatable :: z
>>>>> >        integer(kind=8) :: l
>>>>> >
>>>>> >        write(*,*) "l ?"
>>>>> >        read(*,*) l
>>>>> >
>>>>> >        ALLOCATE(z(l))
>>>>> >        z(1) = 111
>>>>> >        z(l) = 222
>>>>> >        DEALLOCATE(z)
>>>>> >
>>>>> >    end program test
>>>>> >
>>>>> > I am using openmpi 1.4.2 and gfortran for my tests. Here is the
>>>>> > compilation :
>>>>> >
>>>>> >    % ./openmpi-1.4.2/build/bin/mpif90 --showme -g -o testallocate allocate.F90
>>>>> >    gfortran -g -o testallocate allocate.F90
>>>>> > -I/s0/scr1/TOMOT_19311_HAL_/openmpi-1.4.2/build/include -pthread
>>>>> > -I/s0/scr1/TOMOT_19311_HAL_/openmpi-1.4.2/build/lib
>>>>> > -L/s0/scr1/TOMOT_19311_HAL_/openmpi-1.4.2/build/lib -lmpi_f90
>>>>> > -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl
>>>>> > -lutil -lm -ldl -pthread
>>>>> >
>>>>> > When I run that test with different lengths, I sometimes get a
>>>>> > "Segmentation fault" error. Here are two examples using two specific
>>>>> > values, but the error happens for many other values of length (I did
>>>>> > not manage to find which values of length give that error).
>>>>> >
>>>>> >    %  ./testallocate
>>>>> >     l ?
>>>>> >    1600000000
>>>>> >    Segmentation fault
>>>>> >    % ./testallocate
>>>>> >     l ?
>>>>> >    2000000000
>>>>> >
>>>>> > I used a debugger with a version of Open MPI recompiled with the
>>>>> > debug flag. I got the following error in function sYSMALLOc:
>>>>> >
>>>>> >    Program received signal SIGSEGV, Segmentation fault.
>>>>> >    0x00002aaaab70b3b3 in sYSMALLOc (nb=6400000016, av=0x2aaaab930200) at malloc.c:3239
>>>>> >    3239        set_head(remainder, remainder_size | PREV_INUSE);
>>>>> >    Current language:  auto; currently c
>>>>> >    (gdb) bt
>>>>> >    #0  0x00002aaaab70b3b3 in sYSMALLOc (nb=6400000016, av=0x2aaaab930200) at malloc.c:3239
>>>>> >    #1  0x00002aaaab70d0db in opal_memory_ptmalloc2_int_malloc (av=0x2aaaab930200, bytes=6400000000) at malloc.c:4322
>>>>> >    #2  0x00002aaaab70b773 in opal_memory_ptmalloc2_malloc (bytes=6400000000) at malloc.c:3435
>>>>> >    #3  0x00002aaaab70a665 in opal_memory_ptmalloc2_malloc_hook (sz=6400000000, caller=0x2aaaabf8534d) at hooks.c:667
>>>>> >    #4  0x00002aaaabf8534d in _gfortran_internal_free () from /usr/lib64/libgfortran.so.1
>>>>> >    #5  0x0000000000400bcc in MAIN__ () at allocate.F90:11
>>>>> >    #6  0x0000000000400c4e in main ()
>>>>> >    (gdb) display
>>>>> >    (gdb) list
>>>>> >    3234      if ((unsigned long)(size) >= (unsigned long)(nb + MINSIZE)) {
>>>>> >    3235        remainder_size = size - nb;
>>>>> >    3236        remainder = chunk_at_offset(p, nb);
>>>>> >    3237        av->top = remainder;
>>>>> >    3238        set_head(p, nb | PREV_INUSE | (av != &main_arena ? NON_MAIN_ARENA : 0));
>>>>> >    3239        set_head(remainder, remainder_size | PREV_INUSE);
>>>>> >    3240        check_malloced_chunk(av, p, nb);
>>>>> >    3241        return chunk2mem(p);
>>>>> >    3242      }
>>>>> >    3243
>>>>> >
>>>>> >
>>>>> > I also did the same test in C and I got the same problem.
>>>>> >
>>>>> > Does anyone have any idea that could help me understand what's going
>>>>> > on?
>>>>> >
>>>>> > Regards
>>>>> > Nicolas
>>>>> >
>>>>> > _______________________________________________
>>>>> > users mailing list
>>>>> > us...@open-mpi.org
>>>>> > http://www.open-mpi.org/mailman/listinfo.cgi/users
