Issue #8290 reported.
Thanks all for your help and the workaround provided.

Patrick

On 14/12/2020 at 17:40, Jeff Squyres (jsquyres) wrote:
> Yes, opening an issue would be great -- thanks!
>
>
>> On Dec 14, 2020, at 11:32 AM, Patrick Bégou via users
>> <users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>> wrote:
>>
>> OK, Thanks Gilles.
>> Does this still require that I open an issue for tracking?
>>
>> Patrick
>>
>> On 14/12/2020 at 14:56, Gilles Gouaillardet via users wrote:
>>> Hi Patrick,
>>>
>>> Glad to hear you are now able to move forward.
>>>
>>> Please keep in mind this is not a fix but a temporary workaround.
>>> At first glance, I did not spot any issue in the current code;
>>> it simply turned out that the memory leak disappeared when doing
>>> things differently.
>>>
>>> Cheers,
>>>
>>> Gilles
>>>
>>> On Mon, Dec 14, 2020 at 7:11 PM Patrick Bégou via users
>>> <users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>> wrote:
>>>
>>>     Hi Gilles,
>>>
>>>     you caught the bug! With this patch, the memory leak
>>>     disappears on a single node. The cluster is actually
>>>     overloaded; as soon as possible I will launch a multi-node
>>>     test. Below is the memory used by rank 0 before (blue) and
>>>     after (red) the patch.
>>>
>>>     Thanks
>>>
>>>     Patrick
>>>
>>>     <patch.png>
>>>
>>>     On 10/12/2020 at 10:15, Gilles Gouaillardet via users wrote:
>>>>     Patrick,
>>>>
>>>>
>>>>     First, thank you very much for sharing the reproducer.
>>>>
>>>>
>>>>     Yes, please open a github issue so we can track this.
>>>>
>>>>
>>>>     I cannot fully understand where the leak is coming from, but so
>>>>     far
>>>>
>>>>      - the code fails on master built with --enable-debug (the data
>>>>     engine reports an error) but not with the v3.1.x branch
>>>>
>>>>       (this suggests there could be an error in the latest Open MPI
>>>>     ... or in the code)
>>>>
>>>>      - the attached patch seems to have a positive effect; can
>>>>     you please give it a try?
>>>>
>>>>
>>>>     Cheers,
>>>>
>>>>
>>>>     Gilles
>>>>
>>>>
>>>>
>>>>     On 12/7/2020 6:15 PM, Patrick Bégou via users wrote:
>>>>>     Hi,
>>>>>
>>>>>     I've written a small piece of code to show the problem. It is
>>>>>     based on my application, but 2D and using integer arrays for
>>>>>     testing. The figure below shows the max RSS of the rank 0
>>>>>     process over 20000 iterations on 8 and 16 cores, with the
>>>>>     openib and tcp drivers. The more processes I have, the larger
>>>>>     the memory leak. I use the same binaries for the 4 runs and
>>>>>     OpenMPI 3.1 (same behavior with 4.0.5).
>>>>>     The code is attached. I'll try to check type deallocation
>>>>>     as soon as possible.
>>>>>
>>>>>     Patrick
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>     On 04/12/2020 at 01:34, Gilles Gouaillardet via users wrote:
>>>>>>     Patrick,
>>>>>>
>>>>>>
>>>>>>     based on George's idea, a simpler check is to retrieve the
>>>>>>     Fortran index via the (standard) MPI_Type_c2f() function
>>>>>>     after you create a derived datatype.
>>>>>>
>>>>>>
>>>>>>     If the index keeps growing forever even after you
>>>>>>     MPI_Type_free(), then this clearly indicates a leak.
>>>>>>
>>>>>>     Unfortunately, this simple test cannot be used to
>>>>>>     definitively rule out any memory leak.
>>>>>>
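>>>>>>     A minimal C sketch of that check (hypothetical: the subarray
>>>>>>     shape and loop bounds are invented for illustration):
>>>>>>
>>>>>>     #include <mpi.h>
>>>>>>     #include <stdio.h>
>>>>>>
>>>>>>     int main(int argc, char **argv)
>>>>>>     {
>>>>>>         int sizes[2] = {16, 16}, subs[2] = {4, 4}, starts[2] = {0, 0};
>>>>>>         MPI_Init(&argc, &argv);
>>>>>>         for (int i = 0; i < 5; i++) {
>>>>>>             MPI_Datatype t;
>>>>>>             MPI_Type_create_subarray(2, sizes, subs, starts,
>>>>>>                                      MPI_ORDER_C, MPI_INT, &t);
>>>>>>             MPI_Type_commit(&t);
>>>>>>             /* a stable index means the freed slot is being reused;
>>>>>>                an ever-growing index indicates a handle leak */
>>>>>>             printf("iter %d: Fortran index %d\n", i, (int)MPI_Type_c2f(t));
>>>>>>             MPI_Type_free(&t);
>>>>>>         }
>>>>>>         MPI_Finalize();
>>>>>>         return 0;
>>>>>>     }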
>>>>>>
>>>>>>     Note you can also
>>>>>>
>>>>>>     mpirun --mca pml ob1 --mca btl tcp,self ...
>>>>>>
>>>>>>     in order to force communications over TCP/IP and hence rule
>>>>>>     out any memory leak that could be triggered by your fast
>>>>>>     interconnect.
>>>>>>
>>>>>>
>>>>>>
>>>>>>     In any case, a reproducer will greatly help us debugging this
>>>>>>     issue.
>>>>>>
>>>>>>
>>>>>>     Cheers,
>>>>>>
>>>>>>
>>>>>>     Gilles
>>>>>>
>>>>>>
>>>>>>
>>>>>>     On 12/4/2020 7:20 AM, George Bosilca via users wrote:
>>>>>>>     Patrick,
>>>>>>>
>>>>>>>     I'm afraid there is no simple way to check this. The main
>>>>>>>     reason is that OMPI uses handles for MPI objects, and
>>>>>>>     these handles are not tracked by the library; they are
>>>>>>>     supposed to be provided by the user for each call. In
>>>>>>>     your case, as you already called MPI_Type_free on the
>>>>>>>     datatype, you cannot produce a valid handle.
>>>>>>>
>>>>>>>     There might be a trick. If the datatype is manipulated with
>>>>>>>     any Fortran MPI functions, then we convert the handle (which
>>>>>>>     in fact is a pointer) to an index into a pointer array
>>>>>>>     structure. Thus, the index remains valid, and can be used
>>>>>>>     to convert back into a valid datatype pointer, until OMPI
>>>>>>>     completely releases the datatype. Look
>>>>>>>     into the ompi_datatype_f_to_c_table table to see the
>>>>>>>     datatypes that exist and get their pointers, and then use
>>>>>>>     these pointers as arguments to ompi_datatype_dump() to see
>>>>>>>     if any of these existing datatypes are the ones you define.
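>>>>>>>
>>>>>>>     In case it helps, a rough sketch of that walk, assuming an
>>>>>>>     OMPI build tree on the include path. These are internal,
>>>>>>>     unsupported symbols whose exact declarations may differ
>>>>>>>     between releases, so verify against your source tree:
>>>>>>>
>>>>>>>     /* OMPI internals, not a supported API */
>>>>>>>     #include "ompi/datatype/ompi_datatype.h"
>>>>>>>     #include "opal/class/opal_pointer_array.h"
>>>>>>>
>>>>>>>     static void dump_live_datatypes(void)
>>>>>>>     {
>>>>>>>         int n = opal_pointer_array_get_size(&ompi_datatype_f_to_c_table);
>>>>>>>         for (int i = 0; i < n; i++) {
>>>>>>>             ompi_datatype_t *dt = (ompi_datatype_t *)
>>>>>>>                 opal_pointer_array_get_item(&ompi_datatype_f_to_c_table, i);
>>>>>>>             if (NULL != dt)
>>>>>>>                 ompi_datatype_dump(dt); /* print each still-live datatype */
>>>>>>>         }
>>>>>>>     }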
>>>>>>>
>>>>>>>     George.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     On Thu, Dec 3, 2020 at 4:44 PM Patrick Bégou via users
>>>>>>>     <users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>> wrote:
>>>>>>>
>>>>>>>         Hi,
>>>>>>>
>>>>>>>         I'm trying to solve a memory leak that appeared with
>>>>>>>     my new implementation of communications based on
>>>>>>>     MPI_Alltoallw and MPI_Type_create_subarray calls. Arrays of
>>>>>>>     subarray types are created/destroyed at each time step and
>>>>>>>     used for communications.
>>>>>>>
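>>>>>>>         To make that pattern concrete, here is a hypothetical,
>>>>>>>     much reduced C sketch of the life cycle just described
>>>>>>>     (sizes and layout are invented; only the per-step
>>>>>>>     create/commit/use/free cycle of the subarray types matters):
>>>>>>>
>>>>>>>         #include <mpi.h>
>>>>>>>         #include <stdlib.h>
>>>>>>>
>>>>>>>         int main(int argc, char **argv)
>>>>>>>         {
>>>>>>>             int np, n = 8;
>>>>>>>             MPI_Init(&argc, &argv);
>>>>>>>             MPI_Comm_size(MPI_COMM_WORLD, &np);
>>>>>>>             int gsizes[2] = { n, n * np }; /* one column block per peer */
>>>>>>>             int *sbuf = calloc((size_t)n * n * np, sizeof(int));
>>>>>>>             int *rbuf = calloc((size_t)n * n * np, sizeof(int));
>>>>>>>             MPI_Datatype *types = malloc(np * sizeof(MPI_Datatype));
>>>>>>>             int *counts = malloc(np * sizeof(int));
>>>>>>>             int *displs = calloc(np, sizeof(int)); /* offsets are in the types */
>>>>>>>             for (int step = 0; step < 20000; step++) { /* the time loop */
>>>>>>>                 for (int p = 0; p < np; p++) {
>>>>>>>                     int subs[2] = { n, n }, starts[2] = { 0, p * n };
>>>>>>>                     MPI_Type_create_subarray(2, gsizes, subs, starts,
>>>>>>>                                              MPI_ORDER_C, MPI_INT, &types[p]);
>>>>>>>                     MPI_Type_commit(&types[p]);
>>>>>>>                     counts[p] = 1;
>>>>>>>                 }
>>>>>>>                 MPI_Alltoallw(sbuf, counts, displs, types,
>>>>>>>                               rbuf, counts, displs, types, MPI_COMM_WORLD);
>>>>>>>                 for (int p = 0; p < np; p++)
>>>>>>>                     MPI_Type_free(&types[p]); /* RSS should stay flat */
>>>>>>>             }
>>>>>>>             free(sbuf); free(rbuf); free(types); free(counts); free(displs);
>>>>>>>             MPI_Finalize();
>>>>>>>             return 0;
>>>>>>>         }
>>>>>>>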
>>>>>>>         On my laptop the code runs fine (running for 15000
>>>>>>>     temporal iterations on 32 processes with oversubscription),
>>>>>>>     but on our cluster the memory used by the code increases
>>>>>>>     until the OOM killer stops the job. On the cluster we use
>>>>>>>     IB QDR for communications.
>>>>>>>
>>>>>>>         Same Gcc/Gfortran 7.3 (built from sources), same
>>>>>>>     sources of OpenMPI (3.1 or 4.0.5 tested), same sources of
>>>>>>>     the Fortran code on the laptop and on the cluster.
>>>>>>>
>>>>>>>         Using Gcc/Gfortran 4.8 and OpenMPI 1.7.3 on the cluster
>>>>>>>     does not show the problem (resident memory does not
>>>>>>>     increase, and we ran 100000 temporal iterations).
>>>>>>>
>>>>>>>         The MPI_Type_free manual says that it "marks the
>>>>>>>     datatype object associated with datatype for deallocation".
>>>>>>>     But how can I check that the deallocation is really done?
>>>>>>>
>>>>>>>         Thanks for any suggestions.
>>>>>>>
>>>>>>>         Patrick
>>>>>>>
>>>>>
>>>
>>
>
>
> -- 
> Jeff Squyres
> jsquy...@cisco.com <mailto:jsquy...@cisco.com>
>
