Simply configure a bad log collation host. In our lab, we just changed the port 
number to something bogus. So make ALL defined hosts down if you have more than 
one.

-----Original Message-----
From: Bryan Call [mailto:bc...@apache.org] 
Sent: Thursday, July 26, 2018 12:46 PM
To: dev@trafficserver.apache.org
Subject: Re: Issue #2416 - Log Collation Memory Leak.

How are you seeing that it is leaking memory.  It looks like this change 
happened back in ATS 4.2.0:

commit 09ce8dc3434b92b00d99e0147e533a5f0bde8d7c
Author: Yunkai Zhang <qiushu....@taobao.com>
Date:   Mon Jan 6 23:00:14 2014 +0800

    TS-2479: Don't output orphan log after failover sucessfully

    Don't output orphan log after failover successfully, BTW fix a potential
    issue about LogBuffer->m_references.

    ==NOTE==:
    Some logs maybe pending in the buffer of network thread when log server
    is down, in which case, the pending logs will be outputed to orphan file,
    and them will not be sent to the failover host.

    Signed-off-by: Yunkai Zhang <qiushu....@taobao.com>

-Bryan



> On Jul 26, 2018, at 2:14 AM, Chou, Peter <pbc...@labs.att.com> wrote:
> 
> Hi All,
> 
> After some additional study on how the ink_atomic_increment() and 
> LogBuffer::destroy() work together in the LogFile::, LogHost::, and 
> LogHostList::preproc_and_try_delete() functions, the following is probably a 
> better patch to address this issue. Appreciate any second opinions on this 
> before I open a PR.
> 
> diff --git a/proxy/logging/LogHost.cc b/proxy/logging/LogHost.cc
> index 981bd1265..851221e0c 100644
> --- a/proxy/logging/LogHost.cc
> +++ b/proxy/logging/LogHost.cc
> @@ -302,8 +302,8 @@ LogHost::orphan_write_and_try_delete(LogBuffer *&lb)
>     m_orphan_file->preproc_and_try_delete(lb);
>   } else {
>     Debug("log-host", "logging space exhausted, failed to write orphan file, 
> drop(%" PRIu32 ") bytes", lb->header()->byte_count);
> -    LogBuffer::destroy(lb);
>   }
> +  LogBuffer::destroy(lb);
> }
> 
> Inside of this function, the LogBuffer::destroy() is only called in the else 
> branch but not the main branch of the if statement. So I moved it out so it 
> is called in both cases. This should make this function match the 
> LogHost::preproc_and_try_delete() case where (like the function above) 
> increment is called before the function is called in 
> LogHostList::preproc_and_try_delete() and destroy is called before the end of 
> the function (either within the client SM or in the :done part of the 
> function).
> 
> Thanks,
> Peter
> 
> 
> From: Chou, Peter
> Sent: Wednesday, July 25, 2018 6:53 PM
> To: dev@trafficserver.apache.org
> Subject: Issue #2416 - Log Collation Memory Leak.
> 
> Hi All,
> 
> We are following up on a memory leak that we reported with 6.2.1 and have now 
> verified to be in 7.1.3 also. What happens is that when Log Collation is 
> configured and the log collation host is down that the log buffer will be 
> orphaned and leaked. At first we thought this was a lab only issue, but this 
> can also affect production depending upon connectivity issues, etc.
> 
> What I did was put debug code into the LogBuffer::destroy() to monitor the 
> refcnt on each call. The log buffer will be deleted only if refcnt is 1. The 
> value of refcnt is the value of the log buffer's m_references before 
> ink_atomic_increment(-1) is run inside this function since 
> ink_atomic_increment() returns the old value rather than the new value. So if 
> refcnt is 1 that means the m_references is now 0 and the log buffer can be 
> deleted. When I looked at the debug trace I saw for a particular log buffer 
> pointer value that the refcnt would be 2 then 3 then 2 and the pointer would 
> never be found again in the trace (so likely leaked and forgotten about). So 
> it turns out that the incrementing from 2 to 3 is due to the following --
> 
> diff --git a/proxy/logging/LogHost.cc b/proxy/logging/LogHost.cc
> index 981bd1265..860906df9 100644
> --- a/proxy/logging/LogHost.cc
> +++ b/proxy/logging/LogHost.cc
> @@ -427,7 +427,6 @@ LogHostList::preproc_and_try_delete(LogBuffer *lb)
>   }
> 
>   if (lb != nullptr && need_orphan && available_host) {
> -    ink_atomic_increment(&lb->m_references, 1);
>     available_host->orphan_write_and_try_delete(lb);
>   }   }
> 
> I tested with the patch above (removing the increment in the case of orphan 
> writes) and the memory leak is gone. But appreciate any second opinions on 
> this issue/solution since I know that some other folks may have better 
> knowledge on the logging system.
> 
> Thanks,
> Peter

Reply via email to