Hi All,

After some additional study on how the ink_atomic_increment() and 
LogBuffer::destroy() work together in the LogFile::, LogHost::, and 
LogHostList::preproc_and_try_delete() functions, the following is probably a 
better patch to address this issue. Appreciate any second opinions on this 
before I open a PR.

diff --git a/proxy/logging/LogHost.cc b/proxy/logging/LogHost.cc
index 981bd1265..851221e0c 100644
--- a/proxy/logging/LogHost.cc
+++ b/proxy/logging/LogHost.cc
@@ -302,8 +302,8 @@ LogHost::orphan_write_and_try_delete(LogBuffer *&lb)
     m_orphan_file->preproc_and_try_delete(lb);
   } else {
     Debug("log-host", "logging space exhausted, failed to write orphan file, 
drop(%" PRIu32 ") bytes", lb->header()->byte_count);
-    LogBuffer::destroy(lb);
   }
+  LogBuffer::destroy(lb);
}

Inside of this function, the LogBuffer::destroy() is only called in the else 
branch but not the main branch of the if statement. So I moved it out so it is 
called in both cases. This should make this function match the 
LogHost::preproc_and_try_delete() case where (like the function above) 
increment is called before the function is called in 
LogHostList::preproc_and_try_delete() and destroy is called before the end of 
the function (either within the client SM or in the :done part of the function).

Thanks,
Peter


From: Chou, Peter
Sent: Wednesday, July 25, 2018 6:53 PM
To: dev@trafficserver.apache.org
Subject: Issue #2416 - Log Collation Memory Leak.

Hi All,

We are following up on a memory leak that we reported with 6.2.1 and have now 
verified to be in 7.1.3 also. What happens is that when Log Collation is 
configured and the log collation host is down that the log buffer will be 
orphaned and leaked. At first we thought this was a lab only issue, but this 
can also affect production depending upon connectivity issues, etc.

What I did was put debug code into the LogBuffer::destroy() to monitor the 
refcnt on each call. The log buffer will be deleted only if refcnt is 1. The 
value of refcnt is the value of the log buffer's m_references before 
ink_atomic_increment(-1) is run inside this function since 
ink_atomic_increment() returns the old value rather than the new value. So if 
refcnt is 1 that means the m_references is now 0 and the log buffer can be 
deleted. When I looked at the debug trace I saw for a particular log buffer 
pointer value that the refcnt would be 2 then 3 then 2 and the pointer would 
never be found again in the trace (so likely leaked and forgotten about). So it 
turns out that the incrementing from 2 to 3 is due to the following --

diff --git a/proxy/logging/LogHost.cc b/proxy/logging/LogHost.cc
index 981bd1265..860906df9 100644
--- a/proxy/logging/LogHost.cc
+++ b/proxy/logging/LogHost.cc
@@ -427,7 +427,6 @@ LogHostList::preproc_and_try_delete(LogBuffer *lb)
   }

   if (lb != nullptr && need_orphan && available_host) {
-    ink_atomic_increment(&lb->m_references, 1);
     available_host->orphan_write_and_try_delete(lb);
   }   }

I tested with the patch above (removing the increment in the case of orphan 
writes) and the memory leak is gone. But appreciate any second opinions on this 
issue/solution since I know that some other folks may have better knowledge on 
the logging system.

Thanks,
Peter

Reply via email to