On 28.04.2015 20:06, Marc Strapetz wrote:
On 28.04.2015 18:12, Branko Čibej wrote:
On 28.04.2015 18:03, Marc Strapetz wrote:
Hi Brane,

On 28.04.2015 07:36, Branko Čibej wrote:
On 24.04.2015 14:11, Branko Čibej wrote:

Hi Marc,

Just a quick note: your last msg jogged my memory and I think I know
the root cause of the leak: improper JNI frame management within a
loop. If I'm right, I can both fix the leak and remove the
close-stream requirement I just added.

On 24 Apr 2015 11:00 am, "Marc Strapetz" <marc.strap...@syntevo.com
<mailto:marc.strap...@syntevo.com>> wrote:

     On 24.04.2015 06 <tel:24.04.2015%2006>:34, Branko Čibej wrote:

         On 22.03.2015 05 <tel:22.03.2015%2005>:06, Branko Čibej
wrote:

             On 21.03.2015 16 <tel:21.03.2015%2016>:23, Branko Čibej
wrote:

                 On 19.03.2015 11:43, Marc Strapetz wrote:

                     Attached example performs an endless series of
                     remote status against
                     the Subversion repository. When invoked with
                     -Xmx24M, the VM will run
                     out of memory soon. Monitoring with jvisualvm
                     shows that the used heap
                     size constantly grows. Monitoring with the Task
                     Manager shows that the
                     allocated memory grows even more (significantly).
                     Looks like a memory
                     leak, for which a large amount of native
memory is
                     involved, too.

                     Tested on Windows 8.1 with almost latest
                     Subversion 1.9 JavaHL builds.

                 I can confirm that this happens on the Mac, too, and
                 it's not a garbage
                 collector artefact. I'm trying to trace where the
leak
                 is happening ...
                 valgrind with APR pool debugging doesn't tell me much
                 (no surprise there).

             Just to make sure we weren't doing something bad in our
             libraries, I
             wrote a small C program that does the same as your Java
             example (Ev2
             shims included), and memory usage is completely
steady. So
             it is
             something in JavaHL, but I have no clue yet what the
             problem is.


         I have to say this was one of the more "interesting"
bug-hunts
         in my not
         entirely boring career, and that's not really obvious from
the fix
         itself. :)

         http://svn.apache.org/r1675771

         Marc: this will not be in RC1, but please give the patch a
         spin and let
         me know if it fixes your problem. I tested this with the Java
         program
         you attached to your original report, and heap size no longer
         grows
         without bounds.


     Great hunt, Brane! The native leak seems to be fixed. I've run my
     remote status loop with -Xmx24M and still get an OOME after ~170
     loop iterations. The memory leak is significantly smaller and
this
     time it seems to be in the Java part. According to the profiler,
     most memory is allocated by HashMap and friends, referenced from
     JNI code. Only two org.apache.subversion classes show up, but I
     guess they indicate the source of the leak:

     org.apache.subversion.javahl.types.Checksum (~10K instances)
     org.apache.subversion.javahl.types.NativeInputStream (~10K
instances)

     Let me know, if you more profiler statistics will be helpful.


So I've been looking at this in depth. At first I thought that one of
the problems was that we didn't release JNI local references; I added
code to make sure this happens in the status callbacks (not committed
yet) and I verified that all the native wrapped objects do get
finalized. However, the Java objects still hang around.

One of the problems is that all the callbacks happen within the
scope of
the ISVNReporter.finishReport call, which means that the whole edit
drive is considered a single JNI call (despite the callbacks to Java)
and the garbage collector can't reclaim space for the objects created
within JNI during that time. But even a forced GC after the report is
done and the remote session disposed won't release all the native
references. I'm a bit stumped here ... JVM's built-in memory profiler
shows the live references and where they're allocated, but doesn't show
why they're not released even when I explicitly create and destroy JNI
frames.

Can you please commit your current state somewhere or send me a patch?
I can give this one a try in JProfiler and see whether I can gather
some more useful information.

Here's the complete patch against 1.9.x. I do see some NativeInputStream
objects (but not all) being garbage-collected, but there are a number of
other objects, even those allocated in Java code in the callback, that
are just hanging around, even if I force a GC. Any help in tracking this
down will be greatly appreciated.

Thanks, Brane. I'm not sure whether I can confirm this: I took a couple
of heap dumps while running the remote status 1000 times in a loop and
my impression was that the amount of almost all objects was basically
constant, except for Checksum which was constantly rising. The only
reference to those Checksum objects are "JNI local references". The
final heap (after 1000 iterations) looks like this:

http://imgur.com/gLdnv84

As you can see, there are no "contents" InputStreams nor "properties"
(Hash)Maps referenced. AFAIU, I should see almost the same amount of
objects in all three cases?

Also, I should add that according to the Profiler, the byte[]s are referenced from the Checksums. The char[]s are referenced from the Strings. And the Strings are referenced directly as JNI local references. Browsing through these Strings, they seem to be server-side paths ("subversion/branches/1.8.x/...")

-Marc

Reply via email to