Thank you for the positive encouragement, Roman :-)

Cheers, Thomas

On Mon, Dec 5, 2022 at 12:03 PM Kennke, Roman <rken...@amazon.de> wrote:

> Hi Thomas,
>
> I very much like the idea and also your proposals how to do it. Insight
> into the JDK's native memory usage is sorely lacking and would be very
> useful! I don't have all that much to add about the details beyond what
> you already covered, though :-)
>
> Cheers,
> Roman
>
> > Are there any opinions about whether or not to extend NMT across the JDK?
> >
> > This blocks https://bugs.openjdk.org/browse/JDK-8296360, and I had a PR
> > prepared as https://github.com/openjdk/jdk/pull/10988. Originally I was
> > hoping to get this into JDK 20, but I don't think that is realistic
> > anymore. I am fine with postponing my work in favor of a baseline
> > discussion, but so far there is very little discussion about this topic.
> >
> > How should I proceed?
> >
> > Thanks, Thomas
> >
> > On Wed, Nov 9, 2022 at 8:12 AM Thomas Stüfe <thomas.stu...@gmail.com> wrote:
> >
> >     Hi Alan,
> >
> >     (replaced hotspot-runtime-dev with hotspot-dev, since it's more of a
> >     general topic)
> >
> >     thank you for your time!
> >
> >     I am very happy to talk this through. I think native memory
> >     observability in the JDK (and customer code!) is sorely lacking.
> >     Witness the countless "where did my native memory go" blog articles.
> >     At SAP we have been struggling with this topic for a long time and
> >     have come up with a mixture of solutions. The aforementioned tracker
> >     was one, which extended our version of NMT across the JDK. Our
> >     SapMachine MallocTracer, which allows us to trace uninstrumented
> >     customer code, another. We even experimented with exchanging the
> >     allocator (using jemalloc) to gain insights. But that is a whole
> >     different topic with deep logistical implications, I don't want to
> >     touch it here.
> >     Exchanging the allocator does not help to observe
> >     virtual memory or the brk segment, of course.
> >
> >     And to make the picture complete, another insight we currently lack
> >     is the implicit allocator overhead, which can be very significant
> >     and is hidden by the libc. We also have observability for that in
> >     the SapMachine, and I miss it in OpenJDK.
> >
> >     As you noticed, my original intent was just to instrument Zlib and
> >     possibly improve tracking for DBBs. Although, thinking beyond that,
> >     another attractive instrumentation target would be mapped NIO
> >     buffers at least.
> >
> >     So I think native memory observability is important. Arguably we
> >     could even extend observability to cover other OS resources, e.g.
> >     file handles. If we shift code around, to Java/Panama: data that
> >     moves to the Java heap does not need to be tracked, but other memory
> >     will always come from one of the basic system APIs, regardless of
> >     who allocates it and where in the stack allocation happens. Be it
> >     native JDK code, Panama, or even customer JNI code.
> >
> >     If we agree on the importance of native memory observability, then I
> >     believe NMT is the right tool for it. It is a good tool. The
> >     machinery is already there. It covers both C-heap and virtual memory
> >     APIs, as well as thread stacks, and could easily be extended to
> >     cover sbrk if needed. And I assume that whatever shape OpenJDK takes
> >     on in the future, there always will be a libjvm.so at its core, so
> >     we will always have it. But even if not, NMT could be separated from
> >     libjvm.so quite easily, since it has no deep ties with the JVM.
> >
> >     About coupling JVM with outside code: We don't have to directly link
> >     against libjvm.so. We can keep things loose if the intent is to be
> >     runnable without a JVM, or be JVM-version-agnostic. That could take
> >     the form of a function-pointer interface like JVMTI. Or outside code
> >     could dynamically dlsym the JVM allocation hooks.
> >     In any case, this would fall back gracefully to system allocation
> >     routines when necessary.
> >
> >     And I agree, polluting the NMT tag space with outside meaning is
> >     ugly. I only did it because I planned to go no further than
> >     instrumenting Zlib and possibly DBBs. But if we take this further,
> >     my preferred solution would be a reserved tag range or ranges for
> >     outside use, whose inner meaning would be opaque to the JVM. Kind of
> >     like SIGRTMIN+SIGRTMAX. Then, outside code could register tags and
> >     their meta information with the JVM, or we find a different way to
> >     convey the tag meaning to NMT (config files, or callbacks). That
> >     could even be opened up for customer use.
> >
> >     This also touches on another question, that of NMT tag space. NMT
> >     tags are very useful since they allow cheap tracking without
> >     capturing call stacks. However, tags are underused and show growing
> >     pains since they are too one-dimensional and restrictive. We had
> >     competing interests in the past about tag granularity. It is all
> >     over the place. We have coarse-grained tags like "mtThread", and
> >     very fine-grained ones like "mtObjectMonitor". There are several
> >     ways we could improve, e.g., by making them combinable like UL does,
> >     or allowing for a hierarchy of them - either a hard-wired limited
> >     one like "domain"+"tag", or an unlimited tree-like one. Technically
> >     interesting since whatever the new encoding is, they still must fit
> >     into a malloc header. I opened
> >     https://bugs.openjdk.org/browse/JDK-8281819 to track ideas like
> >     these.
> >
> >     Instrumenting Panama allocations, including the ability to tag
> >     allocations, would be a very good idea. For instance, if we ever
> >     remove the native Zlib layer and convert it to Java using Panama, we
> >     can do with Panama the same thing I do now natively - use the Zlib
> >     zalloc interface to hook in JVM memory allocation functions.
> >     The result could be completely identical, and the end user looking
> >     at the NMT output need never know that anything changed.
> >
> >     And that goes for all instrumentation - if today we add it to JNI
> >     code, and that code gets removed tomorrow, we can add it to Panama
> >     code too. Unless data structures move to the heap, in which case
> >     there is no need to track them.
> >
> >     You mentioned that NMT was more of an in-house support tool. Our
> >     experience is different. Even though it was positioned as a tool for
> >     JVM developers, and we never cared about backward compatibility or
> >     consistency, it gets used a *lot* by our customers. We have to
> >     explain its output frequently. Also, many blog articles exist
> >     documenting its use. So, maybe it would be okay to elevate it to a
> >     user-facing tool since it seems to occupy that role anyway. We may
> >     also open up consumption of NMT results via Java APIs, or expose its
> >     results via MXBeans.
> >
> >     If this is to be a JEP, okay, but I'm afraid it would stall things a
> >     bit. I am interested in getting a simpler and quicker solution for
> >     older support releases at least, possibly based on my PR. I know
> >     that would be unconventional though.
> >
> >     Thank you,
> >
> >     Thomas
> >
> >     On Sun, Nov 6, 2022 at 9:31 AM Alan Bateman <alan.bate...@oracle.com> wrote:
> >
> >         On 04/11/2022 16:54, Thomas Stüfe wrote:
> >         > Hi all,
> >         >
> >         > I am currently working on
> >         > https://bugs.openjdk.org/browse/JDK-8296360;
> >         > I was preparing the final PR [1], but then Alan asked me to
> >         > discuss this on core-libs first.
> >         >
> >         > Backstory:
> >         >
> >         > NMT tracks hotspot native allocations but does not cover the
> >         > JDK libraries (small exception: Unsafe.AllocateMemory).
> >         > However, the native memory footprint of JDK libraries can be
> >         > significant.
> >         > We have no in-VM tracker for these and need tools like
> >         > valgrind or our SapMachine MallocTracer [2] to observe them.
> >
> >         Thanks for starting a discussion on this as this is a topic that
> >         requires agreement from several areas. If this is the start of
> >         something bigger, where you want to have all allocation sites in
> >         the libraries using NMT, then I think it needs a write-up, maybe
> >         a JEP.
> >
> >         For starters, I think it needs some agreement on using NMT for
> >         memory allocated outside of libjvm. You mentioned Unsafe as an
> >         exception but that is implemented in the VM so you get tracking
> >         for free, albeit I think all allocations are in the "mtOther"
> >         category.
> >
> >         A general concern is that it creates more coupling between the
> >         VM code and the libraries code. As you probably know, we've
> >         removed most of the dependences on JVM_* functions from non-core
> >         areas over many years. So I think that needs consideration as I
> >         assume we don't want memory/allocation.hpp declaring a dozen
> >         categories for allocations done in, say, the java.desktop module.
> >         Maybe your proposal will be strictly limited to java.base but
> >         even then, do we really want the VM even knowing about
> >         categories that are specific to zip compression or
> >         decompression?
> >
> >         There are probably longer term trends that should be part of the
> >         discussion too. One general trend is that "run time" is becoming
> >         more and more a hybrid of code in libjvm and the Java libraries.
> >         Lambdas, the module system, and virtual threads implementations
> >         are a few examples in the last few releases. This comes with
> >         many "Java on Java" challenges, including serviceability where
> >         users of the platform will expect tools to just work and won't
> >         care where the code is.
> >         NMT is probably more for support teams and not something that
> >         most developers will ever use but I think it is part of the
> >         challenge of having serviceability solutions "just work".
> >
> >         In addition to having more of the Java runtime written in Java,
> >         there will likely be less JNI code in the future. It's very
> >         possible that the JNI code (including the JNI methods in libzip)
> >         will be replaced with code that uses Panama memory and linker
> >         APIs once they become permanent. The effect of that would be to
> >         have a lot of the memory allocations be tracked in the mtOther
> >         category again. Maybe integration with memory tracking should be
> >         looked at in conjunction with these APIs and this migration. I
> >         could imagine the proposed "Arena" API (MemorySession in Java
> >         19) having some integration with NMT and it might be interesting
> >         to look into that.
> >
> >         So yes, this topic does need broader discussion and it might be
> >         a bit premature to start with a PR for libzip without talking
> >         about the bigger picture first.
> >
> >         -Alan
>
> Amazon Development Center Germany GmbH
> Krausenstr. 38
> 10117 Berlin
> Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
> Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
> Sitz: Berlin
> Ust-ID: DE 289 237 879