On Mon, Apr 15, 2024 at 6:01 PM Jiangli Zhou <jiangliz...@google.com> wrote:
>
> Magnus, thanks for the response. Please see comments inlined below.
>
> On Fri, Apr 12, 2024 at 4:52 AM Magnus Ihse Bursie
> <magnus.ihse.bur...@oracle.com> wrote:
> >
> > On 2024-04-02 21:16, Jiangli Zhou wrote:
> >
> > Hi Magnus,
> >
> > In today's zoom meeting with Alan, Ron, Liam and Chuck, we (briefly) 
> > discussed how to move forward contributing the static Java related changes 
> > (additional runtime fixes/enhancements on top of the existing static 
> > support in JDK) from 
> > https://github.com/openjdk/leyden/tree/hermetic-java-runtime to JDK 
> > mainline.
> >
> > Just a bit more details/context below, which may be useful for others 
> > reading this thread.
> >
> > The https://github.com/openjdk/leyden/tree/hermetic-java-runtime branch 
> > currently contains the following for supporting hermetic Java (without the 
> > launcher work for runtime support):
> >
> > 1. Build change for linking the Java launcher (as bin/javastatic) with 
> > JDK/hotspot static libraries (.a), mainly in 
> > https://github.com/openjdk/leyden/blob/hermetic-java-runtime/make/StaticLink.gmk.
> >  The part for creating the complete sets of static libraries (including 
> > libjvm.a) has already been included in the mainline since last year. 
> > https://github.com/openjdk/leyden/blob/hermetic-java-runtime/make/StaticLink.gmk
> >  is in a very raw state and is intended to demonstrate the capability of 
> > building a static Java launcher.
> >
> > Indeed. It is nowhere near being able to be integrated.
> >
>
> The main purpose of StaticLink.gmk is to support the static-java-image
> make target, which can be used to perform the actual static linking
> step using libjvm.a and JDK static libraries. That currently doesn't
> exist in the JDK mainline. Creating a "fully" statically linked Java
> launcher is the first step (out of many) towards supporting
> static/hermetic Java.
>
> As part of cleaning/refactoring/integrating for the static linking
> step, we want to agree and decide/accept on the following:
>
> - Support the "fully" statically linked java launcher for testing and
> demoing the capability of static JDK support, e.g.
>   - Support running jtreg testing using the "fully" statically linked
> Java launcher
>   - Set up tests in github workflow to help detect any breaking
> changes for static support, e.g. new symbol issues introduced by any
> changes. There were some earlier discussions on this with Ron and Alan
> during the zoom meetings.
> - Which JDK native libraries should be statically linked with the new
> launcher target? E.g. StaticLink.gmk currently excludes libjsound.a,
> libawt_xawt.a, etc. from being statically linked with the launcher.
> - Do we want more than one statically linked launcher target, based on
> the set of linked native libraries?
>
> Based on the decisions of the above, the launcher static linking part
> would mostly be in a different shape when it's integrated into the
> mainline. That's why I referred to StaticLink.gmk as in a "very raw"
> state.
>
> Here is a high-level view of the state of things for static support:
>
> (I)  What we already have in the JDK mainline:
> - Able to build a complete set of JDK/VM static libraries using
> `static-libs-image` make target (necessary for supporting static JDK)
> - Compilation of .o files is done separately for the static
> libraries and dynamic libraries (ok for now)
>
> (II) What's missing:
> - Static linking step as mentioned above
>
> (III) What needs to be improved (require cleanups and refactoring, and
> you mentioned some of those in your response as well):
> - Support building both the static libraries and dynamic libraries
> using the same set of .o files, instead of separately compiled .o
> files. That helps improve build speed and reduce memory overhead for
> building JDK. Your current refactoring work aims to help that.
> - Clean up the usages of STATIC_BUILD macro. Most of the usages are in
> test code.
> - Other runtime fixes/enhancements in the leyden
> https://github.com/openjdk/leyden/tree/hermetic-java-runtime branch
>
> I think most work mentioned in III has dependencies on II. We need a
> workable base to be able to build the "fully" statically linked
> launcher for building and testing the work mentioned in III, when
> integrating any of those to the JDK mainline. The makefile refactoring
> work can be done in parallel but does not need to be completed before
> we add the static linking step in JDK mainline.
>
> >
> > 2. Additional runtime fixes/enhancements on top of the existing static 
> > support in JDK, e.g. support further lookup dynamic native library if the 
> > built-in native library cannot be found.
> >
> > 3. Some initial (prototype) work on supporting hermetic JDK resource files 
> > in the jimage (JDK modules image).
> >
> > To move forward, one of the earliest items needed is to add the capability 
> > of building the fully statically linked Java launcher in JDK mainline. The 
> > other static Java runtime changes can be followed up after the launcher 
> > linking part, so they can be built and tested as individual PRs created for 
> > the JDK mainline. Magnus, you have expressed interest in helping get the 
> > launcher linking part (refactor from 
> > https://github.com/openjdk/leyden/blob/hermetic-java-runtime/make/StaticLink.gmk)
> >  into JDK mainline. What's your thought on prioritizing the launcher static 
> > linking part before other makefile clean ups for static libraries?
> >
> > Trust me, my absolute top priority now is working on getting the proper 
> > build support needed for Hermetic Java. I can't prioritize it any higher.
>
> Thanks!
>
> >
> > I am not sure what you are asking for. We can't just merge StaticLink.gmk 
> > from your prototype. And even if we did, what good will it do you?
>
> Please see my comments above.
>
> >
> > The problem you are running into is that the build system has not been 
> > designed to properly support static linking. There are already 3-4 hacks in 
> > place to get something sort-of useful out, but they are prone to breaking. 
> > I assume that we agree that for Hermetic Java to become a success, we need 
> > to have a stable foundation for static builds.
> >
> > The core problem of all static linking hacks is that they are not 
> > integrated in the right place. They need to be a core part of what 
> > NativeCompilation delivers, not something done in a separate file. To put 
> > it in other words, StaticLink.gmk from your branch does not need cleanup -- 
> > it needs to go away, and the functionality moved to the proper place.
> >
> > My approach is that NativeCompilation should support doing either only 
> > dynamic linking (as today), or static linking (as today with STATIC_LIBS or 
> > STATIC_BUILD), or both. The assumption is that the latter will be default, 
> > or at least should be tested by default in GHA. For this to work, we need 
> > to compile the source code to .o files only once, and then link these .o 
> > files either into a dynamic or a static library (or both).
>
> As of today, the leyden
> https://github.com/openjdk/leyden/tree/hermetic-java-runtime branch
> can build a "fully" statically linked Java launcher. The issue of
> compiling the dynamic and static libraries .o files separately is not
> a blocker. It's good to have it resolved at some point of time.
>
> >
> > This, in turn, requires several changes:
> >
> > 1) The linking code needs to be cleaned up, and all technical debt needs to 
> > be resolved. This is what I have been doing since I started working on 
> > static builds for Hermetic Java. JDK-8329704 (which was integrated 
> > yesterday) was the first major milestone of this cleanup. Now, the path 
> > where to find a library created by the JDK (static or dynamic) is 
> > encapsulated in ResolveLibPath. This is currently a monster, but at least 
> > all knowledge is collected in a single location, instead of spread over the 
> > code base. Getting this simplified is the next step.
> >
> > 2) We need to stop passing the STATIC_BUILD define when compiling. This is 
> > partially addressed in your PR, where you have replaced #ifdef STATIC_BUILD 
> > with a dynamic lookup. But there is also the problem of JNI/JVMTI entry 
> > points. I have been pondering how we can compile the code in a way so we 
> > support both dynamic and static name resolution, and I think I have a 
> > solution.
> >
> > This is unfortunately quite complex, and I have started a discussion with 
> > Alan if it is possible to update the JNI spec so that both static and 
> > dynamic entry points can have the form "JNI_OnLoad_<library-name>". 
> > Ideally, I'd like to see us push for this with as much effort as possible. 
> > If we got this in place, static builds would be much easier, and the 
> > changes required for Hermetic Java even smaller.
>
> Thumbs up! That seems to be a good direction. Currently in the leyden
> branch, it first looks up the unique
> JNI_OnLoad<_lib_name>|Agent_OnLoad<_lib_name> etc for built-in
> libraries, then searches for the dynamic libraries using the
> conventional naming when necessary. e.g.:
>
> https://github.com/openjdk/leyden/commit/a5c886d2e85a0ff0c3712a5488ae61d8c9d7ba1a
> https://github.com/openjdk/leyden/commit/1da8e3240e0bd27366d19f2e7dde386e46015135
>
> When the spec supports JNI_OnLoad_<library-name> etc. for dynamic
> libraries, we may still need to support the conventional naming
> without the <_lib_name> part for existing libraries out there.

Resuming the conversation on using
JNI_OnLoad_L|JNI_OnUnload_L|Agent_OnLoad_L|Agent_OnUnload_L|Agent_OnAttach_L
for dynamically linked JNI & agent libraries. It is related to
JDK-8350450 [1]: Compile object files once for both static and dynamic
builds. We have recently enabled building & tier1 testing for
static-jdk with release binaries in GHA on linux-x64. The debug build,
however, cannot be enabled in GHA due to space/resource limits (please
see more details in JDK-8350450). So it's a good time to pick up
JDK-8350450 related work. Based on discussions with Magnus in
JDK-8350450 bug comments and separate emails from last year, I'll
extract the runtime changes from the leyden/hermetic-java-runtime
branch [2] for supporting JNI_OnLoad_L (etc.) for dynamically
linked JNI/agent libraries. The work has been broadly tested in our
internal prototype on JDK 11 and newer versions (linux-x64).
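
As a rough illustration of the lookup order described earlier in this
thread (try the library-specific name first, then fall back to the
conventional name), here is a minimal C sketch. It is not the code from
the leyden branch. The names lookup_fn, sym_entry, table_lookup and
find_entry_point are hypothetical, and the small symbol table only
stands in for a real lookup such as dlsym() so the sketch can run on
its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical lookup callback, standing in for a dlsym()-style call. */
typedef void *(*lookup_fn)(void *handle, const char *name);

/* Illustration-only symbol table so the sketch runs without a real library. */
struct sym_entry {
    const char *name;
    void *addr;
};

/* lookup_fn implementation over a table terminated by a NULL name. */
static void *table_lookup(void *handle, const char *name) {
    const struct sym_entry *e;
    for (e = (const struct sym_entry *)handle; e->name != NULL; e++) {
        if (strcmp(e->name, name) == 0) {
            return e->addr;
        }
    }
    return NULL;
}

/*
 * Resolve an entry point such as JNI_OnLoad or Agent_OnLoad for library `lib`.
 * Tries "<base>_<lib>" first, then the plain "<base>" name.
 * Sets *used_suffix to 1 when the suffixed name matched, 0 otherwise.
 */
static void *find_entry_point(lookup_fn lookup, void *handle,
                              const char *base, const char *lib,
                              int *used_suffix) {
    char name[256];
    void *addr;
    int n;

    assert(lookup != NULL && base != NULL && lib != NULL && used_suffix != NULL);

    *used_suffix = 0;
    n = snprintf(name, sizeof(name), "%s_%s", base, lib);
    if (n > 0 && (size_t)n < sizeof(name)) {
        addr = lookup(handle, name);
        if (addr != NULL) {
            *used_suffix = 1;
            return addr;
        }
    }
    /* Fall back to the conventional, unsuffixed name. */
    return lookup(handle, base);
}
```

In a real runtime the callback would presumably wrap the platform's
symbol lookup, and the same helper could serve JNI_OnUnload,
Agent_OnLoad, Agent_OnUnload and Agent_OnAttach by changing `base`.
The fallback keeps existing libraries that only export the unsuffixed
names working.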

Regarding the spec part, Ron, Alan, Magnus and I had several
discussions last year during hermetic Java meetings. The general
understanding was that using JNI_OnLoad_L (etc.) for dynamically
linked JDK native libraries requires no JNI/JVMTI spec change. I
wonder if the following language should be relaxed a bit to address
potential questions. Any thoughts?

From the JNI spec [3]:

- JNI_OnLoad/JNI_OnUnload
  Optional function defined by dynamically linked libraries.

  LINKAGE:
  Exported from dynamically linked native libraries that contain
native method implementations.

- JNI_OnLoad_L
  Mandatory function that must be defined by statically linked libraries.

  LINKAGE:
  Exported from statically linked native libraries that contain native
method implementations.

- JNI_OnUnload_L
  Optional function defined by statically linked libraries.

From the JVMTI spec [4]:

An agent L whose image has been combined with the VM is defined as
statically linked if and only if the agent exports a function called
Agent_OnLoad_L.

[1]: https://bugs.openjdk.org/browse/JDK-8350450
[2]: https://github.com/openjdk/leyden/tree/hermetic-java-runtime
[3]: https://docs.oracle.com/en/java/javase/21/docs/specs/jni/
[4]: https://docs.oracle.com/en/java/javase/24/docs/specs/jvmti.html

Best,
Jiangli

>
> >
> > And finally, on top of all of this, is the question of widening the 
> > platform support. To support linux/gcc with objcopy is trivial, but the 
> > question about Windows still remains. I have two possible ways forward, one 
> > is to check if there is alternative tooling to use (the prime candidate is 
> > the clang-ldd), and the other is to try to "fake" a partial linking by 
> > concatenating all source code before compiling. This is not ideal, though, 
> > for many reasons, and I am not keen on implementing it, not even for 
> > testing. And at this point, I have not had time to investigate any of these 
> > options much further, since I have been focusing on 1) above.
> >
> > A third option is of course to just say that due to toolchain limitations, 
> > static linking is not available on Windows.
>
> Thank you for taking this on! Potentially we could consider using
> objcopy to localize hotspot symbols on unix-like platforms, based on
> https://github.com/openjdk/jdk/pull/17456 discussions. Additional
> testing is still needed to verify the solution.
>
> >
> > My recommendation is that you keep on working to resolve the (much more 
> > thorny) issues of resource access in Hermetic Java in your branch, where 
> > you have a prototype static build that works for you. In the meantime, I 
> > will make sure that there will be a functioning, stable and robust way of 
> > creating static builds in the mainline, that can be regularly tested and 
> > not bit-rot, like the static build hacks that have gone in before.
>
> Most of the JDK resources are now supported as hermetic jimage
> (lib/modules) bundled in the
> https://github.com/openjdk/leyden/tree/hermetic-java-runtime branch.
> The remaining sound.properties, ct.sym and .jfc files can be handled
> later. Overall, that part of the work has confirmed the hermetic
> jimage bundled solution is robust and helps resolve some of the
> difficult start-up sequence issues observed when the hermetic resource
> was implemented using JAR file based solution.
>
> It might be a good idea to follow up on the static linking discussion
> in tomorrow's zoom meeting (hope you'll be able to join tomorrow).
>
> Thanks!
>
> Jiangli
> >
> > /Magnus
> >
> >
> >
> > Thanks!
> > Jiangli
> >
> > On Thu, Feb 15, 2024 at 12:01 PM Jiangli Zhou <jiangliz...@google.com> 
> > wrote:
> >>
> >> On Wed, Feb 14, 2024 at 5:07 PM Jiangli Zhou <jiangliz...@google.com> 
> >> wrote:
> >> >
> >> > Hi Magnus,
> >> >
> >> > Thanks for looking into this from the build perspective.
> >> >
> >> > On Wed, Feb 14, 2024 at 1:00 AM Magnus Ihse Bursie
> >> > <magnus.ihse.bur...@oracle.com> wrote:
> >> > >
> >> > > First some background for build-dev: I have spent some time looking at
> >> > > the build implications of the Hermetic Java effort, which is part of
> >> > > Project Leyden. A high-level overview is available here:
> >> > > https://cr.openjdk.org/~jiangli/hermetic_java.pdf and the current source
> >> > > code is here:
> >> > > https://github.com/openjdk/leyden/tree/hermetic-java-runtime.
> >> >
> >> > Some additional hermetic Java related references that are also useful:
> >> >
> >> > - https://bugs.openjdk.org/browse/JDK-8303796 is an umbrella bug that
> >> > links to the issues for resolving static linking issues so far
> >> > - https://github.com/openjdk/jdk21/pull/26 is the enhancement for
> >> > building the complete set of static libraries in JDK/VM, particularly
> >> > including libjvm.a
> >> >
> >> > >
> >> > > Hermetic Java faces several challenges, but the part that is relevant
> >> > > for the build system is the ability to create static libraries. We've
> >> > > had this functionality (in three different ways...) for some time, but
> >> > > it is rather badly implemented.
> >> > >
> >> > > As a result of my investigations, I have a bunch of questions. :-) I
> >> > > have gotten some answers in private discussion, but for the sake of
> >> > > transparency I will repeat them here, to foster an open dialogue.
> >> > >
> >> > > 1. Am I correct in understanding that the ultimate goal of this exercise
> >> > > is to be able to have jmods which include static libraries (*.a) of the
> >> > > native code which the module uses, and that the user can then run a
> >> > > special jlink command to have this linked into a single executable
> >> > > binary (which also bundles the *.class files and any additional
> >> > > resources needed)?
> >> > >
> >> > > 2. If so, is the idea to create special kinds of static jmods, like
> >> > > java.base-static.jmod, that contain *.a files instead of lib*.so files?
> >> > > Or is the idea that the normal jmod should contain both?
> >> > >
> >> > > 3. Linking .o and .a files into an executable is a formidable task. Is
> >> > > the intention to have jlink call a system-provided ld, or to bundle ld
> >> > > with jlink, or to reimplement this functionality in Java?
> >> >
> >> > I have a similar view as Alan responded in your other email thread.
> >> > Things are still in the early stage for the general solution.
> >> >
> >> > In the https://github.com/openjdk/leyden/tree/hermetic-java-runtime
> >> > branch, when configuring JDK with --with-static-java=yes, the JDK
> >> > binary contains the following extra artifacts:
> >> >
> >> > - static-libs/*.a: The complete set of JDK/VM static libraries
> >> > - jdk/bin/javastatic: A demo Java launcher fully statically linked
> >> > with the selected JDK .a libraries (e.g. it currently statically links
> >> > with the headless) and libjvm.a. It's the standard Java launcher
> >> > without additional work for hermetic Java.
> >> >
> >> > In our prototype for hermetic Java, we build the hermetic executable
> >> > image (a single image) from the following input (see description on
> >> > singlejar packaging tool in
> >> > https://cr.openjdk.org/~jiangli/hermetic_java.pdf):
> >> >
> >> > - A customized launcher (with additional work for hermetic) executable
> >> > fully statically linked with JDK/VM static libraries (.a files),
> >> > application natives and dependencies (e.g. in .a static libraries)
> >> > - JDK lib/modules, JDK resource files
> >> > - Application classes and resource files
> >> >
> >> > Including a JDK library .a into the corresponding .jmod would require
> >> > extracting the .a for linking with the executable. In some systems
> >> > that may cause memory overhead due to the extracted copy of the .a
> >> > files. I think we should consider the memory overhead issue.
> >> >
> >> > One possibility (as Alan described in his response) is for jlink to
> >> > invoke the ld on the build system. jlink could pass the needed JDK
> >> > static libraries and libjvm.a (provided as part of the JDK binary) to
> >> > ld based on the modules required for the application.
> >> >
> >>
> >> I gave this a bit more thought. For jlink to trigger ld, it
> >> would need to know the complete linker options and inputs. Those
> >> include options and inputs related to the application part as well. In
> >> some usages, it might be easier to handle native linking separately
> >> and pass the linker output, the executable to jlink directly. Maybe we
> >> could consider supporting different modes for various usage
> >> requirements, from the static libraries and native linking point of view:
> >>
> >> Mode #1
> >> Support .jmod packaged native static libraries, for both JDK/VM .a
> >> and application natives and dependencies. If the inputs to jlink
> >> include .jmods, jlink can extract the .a libraries and pass the
> >> information to ld to link the executable.
> >>
> >> Mode #2
> >> Support separate .a as jlink input. Jlink could pass the path
> >> information to the .a libraries and other linker options to ld to
> >> create the executable.
> >>
> >> For both mode #1 and #2, jlink would then use the linker output
> >> executable to create the final hermetic image.
> >>
> >> Mode #3
> >> Support a fully linked executable as a jlink input. When a linked
> >> executable is given to jlink, it can process it directly with other
> >> JDK data/files to create the final image, without native linking step.
> >>
> >> Any other thoughts and considerations?
> >>
> >> Best,
> >> Jiangli
> >>
> >> > >
> >> > > 4. Is the intention to allow users to create their own jmods with
> >> > > static libraries, and have these linked in as well? This seems to be the
> >> > > case.
> >> >
> >> > An alternative with less memory overhead could be using application
> >> > modular JAR and separate .a as the input for jlink.
> >> >
> >> > > If that is so, then there will always be the risk for name
> >> > > collisions, and we can only minimize the risk by making sure any global
> >> > > names are as unique as possible.
> >> >
> >> > Part of the current effort includes resolving the discovered symbol
> >> > collision issues with static linking. Will respond to your other email
> >> > on the symbol issue separately later.
> >> >
> >> > >
> >> > > 5. The original implementation of static builds in the JDK, created for
> >> > > the Mobile project, used a configure flag, --enable-static-builds, to
> >> > > change the entire behavior of the build system to only produce *.a files
> >> > > instead of lib*.so. In contrast, the current system is using a special
> >> > > target instead.
> >> >
> >> > I think we would need both configure flag and special target for the
> >> > static builds.
> >> >
> >> > > In my eyes, this is a much worse solution. Apart from
> >> > > the conceptual principle (if the build should generate static or dynamic
> >> > > libraries is definitely a property of what a "configuration" means),
> >> > > this makes it much harder to implement efficiently, since we cannot make
> >> > > changes in NativeCompilation.gmk, where they are needed.
> >> >
> >> > For the potential objcopy work to resolve symbol issues, we can add
> >> > that conditionally in NativeCompilation.gmk if STATIC_LIBS is true. We
> >> > have an internal prototype (not included in
> >> > https://github.com/openjdk/leyden/tree/hermetic-java-runtime yet) done
> >> > by one of our colleagues for localizing symbols in libfreetype using
> >> > objcopy.
> >> >
> >> > >
> >> > > That was not as much a question as a statement. 🙂 But here is the
> >> > > question: Do you think it would be reasonable to restore the old
> >> > > behavior but with the new methods, so that we don't use special targets,
> >> > > but instead tell configure to generate static libraries? I'm thinking
> >> > > we should have a flag like "--with-library-type=" that can have values
> >> > > "dynamic" (which is default), "static" or "both".
> >> >
> >> > If we want to also build a fully statically linked launcher, maybe
> >> > --with-static-java? Being able to configure either dynamic, static or
> >> > both as you suggested also seems to be a good idea.
> >> >
> >> > > I am not sure if "both" is needed, but if we want to bundle both lib*.so
> >> > > and *.a files into a single jmod file (see question 2 above), then it
> >> > > definitely is. In general, the cost of producing two kinds of libraries is
> >> > > quite small, compared to the cost of compiling the source code to object
> >> > > files.
> >> >
> >> > Completely agree. It would be good to avoid recompiling the .o file
> >> > for static and dynamic builds. As proposed in
> >> > https://bugs.openjdk.org/browse/JDK-8303796:
> >> >
> >> > It's beneficial to be able to build both .so and .a from the same set
> >> > of .o files. That would involve some changes to handle the dynamic JDK
> >> > and static JDK difference at runtime, instead of relying on the
> >> > STATIC_BUILD macro.
> >> >
> >> > >
> >> > > Finally, I have looked at how to manipulate symbol visibility. There
> >> > > seem to be many ways forward, so I feel confident that we can find a good
> >> > > solution.
> >> > >
> >> > > One way forward is to use objcopy to manipulate symbol status
> >> > > (global/local). There is an option --localize-symbol in objcopy, that
> >> > > has been available in objcopy since at least 2.15, which was released
> >> > > 2004, so it should be safe to use. But ideally we should avoid using
> >> > > objcopy and do this as part of the linking process. This should be
> >> > > possible to do, given that we make changes in NativeCompilation.gmk --
> >> > > see question 5 above.
> >> > >
> >> > > As a fallback, it is also possible to rename symbols, either piecewise
> >> > > or wholesale, using objcopy. There are many ways to do this, using
> >> > > --prefix-symbols, --redefine-sym or --redefine-syms (note the -s, this
> >> > > takes a file with a list of symbols). Thus we can always introduce a
> >> > > "post factum namespace" by renaming symbols.
> >> >
> >> > Renaming or redefining the symbol at build time could cause confusion
> >> > with debugging. That's a concern raised in
> >> > https://github.com/openjdk/jdk/pull/17456 discussions.
> >> >
> >> > Additionally, redefining symbols using tools like objcopy may not
> >> > handle member names referenced in string literals. For example, in
> >> > https://github.com/openjdk/jdk/pull/17456 additional changes are
> >> > needed in assembling and SA to reflect the symbol change.
> >> >
> >> > >
> >> > > So in the end, I think it will be fully possible to produce .a files
> >> > > that only have global symbols for the functions that are part of the API
> >> > > exposed by that library, and have all other symbols local, and do this
> >> > > in a way that is consistent with the rest of the build system.
> >> > >
> >> > > Finally, a note on Hotspot. Due to debugging reasons, we export
> >> > > basically all symbols in hotspot as global. This is not reasonable to do
> >> > > for a static build. The effect of not exporting those symbols will be
> >> > > that SA will not function to 100%. On the other hand, I have no idea if
> >> > > SA works at all with a static build. Have you tested this? Is this part
> >> > > of the plan to support, or will it be officially dropped for Hermetic Java?
> >> >
> >> > We have done some testing with jtreg SA related tests for the fully
> >> > statically linked `javastatic`.
> >> >
> >> > If we use objcopy to localize symbols in hotspot, it's not yet clear
> >> > what's the impact on SA. We could do some tests. The other question
> >> > that I raised is the supported gcc versions (for partial linking)
> >> > related to the solution.
> >> >
> >> > Best,
> >> > Jiangli
> >> >
> >> > >
> >> > > /Magnus
> >> > >
