On Mon, 30 Jun 2025 23:27:49 GMT, David Beaumont <d...@openjdk.org> wrote:

> Refactoring `ImageReader` to make it easy to add preview mode functionality 
> for Valhalla.
> 
> This PR is a large change to `ImageReader` (effectively a rewrite) but 
> reduces the surface area of the API significantly, reduces code complexity 
> and increases performance/memory efficiency. The need for this sort of change 
> comes from the need to support "preview mode" in Valhalla for classes loaded 
> from core modules.
> 
> ### Preview Mode
> 
> In the proposed preview mode support for Valhalla, a resource with the name 
> `/modules/<module>/<path>` may come from one of two locations in the 
> underlying jimage file; `/<module>/<path>` or 
> `/<module>/META-INF/preview/<path>`. Furthermore, directories (represented as 
> directory nodes via the API) will have a potentially different number of 
> child nodes depending on whether preview mode is in effect, and some 
> directories may only exist at all in preview mode.
> 
> Furthermore, the directories and symbolic link nodes in `/packages/...` will 
> also be affected by the existence of new package directories. To retain 
> consistency and avoid "surprises" later, all of this needs to be taken into 
> account.
> 
> Below is a summary of the main changes for mainline JDK to better support 
> preview mode later:
> 
> ### 1: Improved encapsulation for preview mode
> 
> The current `ImageReader` code exposes the data from the jimage file in 
> several ways; via the `Node` abstraction, but also with methods which return 
> an `ImageLocation` instance directly. In preview mode it would not be 
> acceptable to return the `ImageLocation`, since that would leak the existence 
> of resources in the `META-INF/preview` namespace and lets users see resource 
> nodes with names that don't match the underlying `ImageLocation` from which 
> their contents come.
> 
> As such, the PR removes all methods which can leak `ImageLocation` or other 
> "underlying" semantics about the jimage file. Luckily most of these are 
> either used minimally and easily migrated to using nodes, or they were not 
> being used at all. This PR also removes any methods which were already unused 
> across the OpenJDK codebase (if I discover any issues with over-trimming of 
> the API during full CI testing, it will be easy to address).
> 
> ### 2. Simplification of directory child node handling
> 
> The current `ImageReader` code attempts to create parent directories "on 
> demand" for any child nodes it creates. This results in parent directories 
> having a non-empty but incomplete set of child nodes present. This is 
> referred to as the directory being "incomple...

Anything marked "TODO" is for discussion during the coming review. I don't 
intend to leave any of these in the code after, but they raise specific issues 
I'd like address.

With these pre-preview comments, I think this is finally ready for review.

make/test/BuildMicrobenchmark.gmk line 1:

> 1: #

Ignore this file, it's part of the PR to add the benchmark. I'll merge and sort 
everything out once that's in.

src/java.base/share/classes/jdk/internal/jimage/ImageReader.java line 1:

> 1: /*

I think that this class is best reviewed as if it were a new implementation, 
rather than trying to reason about the specific changes between the versions. 
Hopefully the comments will make the design clear and the reduced 
complexity/lines-of-code will help it be understood in a more stand-alone way.

src/java.base/share/classes/jdk/internal/jimage/ImageReader.java line 95:

> 93:     }
> 94: 
> 95:     // directory management interface

A large number of the methods below were either:
1. Unused or effectively no-ops.
2. Breaking encapsulation and leaking underlying types such as `ImageLocation` 
(not a big problem now, but prevents cleanly implementing the preview mode 
logic for Valhalla).

The new API has essentially 3 top level methods here:
1. findNode()
2. getResource(Node)
3. getResourceBuffer(Node)

Any navigation of the node hierarchy is done via `getChildNames()` and unlike 
now, no user can obtain a reference to an "incomplete" node.

src/java.base/share/classes/jdk/internal/jimage/ImageReader.java line 157:

> 155:     }
> 156: 
> 157:     /**

This method and the `getResourceBuffer()` method below are, I think, only 
called in one place and could, in theory, be factored out of this class 
altogether.

src/java.base/share/classes/jdk/internal/jimage/ImageReader.java line 255:

> 253:                 if (openers.isEmpty()) {
> 254:                     close();
> 255:                     // TODO (review note): Should this be synchronized 
> by "this" ??

I genuinely this this might be an existing issue.

src/java.base/share/classes/jdk/internal/jimage/ImageReader.java line 601:

> 599:         }
> 600: 
> 601:         /**

Most of these were either never called or no longer needed in the new 
implementation.

src/java.base/share/classes/jdk/internal/jrtfs/ExplodedImage.java line 122:

> 120: 
> 121:         @Override
> 122:         public List<Node> getChildren() {

ExplodedImage makes a custom subclass of Node to "fake" the ImageReader 
behaviour. While this will need to change for preview mode, for now it's enough 
to just modify the one affected API.

src/java.base/share/classes/jdk/internal/jrtfs/JrtFileAttributes.java line 52:

> 50:     @Override
> 51:     public FileTime creationTime() {
> 52:         return node.creationTime();

There's no benefit to having all these one-line getter methods duplicated in 
`ImageReader` since it already provides the file attributes object directly.

src/java.base/share/classes/jdk/internal/jrtfs/JrtFileSystem.java line 228:

> 226:         }
> 227:         if (filter == null) {
> 228:             return node.getChildren()

Same logic, just based on the child names instead of the nodes.

src/java.base/share/classes/jdk/internal/jrtfs/SystemImage.java line 61:

> 59:             // open a .jimage and build directory structure
> 60:             final ImageReader image = ImageReader.open(moduleImageFile);
> 61:             image.getRootDirectory();

This method no longer serves any purpose (it used to initialize the root 
directory entries, but it's not needed now.

src/java.base/share/classes/jdk/internal/module/SystemModuleFinders.java line 
224:

> 222:         ImageReader reader = SystemImage.reader();
> 223:         for (String mn : reader.getModuleNames()) {
> 224:             ImageLocation loc = reader.findLocation(mn, 
> "module-info.class");

It is unsafe to let users see the underlying `ImageLocation` for a resource in 
preview mode, so this code had to be migrated to using a `Node` based view of 
the data. This is probably the most complex change not part of `ImageReader` 
itself.

src/java.base/share/classes/jdk/internal/module/SystemModuleFinders.java line 
509:

> 507:                     if (node.isDirectory()) {
> 508:                         // build node
> 509:                         ImageReader.Node dir = 
> SystemImage.reader().findNode(name);

This is an example of the fragility of the current "partially complete 
directory" behaviour. The `node` variable is exactly the same instance as the 
`dir` variable, and the call to `findNode()` serves only to ensure the 
directory is "complete" before use. In the new API only the name is available, 
so `findNode()` *must* be called, ensuring there's no risk of accidentally 
using an incomplete directory node.

test/jdk/jdk/internal/jimage/ImageReaderTest.java line 78:

> 76: 
> 77:     @Test
> 78:     public void testModuleDirectories() throws IOException {

I could make these more data driven (i.e. parametrized) if people want.

test/jdk/jdk/internal/jimage/ImageReaderTest.java line 217:

> 215: 
> 216:     /// Loads and performs actions on classes stored in a given 
> `ImageReader`.
> 217:     private static class ImageClassLoader extends ClassLoader {

Loading and running class bytes to get a `toString()` value will be super 
important in the preview mode tests to prove that the correct class bytes were 
used.

test/jdk/jdk/internal/jimage/JImageReadTest.java line 28:

> 26:  * @summary Unit test for libjimage JIMAGE_Open/Read/Close
> 27:  * @modules java.base/jdk.internal.jimage
> 28:  * @run testng/othervm JImageReadTest

Looking back, I don't recall exactly why this was needed. I will see if it's 
still necessary, but I think it was.

test/jdk/tools/jimage/ImageReaderDuplicateChildNodesTest.java line 1:

> 1: /*

Just uses child names rather than nodes.

-------------

PR Review: https://git.openjdk.org/jdk/pull/26054#pullrequestreview-2973065956
PR Review: https://git.openjdk.org/jdk/pull/26054#pullrequestreview-2975530216
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177654333
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177702506
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2176142533
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177705221
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177707657
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177710965
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177658457
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177660750
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177664240
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177671307
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177676879
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177687414
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177691003
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177692842
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177694731
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177697691

Reply via email to