On Mon, 30 Jun 2025 23:27:49 GMT, David Beaumont <d...@openjdk.org> wrote:
> Refactoring `ImageReader` to make it easy to add preview mode functionality > for Valhalla. > > This PR is a large change to `ImageReader` (effectively a rewrite) but > reduces the surface area of the API significantly, reduces code complexity > and increases performance/memory efficiency. The need for this sort of change > comes from the need to support "preview mode" in Valhalla for classes loaded > from core modules. > > ### Preview Mode > > In the proposed preview mode support for Valhalla, a resource with the name > `/modules/<module>/<path>` may come from one of two locations in the > underlying jimage file; `/<module>/<path>` or > `/<module>/META-INF/preview/<path>`. Furthermore, directories (represented as > directory nodes via the API) will have a potentially different number of > child nodes depending on whether preview mode is in effect, and some > directories may only exist at all in preview mode. > > Furthermore, the directories and symbolic link nodes in `/packages/...` will > also be affected by the existence of new package directories. To retain > consistency and avoid "surprises" later, all of this needs to be taken into > account. > > Below is a summary of the main changes for mainline JDK to better support > preview mode later: > > ### 1: Improved encapsulation for preview mode > > The current `ImageReader` code exposes the data from the jimage file in > several ways; via the `Node` abstraction, but also with methods which return > an `ImageLocation` instance directly. In preview mode it would not be > acceptable to return the `ImageLocation`, since that would leak the existence > of resources in the `META-INF/preview` namespace and lets users see resource > nodes with names that don't match the underlying `ImageLocation` from which > their contents come. > > As such, the PR removes all methods which can leak `ImageLocation` or other > "underlying" semantics about the jimage file. Luckily most of these are > either used minimally and easily migrated to using nodes, or they were not > being used at all. This PR also removes any methods which were already unused > across the OpenJDK codebase (if I discover any issues with over-trimming of > the API during full CI testing, it will be easy to address). > > ### 2. Simplification of directory child node handling > > The current `ImageReader` code attempts to create parent directories "on > demand" for any child nodes it creates. This results in parent directories > having a non-empty but incomplete set of child nodes present. This is > referred to as the directory being "incomple... Anything marked "TODO" is for discussion during the coming review. I don't intend to leave any of these in the code after, but they raise specific issues I'd like address. With these pre-preview comments, I think this is finally ready for review. make/test/BuildMicrobenchmark.gmk line 1: > 1: # Ignore this file, it's part of the PR to add the benchmark. I'll merge and sort everything out once that's in. src/java.base/share/classes/jdk/internal/jimage/ImageReader.java line 1: > 1: /* I think that this class is best reviewed as if it were a new implementation, rather than trying to reason about the specific changes between the versions. Hopefully the comments will make the design clear and the reduced complexity/lines-of-code will help it be understood in a more stand-alone way. src/java.base/share/classes/jdk/internal/jimage/ImageReader.java line 95: > 93: } > 94: > 95: // directory management interface A large number of the methods below were either: 1. Unused or effectively no-ops. 2. Breaking encapsulation and leaking underlying types such as `ImageLocation` (not a big problem now, but prevents cleanly implementing the preview mode logic for Valhalla). The new API has essentially 3 top level methods here: 1. findNode() 2. getResource(Node) 3. getResourceBuffer(Node) Any navigation of the node hierarchy is done via `getChildNames()` and unlike now, no user can obtain a reference to an "incomplete" node. src/java.base/share/classes/jdk/internal/jimage/ImageReader.java line 157: > 155: } > 156: > 157: /** This method and the `getResourceBuffer()` method below are, I think, only called in one place and could, in theory, be factored out of this class altogether. src/java.base/share/classes/jdk/internal/jimage/ImageReader.java line 255: > 253: if (openers.isEmpty()) { > 254: close(); > 255: // TODO (review note): Should this be synchronized > by "this" ?? I genuinely this this might be an existing issue. src/java.base/share/classes/jdk/internal/jimage/ImageReader.java line 601: > 599: } > 600: > 601: /** Most of these were either never called or no longer needed in the new implementation. src/java.base/share/classes/jdk/internal/jrtfs/ExplodedImage.java line 122: > 120: > 121: @Override > 122: public List<Node> getChildren() { ExplodedImage makes a custom subclass of Node to "fake" the ImageReader behaviour. While this will need to change for preview mode, for now it's enough to just modify the one affected API. src/java.base/share/classes/jdk/internal/jrtfs/JrtFileAttributes.java line 52: > 50: @Override > 51: public FileTime creationTime() { > 52: return node.creationTime(); There's no benefit to having all these one-line getter methods duplicated in `ImageReader` since it already provides the file attributes object directly. src/java.base/share/classes/jdk/internal/jrtfs/JrtFileSystem.java line 228: > 226: } > 227: if (filter == null) { > 228: return node.getChildren() Same logic, just based on the child names instead of the nodes. src/java.base/share/classes/jdk/internal/jrtfs/SystemImage.java line 61: > 59: // open a .jimage and build directory structure > 60: final ImageReader image = ImageReader.open(moduleImageFile); > 61: image.getRootDirectory(); This method no longer serves any purpose (it used to initialize the root directory entries, but it's not needed now. src/java.base/share/classes/jdk/internal/module/SystemModuleFinders.java line 224: > 222: ImageReader reader = SystemImage.reader(); > 223: for (String mn : reader.getModuleNames()) { > 224: ImageLocation loc = reader.findLocation(mn, > "module-info.class"); It is unsafe to let users see the underlying `ImageLocation` for a resource in preview mode, so this code had to be migrated to using a `Node` based view of the data. This is probably the most complex change not part of `ImageReader` itself. src/java.base/share/classes/jdk/internal/module/SystemModuleFinders.java line 509: > 507: if (node.isDirectory()) { > 508: // build node > 509: ImageReader.Node dir = > SystemImage.reader().findNode(name); This is an example of the fragility of the current "partially complete directory" behaviour. The `node` variable is exactly the same instance as the `dir` variable, and the call to `findNode()` serves only to ensure the directory is "complete" before use. In the new API only the name is available, so `findNode()` *must* be called, ensuring there's no risk of accidentally using an incomplete directory node. test/jdk/jdk/internal/jimage/ImageReaderTest.java line 78: > 76: > 77: @Test > 78: public void testModuleDirectories() throws IOException { I could make these more data driven (i.e. parametrized) if people want. test/jdk/jdk/internal/jimage/ImageReaderTest.java line 217: > 215: > 216: /// Loads and performs actions on classes stored in a given > `ImageReader`. > 217: private static class ImageClassLoader extends ClassLoader { Loading and running class bytes to get a `toString()` value will be super important in the preview mode tests to prove that the correct class bytes were used. test/jdk/jdk/internal/jimage/JImageReadTest.java line 28: > 26: * @summary Unit test for libjimage JIMAGE_Open/Read/Close > 27: * @modules java.base/jdk.internal.jimage > 28: * @run testng/othervm JImageReadTest Looking back, I don't recall exactly why this was needed. I will see if it's still necessary, but I think it was. test/jdk/tools/jimage/ImageReaderDuplicateChildNodesTest.java line 1: > 1: /* Just uses child names rather than nodes. ------------- PR Review: https://git.openjdk.org/jdk/pull/26054#pullrequestreview-2973065956 PR Review: https://git.openjdk.org/jdk/pull/26054#pullrequestreview-2975530216 PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177654333 PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177702506 PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2176142533 PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177705221 PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177707657 PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177710965 PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177658457 PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177660750 PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177664240 PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177671307 PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177676879 PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177687414 PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177691003 PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177692842 PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177694731 PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177697691