On Thu, 24 Oct 2024 06:14:18 GMT, John R Rose <jr...@openjdk.org> wrote:

> A thought for a possible cleanup, after this PR is done…
> 
> The scratch mirror logic had me… scratching my head. It seems to me that a 
> more descriptive name would make the code explain itself better. I suggest 
> (for a future cleanup) calling a mirror structure which is being 
> aot-assembled (just for the archive) a "future mirror" (or maybe "production 
> mirror" or "mirror asset"). This is in distinction to the current "live" 
> mirror, which is also the AOT phase mirror. In general, from the point of 
> view of the assembly phase, when we build new structures not created by the 
> JVM as a result of post-training Java execution, we might want to give them a 
> common descriptive term. But I suppose most items that make it into the AOT 
> cache are faithful copies of live data, present in the VM at dump time (end 
> of assembly phase). In that case, it's more like "the same structure" in all 
> cases, and there's no need to simultaneously work on both present and future 
> versions of the same structure.
> 
> (When I say "structure" I mean mirror object for now, but perhaps the pattern 
> might expand to something else? Or, maybe we will get rid of the two-mirror 
> solution, in which case every future structure is also completely present and 
> live in the assembly-phase VM.)

The cached heap objects are mostly copied as-is, by a recursive walk from a 
set of roots. In some cases, however, we need to transform some of the 
objects. The transformation is implemented by substituting a "scratch" (or 
"future") version for some of the discovered objects.
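
To make the substitution idea concrete, here is a minimal sketch in Java. It is 
not the HotSpot code: `Obj`, `walk`, `archive` and the `scratch` map are 
hypothetical names made up for illustration, and following the references of 
the scratch version (rather than the live one) is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ScratchWalkSketch {
    // Hypothetical stand-in for a heap object: a label plus outgoing references.
    static final class Obj {
        final String label;
        final List<Obj> refs = new ArrayList<>();
        Obj(String label) { this.label = label; }
    }

    // Recursively walks from `found`. If the discovered object has a scratch
    // version, that version is archived (and followed) in its place.
    // Assumption: the walk continues through the scratch object's references.
    static void walk(Obj found, Map<Obj, Obj> scratch, Set<Obj> seen, List<Obj> archived) {
        Obj toArchive = scratch.getOrDefault(found, found);
        if (!seen.add(toArchive)) {
            return; // already archived via another path
        }
        archived.add(toArchive);
        for (Obj ref : toArchive.refs) {
            walk(ref, scratch, seen, archived);
        }
    }

    // Archives everything reachable from the roots, in discovery order.
    // The live objects themselves are never modified.
    static List<Obj> archive(List<Obj> roots, Map<Obj, Obj> scratch) {
        Set<Obj> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        List<Obj> archived = new ArrayList<>();
        for (Obj root : roots) {
            walk(root, scratch, seen, archived);
        }
        return archived;
    }

    public static void main(String[] args) {
        Obj module = new Obj("module");
        Obj liveMirror = new Obj("live mirror");
        liveMirror.refs.add(module);
        Obj scratchMirror = new Obj("scratch mirror"); // reference to module dropped
        Obj root = new Obj("root");
        root.refs.add(liveMirror);

        Map<Obj, Obj> scratch = new IdentityHashMap<>();
        scratch.put(liveMirror, scratchMirror);

        for (Obj o : archive(Collections.singletonList(root), scratch)) {
            System.out.println(o.label);
        }
    }
}
```

In this toy run only the root and the scratch mirror are archived. The live 
mirror still points at its module afterwards, so the running "VM" is left 
intact.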

For example, for Java mirrors:

- If a class K1 *is not* aot-initialized, we need to zero out most of the 
fields inside `K1->java_mirror()`, but keep the injected `klass` and 
`array_klass` native pointers.
- If a class K2 *is* aot-initialized, we need to also keep the static fields 
declared in Java code in `K2->java_mirror()`.

To illustrate, here are the contents of the AOT-cached mirror for the 
java/lang/String class:


 - ---- fields (total size 17 words):
 - private volatile transient 'classRedefinedCount' 'I' @12  0 (0x00000000)
 - injected 'klass' 'J' @16  2684621920 (0x00000000a0041460)
 - injected 'array_klass' 'J' @24  2684707368 (0x00000000a0056228)
 - injected 'oop_size' 'I' @32  17 (0x00000011)
 - injected 'static_oop_field_count' 'I' @36  2 (0x00000002)
 - private volatile transient 'cachedConstructor' 'Ljava/lang/reflect/Constructor;' @40  null (0x00000000)
 - private transient 'name' 'Ljava/lang/String;' @44  null (0x00000000)
 - private transient 'module' 'Ljava/lang/Module;' @48  null (0x00000000)
 - private final 'classLoader' 'Ljava/lang/ClassLoader;' @52  null (0x00000000)
 - private transient 'classData' 'Ljava/lang/Object;' @56  null (0x00000000)
 - private transient 'signers' '[Ljava/lang/Object;' @60  null (0x00000000)
 - private transient 'packageName' 'Ljava/lang/String;' @64  null (0x00000000)
 - private final 'componentType' 'Ljava/lang/Class;' @68  null (0x00000000)
 - private volatile transient 'reflectionData' 'Ljava/lang/ref/SoftReference;' @72  null (0x00000000)
 - private volatile transient 'genericInfo' 'Lsun/reflect/generics/repository/ClassRepository;' @76  null (0x00000000)
 - private volatile transient 'enumConstants' '[Ljava/lang/Object;' @80  null (0x00000000)
 - private volatile transient 'enumConstantDirectory' 'Ljava/util/Map;' @84  null (0x00000000)
 - private volatile transient 'annotationData' 'Ljava/lang/Class$AnnotationData;' @88  null (0x00000000)
 - private volatile transient 'annotationType' 'Lsun/reflect/annotation/AnnotationType;' @92  null (0x00000000)
 - transient 'classValueMap' 'Ljava/lang/ClassValue$ClassValueMap;' @96  null (0x00000000)
 - injected 'protection_domain' 'Ljava/lang/Object;' @100  null (0x00000000)
 - injected 'source_file' 'Ljava/lang/Object;' @104  null (0x00000000)
 - injected '<init_lock>' 'Ljava/lang/Object;' @108  [I{0x000000060ec016b0} (0xc1d802d6)
 - signature: Ljava/lang/String;
 - ---- static fields (2):
 - private static final 'serialVersionUID' 'J' @120  -6849794470754667710 (0xa0f0a4387a3bb342)
 - static final 'COMPACT_STRINGS' 'Z' @130  true (0x01)
 - private static final 'serialPersistentFields' '[Ljava/io/ObjectStreamField;' @112  a 'java/io/ObjectStreamField'[0] {0x000000060ec0e468} (0xc1d81c8d)
 - private static final 'REPL' 'C' @128    65533 (0xfffd)
 - public static final 'CASE_INSENSITIVE_ORDER' 'Ljava/util/Comparator;' @116  a 'java/lang/String$CaseInsensitiveComparator'{0x000000060ec0e7a8} (0xc1d81cf5)
 - static final 'LATIN1' 'B' @131  0 (0x00)
 - static final 'UTF16' 'B' @132  1 (0x01)


Here's a matrix for deciding when a field is kept or zeroed out:


Field              Type                     Not-AOT-inited   AOT-inited
-------------------------------------------------------------------------
"klass"            injected by HotSpot      keep             keep
"COMPACT_STRINGS"  declared in String.java  zero             keep
"module"           field of j.l.Class       zero             zero
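
The same matrix can be written as a small decision function. This is only a 
sketch of the three rows shown, in Java rather than HotSpot C++; `FieldKind`, 
`Field`, `keep` and `scratchValues` are hypothetical names, and `null` is used 
as a stand-in for "zeroed". Fields not covered by the matrix (for example 
`oop_size`) are deliberately left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MirrorFieldPolicySketch {
    // Hypothetical categories, one per row of the matrix.
    enum FieldKind { INJECTED_BY_HOTSPOT, DECLARED_IN_CLASS_SOURCE, FIELD_OF_JAVA_LANG_CLASS }

    // Hypothetical field record: its category and its value in the live mirror.
    static final class Field {
        final FieldKind kind;
        final Object value;
        Field(FieldKind kind, Object value) {
            this.kind = kind;
            this.value = value;
        }
    }

    // Returns true if the field is kept in the cached mirror, false if zeroed.
    static boolean keep(FieldKind kind, boolean aotInited) {
        switch (kind) {
            case INJECTED_BY_HOTSPOT:      return true;       // e.g. "klass"
            case DECLARED_IN_CLASS_SOURCE: return aotInited;  // e.g. "COMPACT_STRINGS"
            case FIELD_OF_JAVA_LANG_CLASS: return false;      // e.g. "module"
            default: throw new AssertionError(kind);
        }
    }

    // Builds the field values of a scratch mirror from the live ones.
    // The live map is only read, never modified; null stands in for "zeroed".
    static Map<String, Object> scratchValues(Map<String, Field> live, boolean aotInited) {
        Map<String, Object> scratch = new LinkedHashMap<>();
        for (Map.Entry<String, Field> e : live.entrySet()) {
            Field f = e.getValue();
            scratch.put(e.getKey(), keep(f.kind, aotInited) ? f.value : null);
        }
        return scratch;
    }

    public static void main(String[] args) {
        Map<String, Field> live = new LinkedHashMap<>();
        live.put("klass", new Field(FieldKind.INJECTED_BY_HOTSPOT, 2684621920L));
        live.put("COMPACT_STRINGS", new Field(FieldKind.DECLARED_IN_CLASS_SOURCE, Boolean.TRUE));
        live.put("module", new Field(FieldKind.FIELD_OF_JAVA_LANG_CLASS, "some live Module"));

        System.out.println("not AOT-inited: " + scratchValues(live, false));
        System.out.println("AOT-inited:     " + scratchValues(live, true));
    }
}
```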


After JEP 483 is integrated, I will investigate whether the transformation can 
be done inside the recursive copying code, without substituting scratch 
(future) objects.

An alternative to substitution would be to modify the original Java mirrors (to 
zero out the fields that we don't want to cache), but that would damage the 
state of the VM. That's something CDS has avoided doing -- we want the VM to 
remain usable after the AOT cache is written.

Of course, the holy grail is to avoid any transformation and simply cache 
everything as-is, in a snapshot style. We can't do that yet, as the AOT cache 
still requires some "loading" operations in the production run. Some of the 
loading code, for example, might be confused if it finds `module` to be 
non-null when an aot-inited class is loaded into the JVM.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/21642#issuecomment-2448132377
