On Thu, 10 Jul 2025 20:54:24 GMT, John R Rose <jr...@openjdk.org> wrote:
>> Validate input in `java.lang.StringCoding` intrinsic Java wrappers, improve >> their documentation, enhance the checks in the associated IR or assembly >> code, and adapt them to cause VM crash on invalid input. >> >> ## Implementation notes >> >> The goal of the associated umbrella issue >> [JDK-8156534](https://bugs.openjdk.org/browse/JDK-8156534) is to, for >> `java.lang.String*` classes, >> >> 1. Move `@IntrinsicCandidate`-annotated `public` methods<sup>1</sup> (in >> Java code) to `private` ones, and wrap them with a `public` ["front door" >> method](https://github.com/openjdk/jdk/pull/24982#discussion_r2087493446) >> 2. Since we moved the `@IntrinsicCandidate` annotation to a new method, >> intrinsic mappings – i.e., associated `do_intrinsic()` calls in >> `vmIntrinsics.hpp` – need to be updated too >> 3. Add necessary input validation (range, null, etc.) checks to the newly >> created public front door method >> 4. Place all input validation checks in the intrinsic code (add if missing!) >> behind a `VerifyIntrinsicChecks` VM flag >> >> Following preliminary work needs to be carried out as well: >> >> 1. Add a new `VerifyIntrinsicChecks` VM flag >> 2. Update `generate_string_range_check` to produce a `HaltNode`. That is, >> crash the VM if `VerifyIntrinsicChecks` is set and a Java wrapper fails to >> spot an invalid input. >> >> <sup>1</sup> `@IntrinsicCandidate`-annotated constructors are not subject >> to this change, since they are a special case. >> >> ## Functional and performance tests >> >> - `tier1` (which includes `test/hotspot/jtreg/compiler/intrinsics/string`) >> passes on several platforms. Further tiers will be executed after >> integrating reviewer feedback. >> >> - Performance impact is still actively monitored using >> `test/micro/org/openjdk/bench/java/lang/String{En,De}code.java`, among other >> tests. If you have suggestions on benchmarks, please share in the comments. >> >> ## Verification of the VM crash >> >> I've tested the VM crash scenario as follows: >> >> 1. Created the following test program: >> >> public class StrIntri { >> public static void main(String[] args) { >> Exception lastException = null; >> for (int i = 0; i < 1_000_000; i++) { >> try { >> >> jdk.internal.access.SharedSecrets.getJavaLangAccess().countPositives(new >> byte[]{1,2,3}, 2, 5); >> } catch (Exception exception) { >> lastException = exception; >> } >> } >> if (lastException != null) { >> lastException.printStackTrace... > > I disagree with a small part of the statement of goals: > >> Always validate all input at the intrinsic (but preferably behind a VM flag) > > As formulated above, this is a violation of DRY and if embraced the wrong way > will lead to code that is harder to review and prove bug-free. Performing > 100% accurate range/null/validation checks is deeply impractical for an > assembly-based or IR-based intrinsic. It’s too hard to verify by code > review, and coverage testing is suspect. > > We must frankly put all the weight of verification on Java code, including > Java bytecode intrinsic behaviors. Java code is high-level and can be read > mostly as a declarative spec, if clearly written (as straight-line code, then > the intrinsic call). Also, such simple Java code shapes (and their > underlying bytecodes) are tested many orders of magnitude more than any given > intrinsic. > > I see two bits of evidence that you agree with me on this: 1. The > intrinsic-local validation (IR or assembly) is allowed to Halt instead of > throw, and 2. the intrinsic-local validation is optional, turned on only by a > stress test mode. This tells me that the extra optional testing is also not > required to be 100%. > > Thus, I think the above goal would be better stated this way: > >> Validate input in the IR or assembly code of the intrinsic in an ad hoc >> manner to catch bugs in the Java validation. > >> Note: IR or assembly based validation code should not obscure the code or >> add large maintenance costs, and under a VM diagnostic flag (or debug flag), >> and causing a VM halt instead of a Java throw. > > I think I'm agreeing with you on the material points. It is important to > summarize our intentions accurately at the top, for those readers that are > reading only the top as a summary. @rose00, thanks so much for the feedback. I agree with your remarks and get your points on _"Always validate all input at the intrinsic"_ is a violation of DRY and an impractical goal. I incorporated your suggestions as follows: 1. Renamed the ticket to `Move input validation checks to Java for String-related intrinsics` (to better reflect the goal) 2. Replaced `Always validate all input at the intrinsic...` with your suggestion ------------- PR Comment: https://git.openjdk.org/jdk/pull/25998#issuecomment-3061201145