On Fri, 30 Aug 2024 02:07:54 GMT, David Holmes <dhol...@openjdk.org> wrote:
> This is the implementation of a new method added to the JNI specification.
>
> From the CSR request:
>
> The `GetStringUTFLength` function returns the length as a `jint` (`jsize`)
> value and so is limited to returning at most `Integer.MAX_VALUE`. But a Java
> string can itself consist of `Integer.MAX_VALUE` characters, each of which
> may require more than one byte to represent them in modified UTF-8 format.**
> It follows then that this function cannot return the correct answer for all
> String values and yet the specification makes no mention of this, nor of any
> possible error to report if this situation is encountered.
>
> **The modified UTF-8 format used by the VM can require up to six bytes to
> represent one unicode character, but six byte characters are stored as UTF16
> surrogate pairs. Hence the most bytes per character is 3, and so the maximum
> length is 3*`Integer.MAX_VALUE`. With compact strings this reduces to
> 2*`Integer.MAX_VALUE`.
>
> Solution
>
> Deprecate the existing JNI `GetStringUTFLength` method noting that it may
> return a truncated length, and add a new method, JNI
> `GetStringUTFLengthAsLong` that returns the string length as a `jlong` value.
>
> ---
>
> We also add a truncation warning to `GetStringUTFLength` under -Xcheck:jni
>
> There are some incidental whitespace changes in
> `src/hotspot/os/posix/dtrace/hotspot_jni.d` along with the new method entries.
>
> Testing:
> - new test added
> - tiers 1-3 sanity
>
> Thanks

test/hotspot/jtreg/runtime/jni/checked/TestLargeUTF8Length.java line 27:

> 25: * @bug 8328877
> 26: * @summary Test warning for GetStringUTFLength and functionality of GetStringUTFLengthAsLong
> 27: * @library /test/lib

Shouldn't this test have:

`@requires vm.bits == 64`

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20784#discussion_r1737942541
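
For readers following the arithmetic in the quoted CSR text, here is a minimal Java sketch of why the modified UTF-8 length needs a 64-bit result. It is not taken from the PR and is not how HotSpot computes the length; the class `ModifiedUtf8Length` and the helper `modifiedUtf8Length` are hypothetical names used only for illustration. It assumes the usual modified UTF-8 rules: U+0001..U+007F take one byte, U+0000 and U+0080..U+07FF take two, and every other UTF-16 code unit (including each half of a surrogate pair) takes three.

```java
// Illustrative sketch only: counts modified UTF-8 bytes per UTF-16 code unit.
final class ModifiedUtf8Length {

    // Hypothetical helper (not a JDK or JNI API). Returns a long because the
    // total can exceed Integer.MAX_VALUE for very large strings.
    static long modifiedUtf8Length(CharSequence s) {
        long len = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                len += 1;              // plain ASCII, excluding NUL
            } else if (c <= 0x07FF) {
                len += 2;              // NUL (encoded as 0xC0 0x80) and U+0080..U+07FF
            } else {
                len += 3;              // everything else, one surrogate half at a time
            }
        }
        return len;
    }

    public static void main(String[] args) {
        System.out.println(modifiedUtf8Length("abc"));       // three 1-byte chars
        System.out.println(modifiedUtf8Length("\u0000"));    // NUL takes 2 bytes
        System.out.println(modifiedUtf8Length("\u20AC"));    // euro sign takes 3 bytes

        // Worst case from the quoted text: 3 bytes for each of
        // Integer.MAX_VALUE chars, which does not fit in a jint.
        long worstCase = 3L * Integer.MAX_VALUE;
        System.out.println(worstCase > Integer.MAX_VALUE);   // true
    }
}
```

A string large enough to hit the worst case needs a multi-gigabyte heap, which is presumably why the question about restricting the test to 64-bit VMs comes up.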