On Fri, 21 Nov 2025 14:26:00 GMT, Liam Miller-Cushon <[email protected]> wrote:
>> This PR proposes adding a new overload to `MemorySegment::getString` that >> takes a known byte length of the content. >> >> This was previously proposed in https://github.com/openjdk/jdk/pull/20725, >> but the outcome of >> [JDK-8333843](https://bugs.openjdk.org/browse/JDK-8333843) was to update >> `MemorySegment#getString` to suggest >> >> >> byte[] bytes = new byte[length]; >> MemorySegment.copy(segment, JAVA_BYTE, offset, bytes, 0, length); >> return new String(bytes, charset); >> >> >> However this is less efficient than what the implementation of getString >> does after [JDK-8362893](https://bugs.openjdk.org/browse/JDK-8362893), it >> now uses `JavaLangAccess::uncheckedNewStringNoRepl` to avoid the copy. >> >> See also discussion in [this panama-dev@ >> thread](https://mail.openjdk.org/pipermail/panama-dev/2025-November/021193.html), >> and mcimadamore's document [Pulling the (foreign) >> string](https://cr.openjdk.org/~mcimadamore/panama/strings_ffm.html) >> >> Benchmark results: >> >> >> Benchmark (size) Mode Cnt Score Error >> Units >> ToJavaStringTest.jni_readString 5 avgt 30 55.339 ± 0.401 >> ns/op >> ToJavaStringTest.jni_readString 20 avgt 30 59.887 ± 0.295 >> ns/op >> ToJavaStringTest.jni_readString 100 avgt 30 84.288 ± 0.419 >> ns/op >> ToJavaStringTest.jni_readString 200 avgt 30 119.275 ± 0.496 >> ns/op >> ToJavaStringTest.jni_readString 451 avgt 30 193.106 ± 1.528 >> ns/op >> ToJavaStringTest.panama_copyLength 5 avgt 30 7.348 ± 0.048 >> ns/op >> ToJavaStringTest.panama_copyLength 20 avgt 30 7.440 ± 0.125 >> ns/op >> ToJavaStringTest.panama_copyLength 100 avgt 30 11.766 ± 0.058 >> ns/op >> ToJavaStringTest.panama_copyLength 200 avgt 30 16.096 ± 0.089 >> ns/op >> ToJavaStringTest.panama_copyLength 451 avgt 30 25.844 ± 0.054 >> ns/op >> ToJavaStringTest.panama_readString 5 avgt 30 5.857 ± 0.046 >> ns/op >> ToJavaStringTest.panama_readString 20 avgt 30 7.750 ± 0.046 >> ns/op >> ToJavaStringTest.panama_readString 100 avgt 30 14.109 ± 0.187 >> ns/op >> ToJavaStringTest.panama_readString 200 avgt 30 18.035 ± 0.130 >> ns/op >> ToJavaStringTest.panama_readString 451 avgt 30 35.896 ± 0.227 >> ns/op >> ToJavaStringTest.panama_readStringLength 5 avgt 30 4.565 ± 0.038 >> ns/op >> ToJavaStringTest.panama_readStringLength 20... > > Liam Miller-Cushon has updated the pull request incrementally with one > additional commit since the last revision: > > Use Utils.checkNonNegativeArgument src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1333: > 1331: * sequences with this charset's default replacement string. The > {@link > 1332: * java.nio.charset.CharsetDecoder} class should be used when more > control > 1333: * over the decoding process is required. Should we say here, as you did for `copy` that this method ignores `\0` ? src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 2646: > 2644: * will appear truncated when read again. > 2645: * > 2646: * @param src the Java string to be written into this segment Suggestion: * @param src the Java string to be written into the destination segment src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 2649: > 2647: * @param dstEncoding the charset used to {@linkplain > Charset#newEncoder() encode} > 2648: * the string bytes. > 2649: * @param srcIndex the starting index of the source string Do we feel like saying words like "index" is good enough to say what we mean? E.g. by index we mean something that is compatible with `length` and `charAt`. I wonder if using a qualifier e.g. `character index` might help? (open to suggestions here, I'm not 100% sure) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2549950855 PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2549953662 PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2549945644
