On Mon, 2 Sep 2024 14:03:55 GMT, Shaojin Wen <s...@openjdk.org> wrote:
>> Use fast path for ascii characters 1 to 127 to improve the performance of
>> writing Utf8Entry to BufferWriter.
>
> Shaojin Wen has updated the pull request with a new target base due to a
> merge or a rebase. The incremental webrev excludes the unrelated changes
> brought in by the merge/rebase. The pull request contains 21 additional
> commits since the last revision:
>
>  - Update src/java.base/share/classes/java/lang/StringCoding.java
>
>    Co-authored-by: ExE Boss <3889017+exe-b...@users.noreply.github.com>
>  - vectorized countGreaterThanZero
>  - add comments
>  - optimization for none-ascii latin1
>  - Revert "vectorized countGreaterThanZero"
>
>    This reverts commit 88a77722c8f5401ac28572509d6a08b3e88e8e40.
>  - vectorized countGreaterThanZero
>  - copyright
>  - use JLA if length < 256
>  - fix utf_len error
>  - code style
>  - ... and 11 more: https://git.openjdk.org/jdk/compare/66682133...2a36b443

src/java.base/share/classes/java/lang/StringCoding.java line 55:

> 53:         int i = off;
> 54:         for (; i < limit; i += 8) {
> 55:             long v = UNSAFE.getLong(ba, i + ARRAY_BYTE_BASE_OFFSET);

Since `value` is a `byte[]`, `UNSAFE.getLong` could get bytes outside the array.

Also, that’s not even considering the fact that the address might not even be
`long` aligned for `(off % 8) != 0` or `(off % 8) != 4`, depending on the
array header size (see [JDK‑8139457] and [JDK‑8314882]).

[JDK‑8139457]: https://bugs.openjdk.org/browse/JDK-8139457
[JDK‑8314882]: https://bugs.openjdk.org/browse/JDK-8314882

Suggestion:

        for (int end = limit - 7; i < end; i += 8) {
            long v = UNSAFE.getLongUnaligned(ba, i + ARRAY_BYTE_BASE_OFFSET);

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20772#discussion_r1740188542
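
For anyone who wants to try the suggested loop shape outside the JDK, here is a minimal sketch. It is not the PR's code: `jdk.internal.misc.Unsafe` is not usable from ordinary application code, so this swaps in a `byte[]` view `VarHandle`, whose plain `get` is allowed at unaligned indices. The class name `SafeLongScan` and the helper `countNonNegative` are hypothetical, chosen only for illustration. The point is the `limit - 7` bound, which keeps every 8-byte read inside the array, plus a scalar loop for the remaining tail bytes.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of the JDK or of the PR.
final class SafeLongScan {
    // View of a byte[] as longs. Plain get/set work at any index, aligned or not.
    // This stands in for UNSAFE.getLongUnaligned (an assumption of this sketch).
    private static final VarHandle LONG_VIEW =
            MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.LITTLE_ENDIAN);

    // One high bit per byte lane; a set bit marks a negative (non-ASCII) byte.
    private static final long HIGH_BITS = 0x8080808080808080L;

    /** Counts bytes with the high bit clear in ba[off, off + len). */
    static int countNonNegative(byte[] ba, int off, int len) {
        int limit = off + len;
        int i = off;
        int count = 0;
        // i < limit - 7 guarantees i + 7 <= limit - 1, so the read stays in range.
        for (int end = limit - 7; i < end; i += 8) {
            long v = (long) LONG_VIEW.get(ba, i);
            count += 8 - Long.bitCount(v & HIGH_BITS);
        }
        // Scalar tail: at most 7 bytes that did not fill a whole long.
        for (; i < limit; i++) {
            if (ba[i] >= 0) {
                count++;
            }
        }
        return count;
    }
}
```

With the original `i < limit` bound, the last iteration could read up to 7 bytes past `limit`. The `VarHandle` would throw `IndexOutOfBoundsException` if that read ran off the end of the array, whereas a raw `Unsafe` read performs no such check.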