Re: RFR: 8355177: Speed up StringBuilder::append(char[]) via UTF16::compress & Unsafe::copyMemory [v4]

Shaojin Wen Thu, 01 May 2025 23:44:20 -0700

On Fri, 2 May 2025 03:49:39 GMT, Shaojin Wen <s...@openjdk.org> wrote:


>> In BufferedReader.readLine and other similar scenarios, we need to use 
>> StringBuilder.append(char[]) to build the string.
>> 
>> For these scenarios, we can use the intrinsic method StringUTF16.compress 
>> and Unsafe.copyMemory instead of a char-by-char copy loop 
>> to improve the speed.
>
> Shaojin Wen has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains seven commits:
> 
>  - Merge remote-tracking branch 'upstream/master' into 
> optim_sb_append_chars_202504
>    
>    # Conflicts:
>    #  src/java.base/share/classes/java/lang/AbstractStringBuilder.java
>  - Merge remote-tracking branch 'upstream/master' into 
> optim_sb_append_chars_202504
>    
>    # Conflicts:
>    #  src/java.base/share/classes/java/lang/StringUTF16.java
>  - putCharsUnchecked
>  - copyright
>  - Using StringUTF16.compress to speed up LATIN1 StringBuilder append(char[])
>  - Using Unsafe.copyMemory to speed up UTF16 StringBuilder append(char[])
>  - add append(char[]) benchmark

> > This might be helpful combined with #21730.
> 
> That implies creating a copy of the chars:
> 
> ```java
> private final void appendChars(CharSequence s, int off, int end) {
>     if (isLatin1()) {
>         byte[] val = this.value;
> 
>         // ----- Begin of Experimental Section -----
>         char[] ca = new char[end - off];
>         s.getChars(off, end, ca, 0);
>         int compressed = StringUTF16.compress(ca, 0, val, count, end - off);
>         count += compressed;
>         off += compressed;
>         // ----- End of Experimental Section -----
> 
>         for (int i = off, j = count; i < end; i++) {
>             char c = s.charAt(i);
>             if (StringLatin1.canEncode(c)) {
>                 val[j++] = (byte)c;
>             } else {
>                 count = j;
>                 inflate();
>                 // Store c to make sure sb has a UTF16 char
>                 StringUTF16.putCharSB(this.value, j++, c);
>                 count = j;
>                 i++;
>                 StringUTF16.putCharsSB(this.value, j, s, i, end);
>                 count += end - i;
>                 return;
>             }
>         }
>     } else {
>         StringUTF16.putCharsSB(this.value, count, s, off, end);
>     }
>     count += end - off;
> }
> ```
> 
> While I do _assume_ that it should be faster to let machine code perform the 
> copy and compression than to let Java code copy char by char, 
> another benchmark is needed to actually prove this claim.


>         char[] ca = new char[end - off];

Your code here allocates a temporary char[] on every call, which may cause a slowdown.
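
To make the allocation concern concrete, here is a minimal sketch contrasting the two paths. It is not the PR code: `AppendAllocationSketch`, `appendDirect`, `appendViaCopy`, and the local `compress` are hypothetical names, and `compress` is a plain-Java stand-in for the JDK-internal `StringUTF16.compress`, which is not accessible from user code.

```java
import java.nio.charset.StandardCharsets;

public class AppendAllocationSketch {

    // Simplified stand-in for the JDK-internal StringUTF16.compress (an
    // assumption, not the real intrinsic): copies chars into dst until one
    // does not fit in Latin-1 and returns how many chars were copied.
    static int compress(char[] src, int srcOff, byte[] dst, int dstOff, int len) {
        for (int i = 0; i < len; i++) {
            char c = src[srcOff + i];
            if (c > 0xFF) {
                return i;
            }
            dst[dstOff + i] = (byte) c;
        }
        return len;
    }

    // char[] source: compress straight from the caller's array, no temporary.
    static int appendDirect(char[] src, int off, int end, byte[] dst, int count) {
        return compress(src, off, dst, count, end - off);
    }

    // CharSequence source: the chars are first copied into a temporary array,
    // which is the extra allocation (and extra copy) on every call.
    static int appendViaCopy(CharSequence s, int off, int end, byte[] dst, int count) {
        char[] tmp = new char[end - off];
        for (int i = off; i < end; i++) {
            tmp[i - off] = s.charAt(i);
        }
        return compress(tmp, 0, dst, count, end - off);
    }

    public static void main(String[] args) {
        String text = "hello\u4e16world"; // the sixth char is not Latin-1
        byte[] a = new byte[16];
        byte[] b = new byte[16];
        int n1 = appendDirect(text.toCharArray(), 0, text.length(), a, 0);
        int n2 = appendViaCopy(text, 0, text.length(), b, 0);
        // Both paths stop at the first non-Latin-1 char.
        System.out.println(n1 + " " + n2 + " "
                + new String(a, 0, n1, StandardCharsets.ISO_8859_1)); // prints "5 5 hello"
    }
}
```

Both paths produce the same bytes. The difference is only the temporary array and the second pass over the data, and whether that outweighs the gain from the intrinsic is exactly what a benchmark would have to show.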

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24773#issuecomment-2846483320
