Re: RFR: 8299807: newStringNoRepl should avoid copying arrays for ASCII compatible charsets [v4]

Glavo Sun, 29 Jan 2023 18:19:55 -0800

On Sat, 28 Jan 2023 19:54:32 GMT, Glavo <d...@openjdk.org> wrote:

>> This is the javadoc of `JavaLangAccess::newStringNoRepl`:
>> 
>> 
>>     /**
>>      * Constructs a new {@code String} by decoding the specified subarray of
>>      * bytes using the specified {@linkplain java.nio.charset.Charset charset}.
>>      *
>>      * The caller of this method shall relinquish and transfer the ownership of
>>      * the byte array to the callee since the later will not make a copy.
>>      *
>>      * @param bytes the byte array source
>>      * @param cs the Charset
>>      * @return the newly created string
>>      * @throws CharacterCodingException for malformed or unmappable bytes
>>      */
>> 
>> 
>> The javadoc states that the method should be able to construct the string 
>> directly from the given byte array, avoiding an extra array allocation.
>> 
>> However, `newStringNoRepl` currently always copies the array for UTF-8 and 
>> other ASCII-compatible charsets.
>> 
>> This PR fixes this problem.
>
> Glavo has updated the pull request incrementally with one additional commit 
> since the last revision:
> 
>   update
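
As a rough illustration of the idea in the quoted description (not the actual JDK change): when every byte is ASCII, the UTF-8 bytes are already a valid Latin-1 string body, so no transcoding is needed. The class and method names below are hypothetical. Public `String` constructors always copy the array, so this sketch can only show the ASCII check and the fast-path decision; the zero-copy step itself is only possible inside `java.lang`.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of the JDK or of this PR.
public class AsciiFastPathSketch {

    // Java bytes are signed, so a byte is ASCII (0x00..0x7F) exactly when it is non-negative.
    static boolean isAscii(byte[] bytes) {
        for (byte b : bytes) {
            if (b < 0) {
                return false;
            }
        }
        return true;
    }

    // Decodes UTF-8 input, taking a cheaper path for pure-ASCII data.
    // Unlike newStringNoRepl, the public constructor replaces malformed
    // input instead of throwing CharacterCodingException, and it copies.
    static String decodeUtf8(byte[] bytes) {
        if (isAscii(bytes)) {
            // ASCII bytes mean the same thing in UTF-8 and ISO-8859-1,
            // so a byte-for-byte Latin-1 decode gives the same string.
            return new String(bytes, StandardCharsets.ISO_8859_1);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] ascii = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] nonAscii = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(isAscii(ascii));
        System.out.println(isAscii(nonAscii));
        System.out.println(decodeUtf8(nonAscii).length());
    }
}
```

Inside the JDK, the same check would let the callee keep the caller's array as the string's backing storage, which is why the javadoc asks the caller to give up ownership of it.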


Benchmark results for large files (128 KiB to 256 MiB) on the default file system:

![image](https://user-images.githubusercontent.com/20694662/215372915-2988bb4e-8c35-4def-9fe9-a6bc4f033708.png)

![image](https://user-images.githubusercontent.com/20694662/215372925-2a7995dc-00f4-4a91-9b2a-e73c97192008.png)

For files of about 1 MiB, throughput improves by more than 55%.
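
For anyone who wants a quick sanity check on their own machine, here is a minimal timing sketch. It assumes the benchmark exercises `Files.readString` on an ASCII-only file, which this message does not state. The class name, file contents and iteration counts are made up for illustration, and a plain `System.nanoTime` loop is much less reliable than a proper harness such as JMH, so treat the numbers as indicative only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical timing sketch; not the benchmark used for the results above.
public class ReadStringTimingSketch {

    // Writes a temporary file of the given size filled with the ASCII letter 'a'.
    static Path writeAsciiFile(int size) throws IOException {
        byte[] data = new byte[size];
        Arrays.fill(data, (byte) 'a');
        Path file = Files.createTempFile("readstring-sketch", ".txt");
        Files.write(file, data);
        return file;
    }

    // Reads the file repeatedly as UTF-8 and returns the observed throughput in MiB/s.
    static double measureMiBPerSecond(Path file, int iterations) throws IOException {
        long size = Files.size(file);
        long chars = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            // Accumulate the lengths so the reads cannot be optimized away.
            chars += Files.readString(file, StandardCharsets.UTF_8).length();
        }
        long elapsedNanos = System.nanoTime() - start;
        if (chars != size * iterations) {
            throw new IllegalStateException("unexpected decoded length: " + chars);
        }
        double mib = (size * (double) iterations) / (1024.0 * 1024.0);
        return mib / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) throws IOException {
        Path file = writeAsciiFile(1 << 20); // about 1 MiB
        try {
            measureMiBPerSecond(file, 50); // warm-up pass, result discarded
            System.out.printf("%.1f MiB/s%n", measureMiBPerSecond(file, 200));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Running the same class on a build with and without the change gives a rough before/after comparison; the file is small enough to stay in the OS page cache, so the loop mostly measures the copy and decode cost rather than disk I/O.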

Original results: 
https://gist.github.com/Glavo/f3d2060d0bd13cd0ce2add70e6060ea0?permalink_comment_id=4452798#gistcomment-4452798

-------------

PR: https://git.openjdk.org/jdk/pull/12119
