Redirecting to nio-dev, which is the more appropriate forum for this topic.
> On Feb 26, 2023, at 3:39 PM, Carl M <j...@rkive.org> wrote:
>
> I'm looking into adding a fast path case for encoding Strings into
> ByteBuffers, and wanted to get feedback on a possible approach. My use case
> is taking mostly-ASCII, UTF-8 Strings and writing them to the disk/network.
> To do this today, there are two approaches which both have drawbacks:
>
> 1. Use String.getBytes(StandardCharsets.UTF_8), and call ByteBuffer.put().
> The downside of this approach is that I need to make a copy of the String's
> byte[] value. The upside of this approach is that ByteBuffer uses the
> intrinsic copy methods, which are fast.
>
> 2. Wrap the String in a CharBuffer, and call
> CharsetEncoder.encode(CharBuffer, ByteBuffer). This avoids copying the
> String value. However, when using the UTF_8 encoder, there is no fast path
> for writing to direct ByteBuffers. sun.nio.cs.UTF_8.encodeLoop() only has
> fast paths for when the destination is array based. This allocates less
> memory, but is overall slower in my JMH benchmark.
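
For readers joining on nio-dev, here is a minimal sketch of the two approaches quoted above, as I read them. The class and method names (EncodeApproaches, viaGetBytes, viaEncoder) are made up for illustration; only the JDK calls come from the description. It assumes dst is large enough for the whole String and makes no performance claim.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative only.
class EncodeApproaches {
    // Approach 1: materialize an encoded byte[] copy, then bulk-put it.
    static void viaGetBytes(String s, ByteBuffer dst) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        dst.put(bytes); // throws BufferOverflowException if dst is too small
    }

    // Approach 2: wrap the String (no copy) and encode directly into dst.
    static CoderResult viaEncoder(String s, ByteBuffer dst) {
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder();
        CharBuffer src = CharBuffer.wrap(s);
        CoderResult r = enc.encode(src, dst, true);
        if (r.isUnderflow()) {
            r = enc.flush(dst); // completes the encoding operation
        }
        return r;
    }

    public static void main(String[] args) {
        ByteBuffer a = ByteBuffer.allocateDirect(64);
        ByteBuffer b = ByteBuffer.allocateDirect(64);
        viaGetBytes("h\u00e9llo", a);
        CoderResult r = viaEncoder("h\u00e9llo", b);
        System.out.println(a.position() + " " + b.position() + " " + r);
    }
}
```

Both methods should leave the same bytes in the destination; the difference under discussion is the intermediate byte[] in the first and the per-character loop in the second when dst is direct.
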
>
> To fix this, I looked at adding an overload to CharsetEncoder to accept a
> String (or a CharSequence), and a ByteBuffer as a destination. However, this
> is not easily doable, since it's hard to call it in a loop. In the case that
> the String overflows the BB, the caller needs to be able to provide a new BB
> and resume from where they left off. The CharBuffer approach works here
> because it keeps the position last read, and can resume from there.
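
To make the resume behaviour concrete, here is a rough sketch of the loop that the CharBuffer approach allows today. ResumableEncode, encodeChunked and chunkSize are hypothetical names, and allocating a fresh direct buffer per overflow is only a stand-in for whatever the caller really does with a full buffer.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names are illustrative only.
class ResumableEncode {
    // Encodes s into a sequence of direct buffers of chunkSize bytes each.
    static List<ByteBuffer> encodeChunked(String s, int chunkSize)
            throws CharacterCodingException {
        if (chunkSize < 4) {
            // A supplementary code point needs 4 bytes in UTF-8; a smaller
            // buffer could never make progress on one.
            throw new IllegalArgumentException("chunkSize must be >= 4");
        }
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder();
        CharBuffer src = CharBuffer.wrap(s); // src.position() tracks progress
        List<ByteBuffer> out = new ArrayList<>();
        ByteBuffer dst = ByteBuffer.allocateDirect(chunkSize);
        while (true) {
            CoderResult r = enc.encode(src, dst, true);
            if (r.isUnderflow()) {
                r = enc.flush(dst);
            }
            if (r.isUnderflow()) {
                break; // all input consumed and flushed
            }
            if (r.isOverflow()) {
                // dst is full: hand it off and resume with a new buffer.
                dst.flip();
                out.add(dst);
                dst = ByteBuffer.allocateDirect(chunkSize);
            } else {
                r.throwException(); // malformed or unmappable input
            }
        }
        dst.flip();
        out.add(dst);
        return out;
    }

    public static void main(String[] args) throws CharacterCodingException {
        List<ByteBuffer> chunks = encodeChunked("h\u00e9llo w\u00f6rld", 4);
        System.out.println(chunks.size() + " chunks");
    }
}
```

The caller never tracks an index here; the CharBuffer's position does that job, which is exactly what a plain String argument cannot do.
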
>
> To encode a String, we need to know the index of the last character written
> in order to resume with a new buffer. However, the return type of
> CharsetEncoder's encode method is CoderResult, and its length() method can't
> be called for underflow cases. This means that there isn't a usable return
> type here (neither int nor CoderResult can be used).
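
A small sketch of the return-type problem described above. CoderResult.length() is documented to throw UnsupportedOperationException unless the result describes an error, so it cannot report progress; with the existing API the number of chars consumed is read from the source buffer's position instead. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative only.
class CoderResultLength {
    // True if r.length() is unsupported, i.e. r is not an error result.
    static boolean lengthUnsupported(CoderResult r) {
        try {
            r.length();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Returns how many chars of s were consumed before dst filled up
    // (or s.length() if everything fit).
    static int charsConsumed(String s, ByteBuffer dst) {
        CharBuffer src = CharBuffer.wrap(s);
        StandardCharsets.UTF_8.newEncoder().encode(src, dst, true);
        return src.position();
    }

    public static void main(String[] args) {
        System.out.println(lengthUnsupported(CoderResult.UNDERFLOW));
        System.out.println(charsConsumed("abcdef", ByteBuffer.allocateDirect(4)));
    }
}
```
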
>
> Another, almost-possible solution I was considering is adding a special case
> to UTF_8 for direct buffer destinations, and a corresponding JLA.encodeASCII
> overload that accepts a ByteBuffer. The challenge here is that a wrapped
> CharBuffer doesn't have an array, and so doesn't get the fast path copying.
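
The point about the wrapped CharBuffer can be checked with public API alone. The sketch below (hypothetical class and method names) only shows that a CharBuffer wrapping a String reports no accessible backing array, while one wrapping a char[] does; it does not touch the internal JLA or UTF_8 code.

```java
import java.nio.CharBuffer;

// Hypothetical helper class; the names are illustrative only.
class WrappedHasArray {
    // CharBuffer.wrap(CharSequence) yields a read-only buffer, so
    // hasArray() is false and array-based fast paths are skipped.
    static boolean stringWrapHasArray(String s) {
        return CharBuffer.wrap(s).hasArray();
    }

    // Wrapping a char[] copy does expose an array, at the cost of the copy.
    static boolean arrayWrapHasArray(String s) {
        return CharBuffer.wrap(s.toCharArray()).hasArray();
    }

    public static void main(String[] args) {
        System.out.println(stringWrapHasArray("abc"));
        System.out.println(arrayWrapHasArray("abc"));
    }
}
```
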
>
> The reason I am reaching out here is that I am looking for feedback on my
> analysis of the existing API. I am wondering what API compromises could be
> made to fast path writing Strings to direct buffers, which I feel is probably
> a common operation. The only reasonable way I can see to implement this is a
> new return type, which also seems undesirable.
>
> Carl