+1 to option A. Option C is too limiting on uses for the target. It could be added as a default which calls through to option A, if there is meaningful demand.
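To make the "default which calls through to option A" idea concrete, here is a rough sketch. Everything in it is hypothetical: `SketchCharSequence` and `StringBacked` are stand-in names (nothing can be added to `java.lang.CharSequence` from outside the JDK), and treating option C's `get(dst, offset, length)` as "copy the first `length` chars" is my assumption, since a plain CharSequence has no read position the way CharBuffer does.

```java
// Hypothetical stand-in for CharSequence; not a JDK type.
interface SketchCharSequence extends CharSequence {

    // Option A signature (same shape as String.getChars), left abstract here.
    void getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin);

    // Option C signature as a convenience default that calls through to
    // option A. Assumption: it copies the first `length` chars of the sequence.
    default void get(char[] dst, int offset, int length) {
        getChars(0, length, dst, offset);
    }
}

// Minimal implementation backed by a String, only to exercise the default.
record StringBacked(String value) implements SketchCharSequence {
    @Override public int length() { return value.length(); }
    @Override public char charAt(int index) { return value.charAt(index); }
    @Override public CharSequence subSequence(int start, int end) {
        return value.subSequence(start, end);
    }
    @Override public void getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin) {
        // Delegates to the existing bulk copy in String.
        value.getChars(srcBegin, srcEnd, dst, dstBegin);
    }
    @Override public String toString() { return value; }
}
```

With this shape, implementors only ever override `getChars`; the option C form stays a thin wrapper and cannot drift from it.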
Will updating the various places that would benefit from the new method be a follow-up change/PR?

On Fri, Oct 25, 2024 at 1:34 PM Markus Karg <mar...@headcrashing.eu> wrote:

> I hereby request comments on the proposal to generalize the existing
> method "String.getChars()"'s signature to become a new default interface
> method "CharSequence.getChars()".
>
> Problem
>
> For performance reasons, many CharSequence implementations, in particular
> String, StringBuilder, StringBuffer and CharBuffer, provide a way to
> bulk-read a complete region of their character content into a provided
> char array.
>
> Unfortunately, there is no _uniform_ way to perform this, and it is not
> guaranteed that there is bulk-reading implemented with _any_ CharSequence,
> in particular custom ones.
>
> While String, StringBuilder and StringBuffer all share the same getChars()
> method signature for this purpose, CharBuffer's way to perform the very
> same is the get() method.
>
> Other implementations have other method signatures, or do not have any
> solution to this problem at all.
>
> In particular, there is no method in their common interface, CharSequence,
> to perform such a bulk-optimized read, as CharSequence only allows reading
> one character after the next in a sequential way, either by iterating over
> charAt() from 0 to length(), or by consuming the chars() Stream.
>
> As a result, code that wants to read from CharSequence in an
> implementation-agnostic, but still bulk-optimized way, needs to know _each_
> possible implementation's specific method!
>
> Effectively this results in code like this (real-world example taken from
> the implementation of Reader.of(CharSequence) in JDK 24):
>
>     switch (cs) {
>         case String s -> s.getChars(next, next + n, cbuf, off);
>         case StringBuilder sb -> sb.getChars(next, next + n, cbuf, off);
>         case StringBuffer sb -> sb.getChars(next, next + n, cbuf, off);
>         case CharBuffer cb -> cb.get(next, cbuf, off, n);
>         default -> {
>             for (int i = 0; i < n; i++)
>                 cbuf[off + i] = cs.charAt(next + i);
>         }
>     }
>
> The problem with this code is that it is bound and limited to exactly that
> given set of CharSequence implementations.
>
> If a future CharSequence implementation shall get accessed in a
> bulk-optimized way, the switch expression has to get extended and
> recompiled _every time_.
>
> If some custom CharSequence implementation is used that this code is not
> aware of, sequential read is applied, even if that implementation _does_
> provide some bulk-read method!
>
> Solution
>
> There are several possible alternative solutions:
>
> * (A) CharSequence.getChars(int srcBegin, int srcEnd, char[] dst, int
> dstBegin) - As this signature is already supported by String, StringBuffer
> and StringBuilder, I hereby propose to add this signature to CharSequence
> and provide a default implementation that iterates over charAt(int) from 0
> to length().
>
> * (B) Alternatively the same default method could be implemented using the
> chars() Stream - I assume that might run slower, but correct me if I am
> wrong.
>
> * (C) Alternatively we could go with the signature get(char[] dst, int
> offset, int length) - Only CharBuffer implements that already, so more
> changes are needed and more duplicate methods will exist in the end.
>
> * (D) Alternatively we could come up with a totally different signature -
> That would be most fair to all existing implementations, but in the end it
> will imply the most changes and the most duplicate methods.
>
> * (E) We could give up the idea and live with the situation as-is. - I
> assume only a few people really prefer that outcome.
>
> Please tell me if I missed a viable option!
>
> As a side benefit of CharSequence.getChars(), its existence might trigger
> implementors to provide bulk-reading if not done yet, at least for those
> cases where it is actually feasible.
>
> In the same way it might trigger callers of Reader to start making use of
> bulk reading, at least in those cases where it does make sense but
> application authors were reluctant to implement the switch-case shown above.
>
> Hence, if nobody vetoes, I will file a Jira issue, PR and CSR for
> "CharSequence.getChars()" (alternative A) in the next few days.
>
> -Markus
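
For reference while discussing, here is a minimal sketch of what option A's default might look like, and how the switch from Reader.of(CharSequence) could collapse to a single call. This is not the proposed JDK code. `BulkCharSequence`, `RepeatedChar` and `GetCharsDemo` are hypothetical names, and the interface is a stand-in because `java.lang.CharSequence` cannot be changed outside the JDK. I am also assuming the default loops only over the requested range `[srcBegin, srcEnd)` with explicit bounds checks; the exact exception behavior is a guess.

```java
// Hypothetical stand-in for CharSequence; not a JDK type.
interface BulkCharSequence extends CharSequence {

    // Assumed default for option A: same signature as String.getChars(),
    // implemented with one charAt() call per character in [srcBegin, srcEnd).
    default void getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin) {
        if (srcBegin < 0 || srcBegin > srcEnd || srcEnd > length()) {
            throw new IndexOutOfBoundsException(
                "source range [" + srcBegin + ", " + srcEnd + ") out of bounds");
        }
        if (dstBegin < 0 || dstBegin > dst.length - (srcEnd - srcBegin)) {
            throw new IndexOutOfBoundsException("dstBegin " + dstBegin + " out of bounds");
        }
        for (int i = srcBegin; i < srcEnd; i++) {
            dst[dstBegin++] = charAt(i);
        }
    }
}

// A custom sequence that simply inherits the default; an implementation with
// a faster bulk path would override getChars() instead.
final class RepeatedChar implements BulkCharSequence {
    private final char c;
    private final int length;

    RepeatedChar(char c, int length) {
        this.c = c;
        this.length = length;
    }

    @Override public int length() { return length; }

    @Override public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return c;
    }

    @Override public CharSequence subSequence(int start, int end) {
        if (start < 0 || start > end || end > length) {
            throw new IndexOutOfBoundsException("range [" + start + ", " + end + ")");
        }
        return new RepeatedChar(c, end - start);
    }

    @Override public String toString() { return String.valueOf(c).repeat(length); }
}

final class GetCharsDemo {
    // What the body of a Reader-style read could reduce to: no type switch,
    // every implementation is reached through the same call.
    static void read(BulkCharSequence cs, int next, char[] cbuf, int off, int n) {
        cs.getChars(next, next + n, cbuf, off);
    }
}
```

The point of the sketch is the last method: callers no longer need to know which implementations have a bulk path, and implementations opt in by overriding one method.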