Chen,
thank you for your comments! My ideas to address them are:
* flush(): If the Appendable implements Flushable, then perform
Flushable.flush() on it. Otherwise, Writer.flush() will be a no-op
(besides checking if Writer is open).
* close(): If the Appendable implements Closeable, then perform
Closeable.close() on it. Otherwise, Writer.close() will be a no-op
(besides calling this.flush() if open, and internally marking itself as
closed).
* Writer.of(Writer): The original sense of the new API is to create a
Writer wrapping non-Writers like StringBuilder, CharBuffer etc., but not
to reduce a Writer to an Appendable (that would rather be
Appendable.narrow(Writer) or so). IMHO there is neither any need nor
benefit to return a limited Writer instead of the actual writer. So
actually I would plea for directly returning the given writer itself, so
Writer.of(Writer) is a no-op. I do not see why someone would
intentionally pass in a Writer in the hope to get back a more limited,
non-flushing / non-closing variant of it, and I have a bad feeling about
returning a Writer which is deliberately cutting away the ability to
flush and close without any technical need. Maybe you could elaborate on
your idea if you have strong feelings about that use case?
* StringWriter: Writer.of() is -by intention- not a "fire and forget"
drop-in replacement, but a "real" Writer. It comes with a price, but in
do not see a big problem here. If one is such happy with StringWriter
that dealing with IOException would be a no-go, then simply keep the app
as-is. But if one really wants the benefits provided by Writer.of(),
then dealing with IOExcpetion should be worth it. This is a (IMHO very)
low price the programmer has to pay for the benefit of gaining non-sync,
non-copy behavior. In most code using StringWriter I have seen so far,
IOException was dealt with anyways, as the code was mostly IO-bound
already (it expects "some" Writer, not a StringWriter, as it wants to
perform I/O, but the target is "by incident" a String).
To sum up: IMHO still it sounds feasible and the benefits outweigh the
costs. :-)
-Markus
Am 28.12.2024 um 01:51 schrieb Chen Liang:
Hi Markus,
I think the idea makes sense, but it comes with more difficulties than
in the case of Reader.of. An Appendable is a higher abstraction
modeling only the character writing aspects, without concerns with
resource control (such as flush or close).
One detail of note is that Writer itself implements Appendable, but I
don't think the new method should return a Writer as-is; I think it
should return another writer whose close will not close the underlying
writer as we are only modelling the appendable behavior without
exporting the resource control methods. Not sure about flush.
One use case you have mentioned is StringWriter. StringWriter is
distinct from StringReader: its write and append methods do not throw
IOE while the base Writer does. So Writer.of cannot adequately replace
StringWriter without use-site ugliness, until we have generic types
that represent the bottom type.
Regards,
Chen Liang
On Fri, Dec 20, 2024, 11:12 PM Markus KARG <mar...@headcrashing.eu> wrote:
Dear Sirs,
JDK 24 comes with Reader.of(CharSequence), now let's provide the
symmetrical counterpart Writer.of(Appendable) in JDK 25! :-)
For performance reasons, hereby I like to propose the new public
factory
method Writer.of(Appendable). This will provide the same benefits for
writing, that Reader.of(CharSequence) provides for reading since
JDK 24
(see JDK-8341566). Before sharing a pull request, I'd kindly like to
request for comments.
Since Java 1.1 we have the StringWriter class. Since Java 1.5 we have
the Appendable interface. StringBuilder, StringBuffer and
CharBuffer are
first-class implementations of it in the JDK, and there might exist
third-party implementations of non-String text sinks. Until today,
however, we do not have a Writer for Appendables, but need to go
costly
detours.
Text sinks in Java are expected to implement the Writer interface.
Libraries and frameworks expect application code to provide
Writers to
consume text produced by the library or framework, for example.
Application code often wants to modify the received text, e. g. embed
received SVG text into in a larger HTML text document, or simply
forward
the text as-is to I/O, so StringBuilder or CharBuffer is what the
application code actually uses, but not Strings! In such cases,
taking
the StringWriter.toString() detour is common but inefficient: It
implies
duplicating the COMPLETE text for the sole sake of creating a
temporary
String, while the subsequent processing will copy the data anyways or
just uses a small piece of it. This eats up time and memory
uselessly,
and increases GC pressure. Also, StringWriter is synchronized (not
explicitly, but de-facto, as it uses StringBuffer), which implies
another needless slowdown. In many cases, the synchronization has
no use
at all, as in real-world applications least Writers are actually
accessed concurrently. As a result, today the major benefit of
StringBuilder over StringBuffer (being non-synchronized) vanishes as
soon as a StringWriter is used to provide its content. This means,
"stringBuilder.append(stringWriter.toString())" imposes slower
performance than essentially needed, in two ways: toString(),
synchronized.
In an attempt to improve performance of this rather typical use
case, I
like to contribute a pull request providing the new public factory
method java.io.Writer.of(Appendable). This is symmetrical to the
solution we implemented in JDK-8341566 for the reversed case:
java.io.Reader.of(CharSequence).
My idea is to mostly copy the existing code of StringWriter, but
wrap a
caller-provided Appendable instead of an internally created
StringBuilder; this strips synchronization; then add optimized use
for
the StringBuffer, StringBuilder and CharBuffer implementations (in
the
sense of write(char[],start,end) to prevent a char-by-char loop in
these
cases).
Alternatives:
- Applications could use Apache Commons IO's StringBuilderWriter,
which
is limited to StringBuilder, so is not usable for the CharBuffer or
custom Appendable case. As it is an open-source third-party
dependency,
some authors might not be allowed to use it, or may not want to carry
this additional burden just for the sake of this single performance
improvement. In addition, this library is not actively modernized;
its
Java baseline still is Java 8. There is no commercial support.
- Applications could write their own Writer implementation. Given the
assumption that this is a rather common use case, this imposes
unjustified additional work for the authors of thousands of
applications. It is hard to justify why there is a StringWriter
but not
a Writer for other Appendables.
- Instead of writing a new Writer factory method, we could slightly
modify StringWriter, so it uses StringBuilder (instead of
StringBuffer).
This (still) results in unnecessary duplication of the full text at
toString() and (now also) at getBuffer(), and it will break existing
applications due the missing synchronization.
- Instead of writing a new Writer factory method, we could write a
new
AppendableWriter class. This piles up the amount of public classes,
which was the main reason in JDK-8341566 to go with the
"Reader.of(CharSequence)" factory method instead of the
"CharSequenceReader" class. Also it would be confusing to have
Reader.of(...) but not Writer.of(...) in the API.
- We could go with a specific Appendable class (like StringBuilder)
instead of supporting all Appendable implementations. This would
reduce
the number of applicable use cases daramatically (in particular as
CharBuffer is not supported any more) without providing any
considerable
benefit (other than making the OpenJDK-internal source code a bit
shorter). In particular it makes it impossible to opt-in for the
below
option:
Option:
- Once we have Writer.of(Appendable), we could replace the full
implementation of StringWriter by synchronized calls to the new
Writer.
This would reduce duplicate code.
Kindly requesting comments.
-Markus Karg