Except in Java 18 we do need it, for two independent reasons:

1. UTF-8 is still not guaranteed to be the runtime character set that the various methods will use. JDKs can be configured to use a different default character set. Bugs from an incorrect default character set will now be even harder to find, since they won't be as obviously reproducible on all systems with a particular JDK.
2. Even if UTF-8 were guaranteed to be the runtime character set that the various methods will use, that doesn't make UTF-8 correct. It depends on the input you're reading and the relevant specifications. Some of these use UTF-8. Some of these use ASCII or ISO 8859-1. A few use UTF-16 or something else. Just because the default character set is UTF-8 does not make any particular file or stream magically UTF-8. It is necessary to consider the context of the input source and choose the character encoding that is appropriate for that one source.

We know from decades of experience that default character sets are unsafe and buggy. The safest approach is to provide higher level libraries that only accept byte streams as input and do character set conversion themselves according to spec. This is how JSON and XML parsers usually operate. But that's not always possible, and when it isn't, the most secure and bug-resistant API requires developers to think about their choice of character encoding and make their choice explicit.

On Thu, Jan 25, 2024 at 5:37 PM Rob Tompkins <chtom...@gmail.com> wrote:
>
> I think we should remove the deprecations and add more explicit Javadocs that
> spell out that there are oddities with the defaultCharset() depending upon
> the operating system. Note this problem has been in existence since Java 1.4,
> and we did nothing about it for a considerable amount of time. Plus in Java
> 18 forward we simply don’t need it, as Gary said.
>
> Cheers,
> -Rob
>
> > On Jan 24, 2024, at 1:45 PM, Gary D. Gregory <ggreg...@apache.org> wrote:
> >
> > Hi All,
> >
> > In the context of https://issues.apache.org/jira/browse/IO-842 and in light
> > of UTF-8 being the default Charset for Java 18 and up on all platforms --
> > https://openjdk.org/jeps/400 --, we need to figure out whether to:
> >
> > - Deprecate all non-Charset methods in favor of their Charset versions, or
> > - Un-deprecate existing deprecated non-Charset methods.
> >
> > See the ticket, please reply there or here as convenient.
> >
> > Gary
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >

--
Elliotte Rusty Harold
elh...@ibiblio.org
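
As an illustration of point 2 above, here is a minimal sketch of what making the encoding explicit can look like. The class name ExplicitCharsetDemo and the helper decode() are hypothetical names made up for this example, not part of Commons IO or the JDK. The only JDK calls assumed are the String(byte[], Charset) constructor and the StandardCharsets constants. It shows that the same bytes decode correctly only under the charset the source actually uses, whatever the JVM default happens to be.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetDemo {

    // Hypothetical helper: the caller must name the charset, so the result
    // never depends on the JVM's default character set.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "cafe" with an acute accent on the e, encoded as ISO 8859-1:
        // the accented letter is the single byte 0xE9.
        byte[] latin1 = {'c', 'a', 'f', (byte) 0xE9};
        String expected = "caf\u00e9";

        // Decoding with the charset the source really uses gives the right text.
        System.out.println(decode(latin1, StandardCharsets.ISO_8859_1).equals(expected)); // true

        // Decoding the same bytes as UTF-8 does not: a lone 0xE9 is malformed
        // UTF-8 and is replaced with U+FFFD.
        System.out.println(decode(latin1, StandardCharsets.UTF_8).equals(expected)); // false
    }
}
```

A UTF-8 default would silently take the second path for this input, which is why the choice has to come from the format's specification rather than from the platform.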