Hello,
It looks like XmlStreamReader is not correctly handling several encodings
in Commons IO 2.14.0 that previously worked in version 2.13.0.
Here's a self-contained snippet (Kotlin) that demonstrates the problem:
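import java.nio.charset.Charset

// "437" is passed to Charset.forName as the charset name for the encoded bytes.
val xml = "Ç"
val stream = xml.byteInputStream(Charset.forName("437"))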
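
As a quick check of what the snippet above hands to the reader, here is a minimal sketch that uses only the JDK, not Commons IO. It assumes a JDK that ships the charset behind the "437" name, which the original snippet relies on as well. The helper names canonicalName and hexBytes are made up for illustration.

```kotlin
import java.nio.charset.Charset

// Hypothetical helper: resolve a charset name or alias to the JDK's canonical name.
fun canonicalName(alias: String): String = Charset.forName(alias).name()

// Hypothetical helper: encode text with the named charset and render the bytes as hex.
fun hexBytes(text: String, charsetName: String): String =
    text.toByteArray(Charset.forName(charsetName))
        .joinToString(" ") { "%02X".format(it.toInt() and 0xFF) }

fun main() {
    // Canonical name the JDK reports for the "437" alias.
    println(canonicalName("437"))
    // Bytes produced for "Ç" (U+00C7) under that charset.
    println(hexBytes("\u00C7", "437"))
}
```

With the snippet exactly as shown, the stream holds only the encoded character and nothing that names the encoding, so whatever reads it has to infer the encoding from elsewhere.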
On Tue, Oct 3, 2023 at 1:39 AM sebb wrote:
>
> The byte input stream does not carry any encoding information, so the
> XmlStreamReader has to guess what encoding was used.
Determining what encoding to use when reading XML from a byte stream
is the purpose of XmlStreamReader. From its documentatio
:'[A-Za-z]([A-Za-z0-9._]|-)*'))",
>
> This does not allow for an encoding that starts with a digit; i.e. it
> won't match encoding='437'
>
> AFAICT, no supported encodings start with a digit.
>
> The '437' encoding is actually kn
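
The point about the name pattern can be checked in isolation. This is a minimal sketch that tests only the fragment quoted above, [A-Za-z]([A-Za-z0-9._]|-)*, against a few candidate names. It is not the full expression XmlStreamReader uses, and isAccepted is a made-up helper name.

```kotlin
// Fragment of the encoding-name pattern quoted in this thread; the full
// expression used by the library is longer and is not reproduced here.
val encodingName = Regex("[A-Za-z]([A-Za-z0-9._]|-)*")

// Hypothetical helper: true when the whole name matches the fragment.
fun isAccepted(name: String): Boolean = encodingName.matches(name)

fun main() {
    // The pattern requires a leading letter, so a purely numeric name is rejected.
    for (name in listOf("437", "IBM437", "cp437", "UTF-8")) {
        println("$name -> ${isAccepted(name)}")
    }
}
```

A name made only of digits fails on the first character class, while names that begin with a letter pass.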
On Tue, Oct 3, 2023 at 1:50 PM sebb wrote:
> > Given this inconsistency, and the fact that there are XML documents "in the
> > wild" that use these encoding names, would it be reasonable to relax the
> > regex
> > just enough so that it'll work with these other names and aliases?
>
> I would say