Re: RFR: 8195686: ISO-8859-8-i charset cannot be decoded, should be mapped to ISO-8859-8

Jason Mehrens Thu, 12 Sep 2024 20:15:39 -0700

On Tue, 27 Aug 2024 17:01:19 GMT, Naoto Sato <na...@openjdk.org> wrote:


>> Mapping ISO-8859-8-I charset to ISO-8859-8.
>> The following two aliases are added as part of this change:
>> **ISO-8859-8-I**
>> **ISO8859-8-I**
>> 
>> Bug report: https://bugs.openjdk.org/browse/JDK-8195686
>
> I looked at this issue a bit more. Looking at the IANA Charset registry 
> (https://www.iana.org/assignments/character-sets/character-sets.xhtml), which 
> the `Charset` class is based on, `ISO-8859-8-I` is not an alias of `ISO-8859-8`, 
> but is defined as a distinct `Preferred MIME name`. So I don't think the 
> currently proposed solution is correct. (It would return ISO-8859-8-I as an 
> alias of ISO-8859-8.) Also, looking at RFC 1556, in which this 
> ISO-8859-8-I encoding is defined, there are other encodings, i.e., 
> ISO-8859-6-I, ISO-8859-6-E, and ISO-8859-8-E. Why are they not relevant, but 
> ISO-8859-8-I is?
> Considering these, I am still not sure we should introduce these new encodings 
> now, also because there has not been any request since the time Bill Shannon 
> worked (circa 2018), unless the Arabic/Hebrew speaking communities jump in and 
> provide a rationale for supporting them.

@naotoj does the mapping need to be removed from:

https://github.com/openjdk/jdk/blob/5e5942a282e14846404b68c65d43594d6b9226d9/src/java.xml/share/classes/com/sun/org/apache/xerces/internal/util/EncodingMap.java#L770

I ask because Jakarta Mail / Angus Mail has a use case similar to this code.
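
To illustrate the kind of use case I mean, here is a minimal sketch of a 
library-side fallback for MIME charset names that `Charset.forName` rejects. 
The class name `MimeCharsetFallback`, the `FALLBACK` table, and `lookup` are 
hypothetical names made up for this example; this is not the actual code in 
Jakarta Mail, Angus Mail, or `EncodingMap`, and it assumes a runtime where 
ISO-8859-8 itself is available.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Locale;
import java.util.Map;

public class MimeCharsetFallback {
    // Hypothetical table: MIME names the JDK may not recognize, mapped to a
    // JDK charset with the same byte-to-character mapping.
    private static final Map<String, String> FALLBACK =
            Map.of("ISO-8859-8-I", "ISO-8859-8");

    // Try the JDK first; consult the fallback table only if the JDK lookup fails.
    static Charset lookup(String mimeName) {
        try {
            return Charset.forName(mimeName);
        } catch (UnsupportedCharsetException | IllegalCharsetNameException e) {
            String mapped = FALLBACK.get(mimeName.toUpperCase(Locale.ROOT));
            if (mapped == null) {
                throw e; // no fallback known, report the original failure
            }
            return Charset.forName(mapped);
        }
    }

    public static void main(String[] args) {
        Charset cs = lookup("ISO-8859-8-I");
        // 0xE0 is HEBREW LETTER ALEF (U+05D0) in ISO-8859-8.
        String decoded = new String(new byte[] {(byte) 0xE0}, cs);
        System.out.println(cs.name() + " decoded U+"
                + Integer.toHexString(decoded.charAt(0)).toUpperCase(Locale.ROOT));
    }
}
```

A table like this only reuses the byte-to-character mapping; it says nothing 
about the implicit directionality that distinguishes ISO-8859-8-I from 
ISO-8859-8 in RFC 1556.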

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20690#issuecomment-2347953621
