On Thu, 10 Apr 2025 07:31:37 GMT, Magnus Ihse Bursie wrote:
>> Right, that `å` looks to have been incorrectly converted during the
>> ISO-8859-1 to UTF-8 conversion. (I can't find the script used for conversion
>> as this change is from some time ago.)
>>
>> Since the change occurs in a commen
On Wed, 9 Apr 2025 21:26:15 GMT, Justin Lu wrote:
>> src/java.xml/share/classes/com/sun/org/apache/xml/internal/serializer/Encodings.properties
>> line 22:
>>
>>> 20: # Peter Smolik
>>> 21: Cp1250 WINDOWS-1250 0x00FF
>>> 22: # Patch attributed to hava...@underdusken.no (H�vard Wigtil)
>>
>> Th
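One plausible way a character such as `å` ends up as the replacement character
shown above is that ISO-8859-1 bytes were decoded as if they were already
UTF-8. The sketch below illustrates that failure mode only. The thread does not
say what the conversion script actually did, and the names `MojibakeSketch`,
`misdecodeLatin1AsUtf8` and `latin1ToUtf8` are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeSketch {
    // Faulty path: take ISO-8859-1 bytes and decode them as UTF-8.
    // Bytes that are not valid UTF-8 are replaced by U+FFFD.
    static String misdecodeLatin1AsUtf8(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    // Intended path: decode as ISO-8859-1 first, then re-encode as UTF-8.
    static byte[] latin1ToUtf8(byte[] latin1) {
        String decoded = new String(latin1, StandardCharsets.ISO_8859_1);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "H\u00e5vard"; // a-ring written as an escape to keep the source ASCII
        String broken = misdecodeLatin1AsUtf8(name);
        System.out.println("contains U+FFFD: " + (broken.indexOf('\uFFFD') >= 0));

        byte[] fixed = latin1ToUtf8(name.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println("round trip ok: "
                + new String(fixed, StandardCharsets.UTF_8).equals(name));
    }
}
```

The lone byte 0xE5 is not a complete UTF-8 sequence, so the decoder cannot
recover the original letter. That is why such damage has to be repaired by
hand or from history rather than by re-running a converter.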
On Thu, 10 Apr 2025 10:10:49 GMT, Magnus Ihse Bursie wrote:
> I have checked the entire code base for incorrect encodings, but luckily
> enough these were the only remaining problems I found.
>
> BOM (byte-order mark) is a method used for distinguishing big and little
> endian UTF-16 encoding
On Thu, 10 Apr 2025 11:46:45 GMT, Raffaello Giulietti wrote:
> I guess the difference at L.1 in the various files is just the BOM?
Yes.
-
PR Review Comment: https://git.openjdk.org/jdk/pull/24566#discussion_r2037357899
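For reference, the UTF-8 BOM is the three-byte sequence EF BB BF at the very
start of a file, which is why it shows up as a difference on line 1 only. A
minimal sketch of detecting and stripping it follows. The class and method
names are hypothetical and are not taken from the PR.

```java
import java.util.Arrays;

final class Utf8BomSketch {
    // The UTF-8 encoding of U+FEFF.
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // True if the content starts with the three BOM bytes.
    static boolean hasUtf8Bom(byte[] content) {
        return content.length >= UTF8_BOM.length
                && content[0] == UTF8_BOM[0]
                && content[1] == UTF8_BOM[1]
                && content[2] == UTF8_BOM[2];
    }

    // Returns the content without a leading BOM; unchanged if there is none.
    static byte[] stripUtf8Bom(byte[] content) {
        return hasUtf8Bom(content)
                ? Arrays.copyOfRange(content, UTF8_BOM.length, content.length)
                : content;
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '#', ' ', 'x'};
        System.out.println("has BOM: " + hasUtf8Bom(sample));
        System.out.println("length after strip: " + stripUtf8Bom(sample).length);
    }
}
```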
I have checked the entire code base for incorrect encodings, but luckily enough
these were the only remaining problems I found.
BOM (byte-order mark) is a method used for distinguishing big and little endian
UTF-16 encodings. There is a special UTF-8 BOM, but it is discouraged. In the
words of
As a follow-up to [JDK-8354213](https://bugs.openjdk.org/browse/JDK-8354213), I
found some additional places where Unicode characters are unnecessarily used
instead of pure ASCII.
-
Commit messages:
- 8354273: Restore even more pointless unicode characters to ASCII
Changes: https:
On Thu, 10 Apr 2025 10:10:49 GMT, Magnus Ihse Bursie wrote:
> I have checked the entire code base for incorrect encodings, but luckily
> enough these were the only remaining problems I found.
>
> BOM (byte-order mark) is a method used for distinguishing big and little
> endian UTF-16 encoding
> As a follow-up to [JDK-8354213](https://bugs.openjdk.org/browse/JDK-8354213),
> I found some additional places where Unicode characters are unnecessarily
> used instead of pure ASCII.
Magnus Ihse Bursie has updated the pull request incrementally with one
additional commit since the last revis
On Thu, 10 Apr 2025 10:36:31 GMT, Magnus Ihse Bursie wrote:
>> As a follow-up to
>> [JDK-8354213](https://bugs.openjdk.org/browse/JDK-8354213), I found some
>> additional places where Unicode characters are unnecessarily used instead of
>> pure ASCII.
>
> Magnus Ihse Bursie has updated the pul
On Thu, 10 Apr 2025 10:14:40 GMT, Magnus Ihse Bursie wrote:
>> I have checked the entire code base for incorrect encodings, but luckily
>> enough these were the only remaining problems I found.
>>
>> BOM (byte-order mark) is a method used for distinguishing big and little
>> endian UTF-16 enc
On Thu, 10 Apr 2025 07:32:18 GMT, Magnus Ihse Bursie wrote:
>> You don't have to do that, I'm working on an omnibus UTF-8 fixing PR right
>> now, where I will include a fix for this as well.
>
> If anything, I might be a bit worried that there are more incorrect
> conversions stemming from this
On Wed, 13 Sep 2023 17:38:28 GMT, Justin Lu wrote:
>> JDK .properties files still use ISO-8859-1 encoding with escape sequences.
>> It would improve readability to see the native characters instead of escape
>> sequences (especially for the L10n process). The majority of files changed
>> are l
On Thu, 10 Apr 2025 17:09:27 GMT, Naoto Sato wrote:
>> I have checked the entire code base for incorrect encodings, but luckily
>> enough these were the only remaining problems I found.
>>
>> BOM (byte-order mark) is a method used for distinguishing big and little
>> endian UTF-16 encodings.
On Thu, 10 Apr 2025 17:23:37 GMT, Raffaello Giulietti wrote:
> If this is a French name, it's e acute: é.
Supported by this Wikipedia page listing S.L as an LCMS developer:
https://en.wikipedia.org/wiki/Little_CMS
-
PR Review Comment: https://git.openjdk.org/jdk/pull/24566#discus
On Thu, 10 Apr 2025 08:44:28 GMT, Eirik Bjørsnøs wrote:
>> Justin Lu has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Replace InputStreamReader with BufferedReader
>
> FWIW, I checked out the revision of the commit previous to this change
Remove `forRemoval = true` from the `@Deprecated` annotation of Boolean, Byte,
Character, Double, Float, Integer, Long, Short.
Also add `@SuppressWarnings("deprecation")` where needed, and remove
`@SuppressWarnings("removal")`.
-
Commit messages:
- 8354335: No longer deprecate wrapper class co
On Thu, 10 Apr 2025 22:05:04 GMT, Roger Riggs wrote:
> Remove `forRemoval = true` from the `@Deprecated` annotation of Boolean, Byte,
> Character, Double, Float, Integer, Long, Short.
> Also add `@SuppressWarnings("deprecation")` where needed, and remove
> `@SuppressWarnings("removal")`.
The wrapper class
On Thu, 11 May 2023 20:21:57 GMT, Justin Lu wrote:
>> This PR converts Unicode sequences to UTF-8 native in .properties file.
>> (Excluding the Unicode space and tab sequence). The conversion was done
>> using native2ascii.
>>
>> In addition, the build logic is adjusted to support reading in t
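The two forms of a `.properties` value discussed here (an escape sequence in an
ISO-8859-1 file versus the native character in a UTF-8 file) can be compared
with `java.util.Properties`. This is only a sketch of the standard API
behaviour, not of the JDK build logic. The class name, method names and the
`name` key are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class PropertiesEncodingSketch {
    // Properties.load(InputStream) reads bytes as ISO-8859-1 and
    // expands Unicode escape sequences in keys and values.
    static Properties loadLatin1(byte[] bytes) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Properties.load(Reader) takes already-decoded characters, so the
    // caller chooses the charset; here the bytes are decoded as UTF-8.
    static Properties loadUtf8(byte[] bytes) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Escaped form: pure ASCII content containing a six-character escape.
        byte[] escaped = "name=H\\u00e5vard\n".getBytes(StandardCharsets.ISO_8859_1);
        // Native form: the character itself, stored as UTF-8 bytes.
        byte[] nativeUtf8 = "name=H\u00e5vard\n".getBytes(StandardCharsets.UTF_8);

        String a = loadLatin1(escaped).getProperty("name");
        String b = loadUtf8(nativeUtf8).getProperty("name");
        System.out.println("same value: " + a.equals(b));
    }
}
```

Reading the UTF-8 form through the `InputStream` overload would decode it as
ISO-8859-1 and corrupt the value, so any code that loads converted files needs
to go through a `Reader` with an explicit charset.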
On Thu, 10 Apr 2025 08:08:02 GMT, Eirik Bjørsnøs wrote:
>> If anything, I might be a bit worried that there are more incorrect
>> conversions stemming from this PR, that my automated tools and manual
>> scanning have not revealed.
>
> Some observations:
>
> 1: This PR seems to have been abondo
On Thu, 10 Apr 2025 19:06:35 GMT, Eirik Bjørsnøs wrote:
> (BTW, I enjoyed seeing separate commits for the encoding and BOM changes,
> makes it easier to verify each!)
Thanks! I very much like reviewing PRs that have separate logical
commits, so I try to produce such commits myself. I'm glad t
On Thu, 10 Apr 2025 18:30:22 GMT, Eirik Bjørsnøs wrote:
>> If this is a French name, it's e acute: é.
>
>> If this is a French name, it's e acute: é.
>
> Supported by this Wikipedia page listing S.L as an LCMS developer:
>
> https://en.wikipedia.org/wiki/Little_CMS
It's not a mistake in capita