Integrated: 8255244: HttpClient: Response headers contain incorrectly encoded Unicode characters

Daniel Fuchs Fri, 13 Nov 2020 07:17:23 -0800

On Wed, 11 Nov 2020 16:45:49 GMT, Daniel Fuchs <[email protected]> wrote:


> The HTTP/1.1 Header Parser of the new HttpClient currently assumes that all 
> headers (names and values) are US-ASCII and as a result mis-decodes any byte 
> whose value is > 127. For instance, 0x80 (128) gets decoded as U+FF80 
> (65408) instead of being either rejected or decoded as U+0080.
> 
> Historically, HTTP has allowed field content with text in the ISO-8859-1 
> charset.  The ISO-8859-1 charset is also supported by `HttpURLConnection`.
> 
> We could decide to reject responses whose headers contain non-US-ASCII 
> characters out of hand, but for compatibility reasons, it seems preferable to 
> interpret and accept any byte > 127 in header values as an ISO-8859-1 (Latin 
> 1) character.
> For backward compatibility, this change proposes to update the HTTP/1.1 
> Header Parser to support ISO-8859-1 encoding.
> The HTTP/1.1 Header Parser will now apply the same validation that is already 
> applied by the HTTP/2 stack.
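
For readers unfamiliar with how 0x80 can end up as U+FF80, here is a minimal
sketch of one plausible cause: a signed Java byte that is cast straight to
char is sign-extended first. This is an illustration only, not the actual
parser code from the changeset; the class and method names
(HeaderByteDecoding, signExtended, latin1, decodeLatin1) are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not taken from the HttpClient sources.
public class HeaderByteDecoding {

    // Casting a signed byte directly to char sign-extends it first,
    // so 0x80 (-128 as a byte) becomes U+FF80.
    static char signExtended(byte b) {
        return (char) b;
    }

    // Masking with 0xFF yields the unsigned value 0..255, which is
    // exactly the ISO-8859-1 code point for that byte.
    static char latin1(byte b) {
        return (char) (b & 0xFF);
    }

    // Decoding a whole header value as ISO-8859-1 with the standard charset.
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte b = (byte) 0x80;
        System.out.printf("U+%04X%n", (int) signExtended(b)); // U+FF80
        System.out.printf("U+%04X%n", (int) latin1(b));       // U+0080

        // "caf" followed by 0xE9, which is U+00E9 in ISO-8859-1.
        byte[] value = { 'c', 'a', 'f', (byte) 0xE9 };
        System.out.println(decodeLatin1(value).equals("caf\u00E9")); // true
    }
}
```

Because ISO-8859-1 maps each byte value 0..255 to the code point of the same
number, the masked cast and the charset-based decoding agree byte for byte.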

This pull request has now been integrated.

Changeset: 1c47244b
Author:    Daniel Fuchs <[email protected]>
URL:       https://git.openjdk.java.net/jdk/commit/1c47244b
Stats:     561 lines in 6 files changed: 535 ins; 0 del; 26 mod

8255244: HttpClient: Response headers contain incorrectly encoded Unicode 
characters

Reviewed-by: chegar, michaelm

-------------

PR: https://git.openjdk.java.net/jdk/pull/1169
