On 5/6/25 2:06 AM, Willy Tarreau wrote:
> On Fri, May 02, 2025 at 04:54:47PM -0400, Demi Marie Obenour wrote:
>> On 5/1/25 1:40 AM, Willy Tarreau wrote:
>>> Hi,
>>>
>>> On Thu, May 01, 2025 at 01:08:22AM -0400, Demi Marie Obenour wrote:
>>>> I noticed that if HAProxy receives a message containing \x01 in a
>>>> field value, it will happily forward that message.  This appears
>>>> to be permitted by RFC9113 and RFC9114.  However, it might cause
>>>> a problem if the message is a response, the response is sent to
>>>> the client over HTTP/3, and the client uses nghttp3.  nghttp3 will
>>>> silently discard the field, causing HAProxy and the client to
>>>> disagree on whether the field exists.
>>>>
>>>> Is this a serious concern, or is it considered too theoretical to
>>>> matter?  From my reading of 
>>>> https://github.com/ngtcp2/nghttp3/discussions/346,
>>>> it seems that the nghttp3 maintainer considers this to be working
>>>> as intended (a valid decision).  Is there anything that should be
>>>> be done on the HAProxy side, or is HAProxy thinking that a header
>>>> exists when a client thinks it does not expected behavior?
>>>
>>> When H2 begun, some people saw a great opportunity in its ability
>>> to be fully binary-transparent, while others (like us) dealing with
>>> intermediaries saw a big danger in it. From what I remember (but I
>>> could be wrong, I'd need to recheck), the spec only requires that
>>> chars are filtered when converting from H2 to H1. And IIRC in the
>>> end only CR, LF and NUL are forbidden in header values in H2, as
>>> a protective measure for H2 to H1 translation, while HPACK takes
>>> care of permitting absolutely everything.
>>>
>>> The thing is that full H2 applications are explicitly permitted
>>> to rely on this. For example you could imagine sending payload's
>>> SHA256 digests directly in a header, escaping NUL with \0, LF
>>> with \n, CR with \r and '\' with '\\' an keep it compact like
>>> this. So while we have to perform the cleanup for H2,H3 -> H1,
>>> we cannot afford to be excessive and go beyond the spec where
>>> applications expect it to work as permitted by the spec.
>>
>> Which specification are you referring to?  RFC9110 limits the allowed
>> characters in an HTTP field value to 0x9, 0x20 ... 0x7E, and 0x80 ... 0xFF,
>> with the additional requirement that a field cannot start or end with
>> 0x9 or 0x20.  RFC9112 (HTTP/1.1), RFC9113 (HTTP/2), and RFC9114 (HTTP/3)
>> all refer to RFC9110 for permitted field values.  As far as I can tell,
>> the list of permitted characters has not changed since RFC2068
>> (January 1997).
> 
> Note, HTTP/2 was first specified in 7540, whose text was less clear
> than it currently is since by then there was no HTTP spec but an HTTP/1
> spec. For example that spec explicitly allowed "values in the same
> format as in 7230", except that 7230 permitted the obsolete header
> folding, thus LF or CRLF followed by an LWS, causing ambiguity
> depending how you read the spec or what existing pieces of code you
> would reuse for this purpose.
> 
> In 9113 it's much clearer: #8.2.1 Field Validity says:
> 
>  * A field value MUST NOT contain the zero value (ASCII NUL, 0x00),
>    line feed (ASCII LF, 0x0a), or carriage return (ASCII CR, 0x0d)
>    at any position.
> 
>  * A field value MUST NOT start or end with an ASCII whitespace
>    character (ASCII SP or HTAB, 0x20 or 0x09).
> 
> This part was essentially clarified as part of this work:
> 
>     https://github.com/httpwg/http2-spec/issues/815
>     https://github.com/httpwg/http2-spec/pull/846
> 
> The discussion about CTL existing in the wild is here:
> 
>     https://github.com/httpwg/http-core/issues/683
> 
> (and I've personally seen quite a number of them). We've even had
> complaints for not being binary-transparent on non-standard headers,
> from users who used to place raw binary (e.g. gzip, SHA256 or encrypted
> blocks) directly in a their own header fields between their client and
> their servers, and complaining that placing haproxy in the middle would
> break. At least with the clarifications above there's less room for the
> doubt. NUL, CR, LF are forbidden in values and there's no way around
> this. There have been some provisions for allowing to replace CR/LF/NUL
> with spaces for some time but I'm not seeing that in the final spec. I
> think it came from what the WHATWG does (the group that deals with how
> browsers should behave, because they're *way* more tolerant than any
> other component, due to having to work around errors everywhere). In
> any case, replacing with a space at the beginning or end of the value
> still is forbidden.
> 
> I tried, before 9113 was released, to introduce some extra validation
> of the pseudo-headers before reassembly, here:
> 
>     https://github.com/httpwg/http2-spec/pull/936
>     https://github.com/httpwg/http2-spec/issues/1015
> 
> It was generally welcome but a bit too late and the spec was emitted
> without it, the only thing we could save was a non-normative warning
> for implementers:
> 
>     https://github.com/httpwg/http2-spec/pull/1017
> 
> But that doesn't mean we cannot improve, like your suggestions about
> unwanted characters such as '/#?\\[]@\0' in :authority that I'm still
> on!

Thank you!

I think it would be best to validate Host and :authority identically.
The reason is that user configuration accesses :authority via the host
header.  For instance,

    http-request set-uri https://%[req.hdr(host)]%[pathq]

(from https://www.mail-archive.com/haproxy@formilux.org/msg43261.html)
is only safe if host is validated.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)

Attachment: OpenPGP_0xB288B55FFF9C22C1.asc
Description: OpenPGP public key

Attachment: OpenPGP_signature.asc
Description: OpenPGP digital signature

Reply via email to