On 5/6/25 2:06 AM, Willy Tarreau wrote: > On Fri, May 02, 2025 at 04:54:47PM -0400, Demi Marie Obenour wrote: >> On 5/1/25 1:40 AM, Willy Tarreau wrote: >>> Hi, >>> >>> On Thu, May 01, 2025 at 01:08:22AM -0400, Demi Marie Obenour wrote: >>>> I noticed that if HAProxy receives a message containing \x01 in a >>>> field value, it will happily forward that message. This appears >>>> to be permitted by RFC9113 and RFC9114. However, it might cause >>>> a problem if the message is a response, the response is sent to >>>> the client over HTTP/3, and the client uses nghttp3. nghttp3 will >>>> silently discard the field, causing HAProxy and the client to >>>> disagree on whether the field exists. >>>> >>>> Is this a serious concern, or is it considered too theoretical to >>>> matter? From my reading of >>>> https://github.com/ngtcp2/nghttp3/discussions/346, >>>> it seems that the nghttp3 maintainer considers this to be working >>>> as intended (a valid decision). Is there anything that should be >>>> be done on the HAProxy side, or is HAProxy thinking that a header >>>> exists when a client thinks it does not expected behavior? >>> >>> When H2 begun, some people saw a great opportunity in its ability >>> to be fully binary-transparent, while others (like us) dealing with >>> intermediaries saw a big danger in it. From what I remember (but I >>> could be wrong, I'd need to recheck), the spec only requires that >>> chars are filtered when converting from H2 to H1. And IIRC in the >>> end only CR, LF and NUL are forbidden in header values in H2, as >>> a protective measure for H2 to H1 translation, while HPACK takes >>> care of permitting absolutely everything. >>> >>> The thing is that full H2 applications are explicitly permitted >>> to rely on this. For example you could imagine sending payload's >>> SHA256 digests directly in a header, escaping NUL with \0, LF >>> with \n, CR with \r and '\' with '\\' an keep it compact like >>> this. So while we have to perform the cleanup for H2,H3 -> H1, >>> we cannot afford to be excessive and go beyond the spec where >>> applications expect it to work as permitted by the spec. >> >> Which specification are you referring to? RFC9110 limits the allowed >> characters in an HTTP field value to 0x9, 0x20 ... 0x7E, and 0x80 ... 0xFF, >> with the additional requirement that a field cannot start or end with >> 0x9 or 0x20. RFC9112 (HTTP/1.1), RFC9113 (HTTP/2), and RFC9114 (HTTP/3) >> all refer to RFC9110 for permitted field values. As far as I can tell, >> the list of permitted characters has not changed since RFC2068 >> (January 1997). > > Note, HTTP/2 was first specified in 7540, whose text was less clear > than it currently is since by then there was no HTTP spec but an HTTP/1 > spec. For example that spec explicitly allowed "values in the same > format as in 7230", except that 7230 permitted the obsolete header > folding, thus LF or CRLF followed by an LWS, causing ambiguity > depending how you read the spec or what existing pieces of code you > would reuse for this purpose. > > In 9113 it's much clearer: #8.2.1 Field Validity says: > > * A field value MUST NOT contain the zero value (ASCII NUL, 0x00), > line feed (ASCII LF, 0x0a), or carriage return (ASCII CR, 0x0d) > at any position. > > * A field value MUST NOT start or end with an ASCII whitespace > character (ASCII SP or HTAB, 0x20 or 0x09). > > This part was essentially clarified as part of this work: > > https://github.com/httpwg/http2-spec/issues/815 > https://github.com/httpwg/http2-spec/pull/846 > > The discussion about CTL existing in the wild is here: > > https://github.com/httpwg/http-core/issues/683 > > (and I've personally seen quite a number of them). We've even had > complaints for not being binary-transparent on non-standard headers, > from users who used to place raw binary (e.g. gzip, SHA256 or encrypted > blocks) directly in a their own header fields between their client and > their servers, and complaining that placing haproxy in the middle would > break. At least with the clarifications above there's less room for the > doubt. NUL, CR, LF are forbidden in values and there's no way around > this. There have been some provisions for allowing to replace CR/LF/NUL > with spaces for some time but I'm not seeing that in the final spec. I > think it came from what the WHATWG does (the group that deals with how > browsers should behave, because they're *way* more tolerant than any > other component, due to having to work around errors everywhere). In > any case, replacing with a space at the beginning or end of the value > still is forbidden. > > I tried, before 9113 was released, to introduce some extra validation > of the pseudo-headers before reassembly, here: > > https://github.com/httpwg/http2-spec/pull/936 > https://github.com/httpwg/http2-spec/issues/1015 > > It was generally welcome but a bit too late and the spec was emitted > without it, the only thing we could save was a non-normative warning > for implementers: > > https://github.com/httpwg/http2-spec/pull/1017 > > But that doesn't mean we cannot improve, like your suggestions about > unwanted characters such as '/#?\\[]@\0' in :authority that I'm still > on!
Thank you! I think it would be best to validate Host and :authority identically. The reason is that user configuration accesses :authority via the host header. For instance, http-request set-uri https://%[req.hdr(host)]%[pathq] (from https://www.mail-archive.com/haproxy@formilux.org/msg43261.html) is only safe if host is validated. -- Sincerely, Demi Marie Obenour (she/her/hers)
OpenPGP_0xB288B55FFF9C22C1.asc
Description: OpenPGP public key
OpenPGP_signature.asc
Description: OpenPGP digital signature