New submission from Riccardo Schirone <rschi...@redhat.com>:

Copy-pasted from https://bugs.python.org/issue30458#msg347282

================
The commit b7378d77289c911ca6a0c0afaf513879002df7d5 is incomplete: it doesn't seem to check for control characters in the "host" part of the URL, only in the "path" part of the URL. Example:

---
try:
    from urllib import request as urllib_request
except ImportError:
    import urllib2 as urllib_request
import socket

def bug(*args):
    raise Exception(args)

# urlopen() must not call create_connection()
socket.create_connection = bug

urllib_request.urlopen('http://127.0.0.1\r\n\x20hihi\r\n :11211')
---

The URL comes from the first message of this issue:
https://bugs.python.org/issue30458#msg294360

Development branches 2.7 and master produce a similar output:

---
Traceback (most recent call last):
  ...
Exception: (('127.0.0.1\r\n hihi\r\n ', 11211), ..., None)
---

So urllib2/urllib.request actually attempts a real network connection (DNS query), whereas it should reject control characters in the "host" part of the URL.

***

A second problem comes into play. Some C libraries like glibc strip the end of the hostname (strip at the first newline character), so HTTP header injection is still possible in this case:
https://bugzilla.redhat.com/show_bug.cgi?id=1673465

***

According to RFC 3986, the "host" grammar doesn't allow any control character. It looks like:

   host        = IP-literal / IPv4address / reg-name

   ALPHA (letters)
   DIGIT (decimal digits)
   unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
   pct-encoded = "%" HEXDIG HEXDIG
   sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
               / "*" / "+" / "," / ";" / "="
   reg-name    = *( unreserved / pct-encoded / sub-delims )

   IP-literal  = "[" ( IPv6address / IPvFuture  ) "]"
   IPvFuture   = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )

   IPv6address =                            6( h16 ":" ) ls32
               /                       "::" 5( h16 ":" ) ls32
               / [               h16 ] "::" 4( h16 ":" ) ls32
               / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
               / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
               / [ *3( h16 ":" ) h16 ] "::"    h16 ":"   ls32
               / [ *4( h16 ":" ) h16 ] "::"              ls32
               / [ *5( h16 ":" ) h16 ] "::"              h16
               / [ *6( h16 ":" ) h16 ] "::"

   h16         = 1*4HEXDIG
   ls32        = ( h16 ":" h16 ) / IPv4address
   IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet
================

CVE-2019-18348 was assigned to this flaw, which is similar to CVE-2019-9947 and CVE-2019-9740, but it concerns the *host* part of a URL.

----------
messages: 355294
nosy: rschiron
priority: normal
severity: normal
status: open
title: CVE-2019-18348 CRLF injection via the host part of the url passed to urlopen()
type: security

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue38576>
_______________________________________
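
For illustration only, the reg-name production from the RFC 3986 grammar quoted in the report can be turned into a small check that rejects the host from the reproducer before any connection is attempted. This is a minimal sketch, not the fix applied to CPython: the names is_valid_reg_name and check_host are hypothetical, it assumes Python 3 (re.fullmatch), and it deliberately does not handle bracketed IP-literals.

```python
import re

# reg-name = *( unreserved / pct-encoded / sub-delims ), as in the RFC 3986
# grammar quoted in the report. Bracketed IP-literals ("[...]") are NOT
# covered by this sketch; dotted IPv4 addresses happen to match reg-name.
_REG_NAME_RE = re.compile(r"(?:[A-Za-z0-9._~!$&'()*+,;=-]|%[0-9A-Fa-f]{2})*")


def is_valid_reg_name(host):
    """Return True if host matches the reg-name production (hypothetical helper)."""
    return _REG_NAME_RE.fullmatch(host) is not None


def check_host(host):
    """Raise ValueError for a host with disallowed characters (hypothetical helper)."""
    if not is_valid_reg_name(host):
        raise ValueError("invalid characters in host: %r" % (host,))
    return host
```

With this check, the host taken from the traceback in the report, '127.0.0.1\r\n hihi\r\n ', is rejected because CR, LF and space are outside the allowed set, while '127.0.0.1' passes.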