[issue16564] email.generator.BytesGenerator fails with bytes payload
Alexander Kruppa added the comment:

It seems to me that this issue has not been fixed correctly yet. I tried Python 3.3.2:

    ~/build/Python-3.3.2$ ./python --version
    Python 3.3.2

When modifying the test case in Lib/test/test_email/test_email.py like this:

    --- Lib/test/test_email/test_email.py 2013-05-15 18:32:55.0 +0200
    +++ Lib/test/test_email/test_email_mine.py 2013-09-10 14:22:08.160089440 +0200
    @@ -1461,17 +1461,17 @@
     # Issue 16564: This does not produce an RFC valid message, since to be
     # valid it should have a CTE of binary. But the below works in
     # Python2, and is documented as working this way.
    -bytesdata = b'\xfa\xfb\xfc\xfd\xfe\xff'
    +bytesdata = b'\x0b\xfa\xfb\xfc\xfd\xfe\xff'
     msg = MIMEApplication(bytesdata, _encoder=encoders.encode_noop)
     # Treated as a string, this will be invalid code points.
    -self.assertEqual(msg.get_payload(), '\uFFFD' * len(bytesdata))
    +# self.assertEqual(msg.get_payload(), '\uFFFD' * len(bytesdata))
     self.assertEqual(msg.get_payload(decode=True), bytesdata)
     s = BytesIO()
     g = BytesGenerator(s)
     g.flatten(msg)
     wireform = s.getvalue()
     msg2 = email.message_from_bytes(wireform)
    -self.assertEqual(msg.get_payload(), '\uFFFD' * len(bytesdata))
    +# self.assertEqual(msg.get_payload(), '\uFFFD' * len(bytesdata))
     self.assertEqual(msg2.get_payload(decode=True), bytesdata)

then running:

    ./python ./Tools/scripts/run_tests.py test_email

results in:

    == FAIL: test_binary_body_with_encode_noop (test_email_mine.TestMIMEApplication)
    --
    Traceback (most recent call last):
      File "/localdisk/kruppaal/build/Python-3.3.2/Lib/test/test_email/test_email_mine.py", line 1475, in test_binary_body_with_encode_noop
        self.assertEqual(msg2.get_payload(decode=True), bytesdata)
    AssertionError: b'\x0b\n\xfa\xfb\xfc\xfd\xfe\xff' != b'\x0b\xfa\xfb\xfc\xfd\xfe\xff'

The '\x0b' byte is incorrectly translated to '\x0b\n', i.e., a newline character is inserted.

Encoding the byte array bytes(range(256)) produces the following output data (MIME header stripped):

    000: 0001 0203 0405 0607 0809 0a0b 0a0c 0a0a
    010: 0e0f 1011 1213 1415 1617 1819 1a1b 1c0a
    020: 1d0a 1e0a 1f20 2122 2324 2526 2728 292a  . !"#$%&'()*
    030: 2b2c 2d2e 2f30 3132 3334 3536 3738 393a  +,-./0123456789:
    040: 3b3c 3d3e 3f40 4142 4344 4546 4748 494a  ;<=>?@ABCDEFGHIJ
    050: 4b4c 4d4e 4f50 5152 5354 5556 5758 595a  KLMNOPQRSTUVWXYZ
    060: 5b5c 5d5e 5f60 6162 6364 6566 6768 696a  [\]^_`abcdefghij
    070: 6b6c 6d6e 6f70 7172 7374 7576 7778 797a  klmnopqrstuvwxyz
    080: 7b7c 7d7e 7f80 8182 8384 8586 8788 898a  {|}~
    090: 8b8c 8d8e 8f90 9192 9394 9596 9798 999a
    0a0: 9b9c 9d9e 9fa0 a1a2 a3a4 a5a6 a7a8 a9aa
    0b0: abac adae afb0 b1b2 b3b4 b5b6 b7b8 b9ba
    0c0: bbbc bdbe bfc0 c1c2 c3c4 c5c6 c7c8 c9ca
    0d0: cbcc cdce cfd0 d1d2 d3d4 d5d6 d7d8 d9da
    0e0: dbdc ddde dfe0 e1e2 e3e4 e5e6 e7e8 e9ea
    0f0: ebec edee eff0 f1f2 f3f4 f5f6 f7f8 f9fa
    100: fbfc fdfe ff                             .

That is, a '\n' is inserted after '\x0b', '\x1c', '\x1d', and '\x1e', and '\x0d' is replaced by '\n\n'.

--
___
Python tracker <http://bugs.python.org/issue16564>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
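For readers who want to try the round trip outside the test suite, the modified test case can be condensed into a standalone script along the following lines. This is only a sketch: the helper name roundtrip() is made up for illustration, and whether the payload comes back changed depends on the Python version in use, so no particular outcome is claimed here.

```python
from email import encoders, message_from_bytes
from email.generator import BytesGenerator
from email.mime.application import MIMEApplication
from io import BytesIO


def roundtrip(bytesdata):
    """Flatten a bytes payload with BytesGenerator, parse the wire form
    back, and return the decoded payload (hypothetical helper)."""
    msg = MIMEApplication(bytesdata, _encoder=encoders.encode_noop)
    buf = BytesIO()
    BytesGenerator(buf).flatten(msg)
    parsed = message_from_bytes(buf.getvalue())
    return parsed.get_payload(decode=True)


if __name__ == "__main__":
    original = b'\x0b\xfa\xfb\xfc\xfd\xfe\xff'
    result = roundtrip(original)
    # Report rather than assert: the result differs between Python versions.
    if result == original:
        print("payload unchanged")
    else:
        print("payload changed: %r" % result)
```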
[issue19003] email.generator.BytesGenerator corrupts data by changing line endings
New submission from Alexander Kruppa:

This is a follow-up to #16564. In that issue, BytesGenerator was changed to accept a bytes payload; however, processing binary data that way leads to data corruption.

Repost of the update I posted in #16564:

***

    ~/build/Python-3.3.2$ ./python --version
    Python 3.3.2

When modifying the test case in Lib/test/test_email/test_email.py like this:

    --- Lib/test/test_email/test_email.py 2013-05-15 18:32:55.0 +0200
    +++ Lib/test/test_email/test_email_mine.py 2013-09-10 14:22:08.160089440 +0200
    @@ -1461,17 +1461,17 @@
     # Issue 16564: This does not produce an RFC valid message, since to be
     # valid it should have a CTE of binary. But the below works in
     # Python2, and is documented as working this way.
    -bytesdata = b'\xfa\xfb\xfc\xfd\xfe\xff'
    +bytesdata = b'\x0b\xfa\xfb\xfc\xfd\xfe\xff'
     msg = MIMEApplication(bytesdata, _encoder=encoders.encode_noop)
     # Treated as a string, this will be invalid code points.
    -self.assertEqual(msg.get_payload(), '\uFFFD' * len(bytesdata))
    +# self.assertEqual(msg.get_payload(), '\uFFFD' * len(bytesdata))
     self.assertEqual(msg.get_payload(decode=True), bytesdata)
     s = BytesIO()
     g = BytesGenerator(s)
     g.flatten(msg)
     wireform = s.getvalue()
     msg2 = email.message_from_bytes(wireform)
    -self.assertEqual(msg.get_payload(), '\uFFFD' * len(bytesdata))
    +# self.assertEqual(msg.get_payload(), '\uFFFD' * len(bytesdata))
     self.assertEqual(msg2.get_payload(decode=True), bytesdata)

then running:

    ./python ./Tools/scripts/run_tests.py test_email

results in:

    == FAIL: test_binary_body_with_encode_noop (test_email_mine.TestMIMEApplication)
    --
    Traceback (most recent call last):
      File "/localdisk/kruppaal/build/Python-3.3.2/Lib/test/test_email/test_email_mine.py", line 1475, in test_binary_body_with_encode_noop
        self.assertEqual(msg2.get_payload(decode=True), bytesdata)
    AssertionError: b'\x0b\n\xfa\xfb\xfc\xfd\xfe\xff' != b'\x0b\xfa\xfb\xfc\xfd\xfe\xff'

The '\x0b' byte is incorrectly translated to '\x0b\n', i.e., a newline character is inserted.

Encoding the byte array bytes(range(256)) produces the following output data (MIME header stripped):

    000: 0001 0203 0405 0607 0809 0a0b 0a0c 0a0a
    010: 0e0f 1011 1213 1415 1617 1819 1a1b 1c0a
    020: 1d0a 1e0a 1f20 2122 2324 2526 2728 292a  . !"#$%&'()*
    030: 2b2c 2d2e 2f30 3132 3334 3536 3738 393a  +,-./0123456789:
    040: 3b3c 3d3e 3f40 4142 4344 4546 4748 494a  ;<=>?@ABCDEFGHIJ
    050: 4b4c 4d4e 4f50 5152 5354 5556 5758 595a  KLMNOPQRSTUVWXYZ
    060: 5b5c 5d5e 5f60 6162 6364 6566 6768 696a  [\]^_`abcdefghij
    070: 6b6c 6d6e 6f70 7172 7374 7576 7778 797a  klmnopqrstuvwxyz
    080: 7b7c 7d7e 7f80 8182 8384 8586 8788 898a  {|}~
    090: 8b8c 8d8e 8f90 9192 9394 9596 9798 999a
    0a0: 9b9c 9d9e 9fa0 a1a2 a3a4 a5a6 a7a8 a9aa
    0b0: abac adae afb0 b1b2 b3b4 b5b6 b7b8 b9ba
    0c0: bbbc bdbe bfc0 c1c2 c3c4 c5c6 c7c8 c9ca
    0d0: cbcc cdce cfd0 d1d2 d3d4 d5d6 d7d8 d9da
    0e0: dbdc ddde dfe0 e1e2 e3e4 e5e6 e7e8 e9ea
    0f0: ebec edee eff0 f1f2 f3f4 f5f6 f7f8 f9fa
    100: fbfc fdfe ff                             .

That is, a '\n' is inserted after '\x0b', '\x1c', '\x1d', and '\x1e', and '\x0d' is replaced by '\n\n'.

***

I suspect this is due to the use of self._write_lines(msg._payload) in BytesGenerator._handle_text(), since _write_lines() mangles line endings.

--
components: email
messages: 197476
nosy: Alexander.Kruppa, barry, r.david.murray
priority: normal
severity: normal
status: open
title: email.generator.BytesGenerator corrupts data by changing line endings
type: behavior
versions: Python 3.2, Python 3.3
___
Python tracker <http://bugs.python.org/issue19003>
___
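The suspicion about _write_lines() can be illustrated without the email package at all. The sketch below is a simplified model, not the library's code: the function name rewrite_lines() is invented here, and it assumes the generator splits the payload with str.splitlines() and re-terminates each line with a single '\n'. Since str.splitlines() treats '\x0b', '\x0c', '\x1c', '\x1d' and '\x1e' as line boundaries in addition to '\r' and '\n', such a model reproduces the first bytes of the dump above.

```python
def rewrite_lines(text, linesep="\n"):
    """Simplified model (an assumption, not the stdlib implementation):
    split on every boundary str.splitlines() knows, strip only CR/LF,
    then terminate each line with linesep."""
    lines = text.splitlines(keepends=True)
    pieces = []
    for line in lines[:-1]:
        pieces.append(line.rstrip("\r\n") + linesep)
    if lines:
        last = lines[-1]
        stripped = last.rstrip("\r\n")
        pieces.append(stripped)
        # Only re-add a terminator if the last line had a CR/LF ending.
        if stripped != last:
            pieces.append(linesep)
    return "".join(pieces)


# Bytes 0x00..0x1f, decoded the same way an ASCII-only payload would be.
control_chars = bytes(range(0x20)).decode("ascii")
mangled = rewrite_lines(control_chars).encode("ascii")
print(mangled.hex())
```

Under this model, any fix has to restrict line splitting to '\r' and '\n', or bypass line handling entirely for binary payloads.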
[issue16564] email.generator.BytesGenerator fails with bytes payload
Alexander Kruppa added the comment: Opened #19003.
[issue16564] email.generator.BytesGenerator fails with bytes payload
New submission from Alexander Kruppa:

I'm trying to use the email.* functions to craft HTTP POST data for file upload. Trying something like

    import email.encoders
    from email.generator import BytesGenerator
    from email.mime.application import MIMEApplication
    from email.mime.multipart import MIMEMultipart
    from io import BytesIO

    filedata = open("data", "rb").read()
    postdata = MIMEMultipart()
    fileattachment = MIMEApplication(filedata, _encoder=email.encoders.encode_noop)
    postdata.attach(fileattachment)
    fp = BytesIO()
    g = BytesGenerator(fp)
    g.flatten(postdata, unixfrom=False)

fails with

    Traceback (most recent call last):
      File "./minetest.py", line 30, in <module>
        g.flatten(postdata, unixfrom=False)
      File "/usr/lib/python3.2/email/generator.py", line 91, in flatten
        self._write(msg)
      File "/usr/lib/python3.2/email/generator.py", line 137, in _write
        self._dispatch(msg)
      File "/usr/lib/python3.2/email/generator.py", line 163, in _dispatch
        meth(msg)
      File "/usr/lib/python3.2/email/generator.py", line 224, in _handle_multipart
        g.flatten(part, unixfrom=False, linesep=self._NL)
      File "/usr/lib/python3.2/email/generator.py", line 91, in flatten
        self._write(msg)
      File "/usr/lib/python3.2/email/generator.py", line 137, in _write
        self._dispatch(msg)
      File "/usr/lib/python3.2/email/generator.py", line 163, in _dispatch
        meth(msg)
      File "/usr/lib/python3.2/email/generator.py", line 192, in _handle_text
        raise TypeError('string payload expected: %s' % type(payload))
    TypeError: string payload expected: <class 'bytes'>

This is because BytesGenerator._handle_text() expects a str payload in which non-ASCII byte values have been replaced by surrogates. However, the example above creates a bytes payload, for which super(BytesGenerator, self)._handle_text(msg), i.e. Generator._handle_text(msg), raises the exception.

Note that using any of the email.encoders other than encode_noop does not really fit the HTTP POST bill, as those set a Content-Transfer-Encoding that HTTP does not know.
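To make the surrogate representation mentioned above concrete, here is a minimal sketch of how arbitrary bytes map to a str and back with the 'surrogateescape' error handler. The sample bytes are arbitrary; nothing here is specific to the email package.

```python
raw = bytes([0x41, 0xFA, 0xFF])  # 'A' followed by two non-ASCII bytes

# Each non-ASCII byte 0xNN becomes the lone surrogate U+DCNN.
text = raw.decode("ascii", "surrogateescape")
print(ascii(text))

# Encoding with the same error handler inverts the mapping exactly.
restored = text.encode("ascii", "surrogateescape")
print(restored == raw)
```

This is the inversion a str payload has to survive, which is why an application holding bytes would have to decode them in exactly this way before handing them to the generator.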

It would seem better to me to let BytesGenerator accept a bytes payload and just copy that to the output, rather than making the application encode the bytes as a string, hopefully in a way that s.encode('ascii', 'surrogateescape') can invert.

E.g., a workaround class I use now does

    class FixedBytesGenerator(BytesGenerator):
        def _handle_bytes(self, msg):
            payload = msg.get_payload()
            if payload is None:
                return
            if isinstance(payload, bytes):
                self._fp.write(payload)
            elif isinstance(payload, str):
                super(FixedBytesGenerator, self)._handle_text(msg)
            else:
                # Payload is neither bytes nor str - this can't be right
                raise TypeError('bytes or str payload expected: %s' % type(payload))

        _writeBody = _handle_bytes

--
components: Library (Lib)
messages: 176476
nosy: Alexander.Kruppa
priority: normal
severity: normal
status: open
title: email.generator.BytesGenerator fails with bytes payload
type: behavior
versions: Python 3.2
[issue19435] Directory traversal attack for CGIHTTPRequestHandler
New submission from Alexander Kruppa:

An error in separating the path and filename of the CGI script to run in http.server.CGIHTTPRequestHandler allows running arbitrary executables in the directory under which the server was started.

The problem is that in CGIHTTPRequestHandler we have:

    def run_cgi(self):
        """Execute a CGI script."""
        path = self.path
        dir, rest = self.cgi_info
        i = path.find('/', len(dir) + 1)

where path is the uncollapsed path in the URL, but cgi_info contains the first path segment and the rest from the *collapsed* path as filled in by is_cgi(), so indexing into path via len(dir) is incorrect.

An example exploit is giving the request path:

    ///badscript.sh/../cgi-bin/cgi.sh

Note that Firefox and wget at least simplify the path in the request; to make sure this exact path is used, do for example:

    (echo "GET ///badscript.sh/../cgi-bin/cgi.sh HTTP/1.1"; echo) | telnet localhost 4443

This causes the CGIHTTPRequestHandler to execute the badscript.sh file in the directory in which the server was started, so script execution is not restricted to the cgi-bin/ or htbin/ subdirectories.

--
components: Library (Lib)
messages: 201645
nosy: Alexander.Kruppa
priority: normal
severity: normal
status: open
title: Directory traversal attack for CGIHTTPRequestHandler
type: security
versions: Python 3.2
___
Python tracker <http://bugs.python.org/issue19435>
___
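The index mismatch described above can be shown in isolation. The sketch below does not use http.server: split_collapsed() is a hypothetical stand-in for what is_cgi() stores in cgi_info, and posixpath.normpath() is used here only as an assumed approximation of the collapsing step. It shows only that an offset computed from the collapsed directory lands on an unrelated prefix of the raw path. It does not reproduce the exploit itself.

```python
import posixpath


def split_collapsed(url_path):
    """Hypothetical stand-in for is_cgi(): collapse the path, then split
    it into the first segment and the rest."""
    collapsed = posixpath.normpath(url_path)
    sep = collapsed.find("/", 1)
    return collapsed[:sep], collapsed[sep + 1:]


raw_path = "///badscript.sh/../cgi-bin/cgi.sh"
cgi_dir, rest = split_collapsed(raw_path)
print(cgi_dir, rest)

# Offset derived from the collapsed directory, applied to the raw path:
i = raw_path.find("/", len(cgi_dir) + 1)
print(raw_path[:i])  # a prefix that has nothing to do with cgi_dir

# Applying the same offset to a path rebuilt from the collapsed parts
# keeps the index and the string consistent.
rebuilt = cgi_dir + "/" + rest
j = rebuilt.find("/", len(cgi_dir) + 1)
print(j)
```

Rebuilding the path from the collapsed pieces before indexing is one possible way to keep the two views in sync; other approaches, such as rejecting requests whose raw and collapsed paths differ, would be a stricter design choice.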