[issue795081] email.Message param parsing problem II
Tony Nelson added the comment: If I understand RFC2822 3.2.2. Quoted characters (heh), unquoting must be done in one pass, so the current replace().replace() is wrong. It will change '\\"' to '"', but it should become '\"' when unquoted. This seems to work: re.sub(r'\\(.)', r'\1', s). I haven't encountered a problem with this; I just came across it while looking at the file Utils.py (Python 2.4, but unchanged in trunk). I will submit a new bug if desired. -- nosy: +tony_nelson Tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue795081> ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
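The difference between two-pass and one-pass unquoting described above can be checked with a short sketch. The function names are hypothetical, chosen for illustration; the two-pass version only mirrors the replace().replace() pattern the comment describes and is not copied from Utils.py.

```python
import re

def unquote_two_pass(s):
    # Mirrors the replace().replace() pattern: the first pass can create
    # a new \" pair that the second pass then wrongly collapses.
    return s.replace('\\\\', '\\').replace('\\"', '"')

def unquote_one_pass(s):
    # Single pass: each backslash consumes exactly the one character after it.
    return re.sub(r'\\(.)', r'\1', s)

sample = '\\\\"'   # three characters: backslash, backslash, double quote
two_pass = unquote_two_pass(sample)   # one character: a double quote
one_pass = unquote_one_pass(sample)   # two characters: backslash, double quote
```

The one-pass version keeps the escaped backslash and leaves the following quote alone, which is the result the comment argues for.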
[issue1522237] _threading_local.py logic error in _localbase __new__
Tony Nelson added the comment: Thanks, Amaury. The new test works here on Python2.6.1, failing without the fix and passing with it. (Without the fix, MyLocal(a=1) passes and MyLocal(1) fails, as expected.) With the fix, _threading_local.py supports positional arguments to subclass __init__, as well as keyword arguments. -- Python tracker <http://bugs.python.org/issue1522237>
[issue5610] email feedparser.py CRLFLF bug: $ vs \Z
New submission from Tony Nelson: feedparser.py does not parse mixed newlines properly. NLCRE_eol, which is used to search for the various newlines at End Of Line, uses $ to match the end of string, but $ also matches just before a trailing \n, due to a wise long-ago patch by the Effbot. This causes feedparser to match '\r\n\n' at '\r\n', and then to remove the last two characters, leaving '\r', thus eating up a line. Such mixed line endings can occur if a message with CRLF line endings is parsed, written out, and then parsed again. When explicitly searching for various newlines, the \Z end-of-string marker should be used instead. There are two improper uses of $ in feedparser.py. I don't see any others in the email package. NLCRE_eol = re.compile('(\r\n|\r|\n)$') should be: NLCRE_eol = re.compile('(\r\n|\r|\n)\Z') and boundary_re also needs the fix. I can write a test. Where exactly should it be put? -- components: Library (Lib) files: feedparser_crlflf.patch keywords: patch messages: 84595 nosy: barry, tony_nelson severity: normal status: open title: email feedparser.py CRLFLF bug: $ vs \Z versions: Python 2.6 Added file: http://bugs.python.org/file13476/feedparser_crlflf.patch Python tracker <http://bugs.python.org/issue5610>
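The $ versus \Z behaviour in the report above can be reproduced outside the email package. This is a minimal sketch, not the feedparser code: the names EOL_DOLLAR, EOL_END and strip_eol are hypothetical, and raw strings are my choice so that \Z is passed to the regex engine unchanged.

```python
import re

# Hypothetical names; the two patterns differ only in the end anchor.
EOL_DOLLAR = re.compile(r'(\r\n|\r|\n)$')
EOL_END = re.compile(r'(\r\n|\r|\n)\Z')

def strip_eol(pattern, line):
    # Remove the matched line ending by length, as the report describes.
    m = pattern.search(line)
    return line[:-len(m.group(1))] if m else line

line = 'body text\r\n\n'
with_dollar = strip_eol(EOL_DOLLAR, line)   # $ matches before the final \n
with_end = strip_eol(EOL_END, line)         # \Z matches only at the very end
```

With $, the two-character '\r\n' is matched and two characters are cut from the end, which leaves a stray '\r'. With \Z, only the final '\n' is removed.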
[issue5610] email feedparser.py CRLFLF bug: $ vs \Z
Tony Nelson added the comment: make test still passes all tests except test_httpservers on my Python 2.6.1 build. The network resource was not enabled and tk is not available. The new test for CRLFLF at the end of a message body is added to Lib/email/test_email at the end of the TestParsers class. It passes with the fix patch and fails without it. What other tests do you want? -- versions: -Python 3.1 Added file: http://bugs.python.org/file13506/feedparser_crlflf_test.patch Python tracker <http://bugs.python.org/issue5610>
[issue5638] test_httpservers fails CGI tests if --enable-shared
New submission from Tony Nelson: test_httpservers fails the CGI tests if Python was built as a shared library (./configure --enable-shared) and not yet installed. To run such a Python without installing it, the command line must define LD_LIBRARY_PATH to point to the build directory. I see that the new environment for the child CGI process still has LD_LIBRARY_PATH set. The child process is not using that when the CGI is invoked. After the new shared Python (or one like it) is installed, the test passes, but the CGIs aren't using the correct copy of Python. I'm doing this with Python 2.6.1, but the version probably doesn't matter. -- components: Tests messages: 84969 nosy: tony_nelson severity: normal status: open title: test_httpservers fails CGI tests if --enable-shared type: behavior versions: Python 2.6 Python tracker <http://bugs.python.org/issue5638>
[issue1555570] email parser incorrectly breaks headers with a CRLF at 8192
Tony Nelson added the comment: The OP's diagnosis of a buffer boundary problem is correct, but incomplete. The problem can be reproduced by calling feedparser.FeedParser.feed() directly, or, as my patch test does, by calling BufferedSubFile.push() directly. The proper fix is for push() to treat a last line ending in CR as a partial line, as it does if no part of a line ending is present. The OP's patch only works when FeedParser is called through the old Parser interface. -- nosy: +tony_nelson Added file: http://bugs.python.org/file13586/feedparser_pushcr_pushlf.patch Python tracker <http://bugs.python.org/issue1555570>
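The idea of treating a trailing CR as a partial line can be sketched with a small stand-in buffer. LineBuffer below is hypothetical and much simpler than BufferedSubFile; it is not the attached patch, only an illustration of holding back the last piece whenever it might be the first half of a CRLF.

```python
import re

# A line is text plus one of \r\n, \r, \n; a final piece may lack an ending.
_PIECES = re.compile(r'[^\r\n]*(?:\r\n|\r|\n)|[^\r\n]+')

class LineBuffer:
    """Hypothetical push-style buffer, for illustration only."""

    def __init__(self):
        self._partial = ''
        self.lines = []

    def push(self, data):
        data = self._partial + data
        self._partial = ''
        parts = _PIECES.findall(data)
        # Hold back the last piece if it has no ending at all, or ends in a
        # bare CR: the matching LF may arrive in the next chunk.
        if parts and not parts[-1].endswith('\n'):
            self._partial = parts.pop()
        self.lines.extend(parts)

    def close(self):
        # At end of input, whatever is held back is a complete line.
        if self._partial:
            self.lines.append(self._partial)
            self._partial = ''

buf = LineBuffer()
buf.push('Subject: hi\r')        # chunk boundary falls inside the CRLF
after_first = list(buf.lines)
buf.push('\nBody\r\n')
after_second = list(buf.lines)
```

Because the first chunk is held back, the CR and LF are rejoined into one line ending instead of being read as two separate line breaks.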
[issue3169] email/header.py doesn't handle Base64 headers that have been insufficiently padded.
Tony Nelson added the comment: Postel's law suggests that, as bad padding can be repaired, decode_header ought to do so. The patch does that, adds a test for it, and alters another test to still properly fail on really bad encoded data. The test doesn't check a single-character encoded string, as such a string does not specify a complete octet and I felt that base64 decoders might reasonably differ on what to do then. The issue exists in Python2.6.1 (where I made it) and trunk. -- keywords: +patch nosy: +tony_nelson Added file: http://bugs.python.org/file13589/header_B_padding.patch Python tracker <http://bugs.python.org/issue3169>
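One way to repair missing padding, sketched here under the assumption that the input is otherwise valid Base64, is to add '=' characters until the length is a multiple of four. The helper name decode_b64_lenient is hypothetical, and this is not the code in header_B_padding.patch.

```python
import base64

def decode_b64_lenient(encoded):
    # Number of '=' characters needed to reach a multiple of four.
    missing = -len(encoded) % 4
    return base64.b64decode(encoded + '=' * missing)

short_by_one = decode_b64_lenient('aGVsbG8')    # properly 'aGVsbG8='
already_padded = decode_b64_lenient('aGVsbG8=')
```

The sketch does not try to handle a lone leftover character, which carries only six bits and so cannot be turned into a complete octet by padding.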
[issue3169] email/header.py doesn't handle Base64 headers that have been insufficiently padded.
Changes by Tony Nelson: -- nosy: +barry Python tracker <http://bugs.python.org/issue3169>
[issue4487] Add utf8 alias for email charsets
Tony Nelson added the comment: This seems entirely reasonable, helpful, and in accord with the mapping of ascii to us-ascii. I recommend accepting this patch or a slightly fancier one that would also do "utf_8". There are probably other encoding names with the same issue of being accepted by Python but not being understood by other email clients. This issue also affects 2.6.1 and 2.7trunk. I haven't checked 3.x. -- nosy: +barry, tony_nelson Python tracker <http://bugs.python.org/issue4487>
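Until such aliases ship with the package, an application can register them itself through email.charset.add_alias(). This is a sketch of a local workaround, not the proposed patch; it uses the modern lowercase module name, and the choice of the two spellings follows the comment above.

```python
from email import charset

# Map the spellings Python's codecs accept onto the canonical MIME name.
for alias in ('utf8', 'utf_8'):
    charset.add_alias(alias, 'utf-8')

cs = charset.Charset('utf8')
canonical = cs.input_charset   # alias resolved when the Charset is created
```

Note that add_alias() changes a module-level registry, so the mapping affects every Charset created afterwards in the same process.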
[issue4487] Add utf8 alias for email charsets
Changes by Tony Nelson: -- versions: +Python 2.6, Python 2.7 Python tracker <http://bugs.python.org/issue4487>
[issue1079] decode_header does not follow RFC 2047
Tony Nelson added the comment: I think the problem is best viewed this way: headers are not being parsed according to RFC2822 and then decoded, so the recognition of encoded words should be looser and should not require whitespace around them, as whitespace is not required in all contexts. Patch and test, tested on 2.6.1, 2.7trunk. The test mostly just reverses the sense of test_rfc2047_without_whitespace(). -- keywords: +patch nosy: +barry, tony_nelson versions: +Python 2.6, Python 2.7 Added file: http://bugs.python.org/file13608/header_encwd_nows.patch Python tracker <http://bugs.python.org/issue1079>
[issue4491] email.Header.decode_header() doesn't work if encoded-word was separated by CRLF
Tony Nelson added the comment: See patch in issue1079. I don't think email.header can require whitespace until it decodes parsed headers, as whitespace is not always required. -- nosy: +barry, tony_nelson versions: +Python 2.7 Python tracker <http://bugs.python.org/issue4491>
[issue1079] decode_header does not follow RFC 2047
Tony Nelson added the comment: The email package does not follow the RFCs in anything to do with header parsing or decoding. This is a known deficiency. So no, I am not thinking of atoms at all -- and neither is email.header.decode_header()! :-( Until email.header actually parses headers into atoms and then decodes atoms, it doesn't matter what parsed atoms would look like. Currently, email.header.decode_header() just stumbles through raw text, and doesn't know if it is looking at atoms or not, or usually even what header the text came from. In order to interpret the RFC correctly, email.header.decode_header() needs either a parser and the name of the header it is decoding, or parsed header data. I think the latter is being considered for a redesign of the email package for 3.1 or 3.2 (3 months to a year or so, and not for 2.x at all), but until then, it is better to decode every likely encoded-word than to skip encoded-words that, for example, have a parenthesis on one side or the other. -- Python tracker <http://bugs.python.org/issue1079>
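The "decode every likely encoded-word" approach amounts to matching the =?charset?encoding?text?= shape wherever it occurs, without anchoring on whitespace. The pattern and helper below are hypothetical, written for illustration; they are not the regex used by email.header or by the attached patch.

```python
import re

# Hypothetical pattern: charset, encoding (B or Q), encoded text.
# It deliberately does not require whitespace on either side.
ENCODED_WORD = re.compile(r'=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=')

def find_encoded_words(text):
    # Return each candidate encoded-word exactly as it appears in the text.
    return [m.group(0) for m in ENCODED_WORD.finditer(text)]

found = find_encoded_words('Re: (=?utf-8?q?caf=C3=A9?=) menu')
```

Here the encoded-word is found even though it sits directly between parentheses, the case the comment gives as an example of what a whitespace-anchored match would skip.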