[issue795081] email.Message param parsing problem II

2008-03-02 Thread Tony Nelson

Tony Nelson added the comment:

If I understand RFC2822 3.2.2. Quoted characters (heh), unquoting must
be done in one pass, so the current replace().replace() is wrong.  It
will change '\\"' to '"', but it should become '\"' when unquoted.

This seems to work:

re.sub(r'\\(.)',r'\1',s)

I haven't encountered a problem with this; I just came across it while
looking at the file Utils.py (Python 2.4, but unchanged in trunk).  I
will submit a new bug if desired.
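
A minimal sketch of the difference, for illustration only.  The function
names unquote_two_pass and unquote_one_pass are made up here, and the
two-pass version is reconstructed from the replace().replace()
description above rather than copied from Utils.py.

```python
import re

def unquote_two_pass(s):
    # Hypothetical reconstruction of the two chained replacements:
    # the second pass re-reads backslashes produced by the first.
    return s.replace('\\\\', '\\').replace('\\"', '"')

def unquote_one_pass(s):
    # Each backslash-escaped character is consumed exactly once.
    # Note: '.' does not match a newline unless re.DOTALL is given,
    # so a backslash before a newline is left alone in this sketch.
    return re.sub(r'\\(.)', r'\1', s)

sample = r'\\"'   # three characters: backslash, backslash, quote

print(repr(unquote_two_pass(sample)))   # one character: the quote
print(repr(unquote_one_pass(sample)))   # two characters: backslash, quote
```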

--
nosy: +tony_nelson


Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue795081>

___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1522237] _threading_local.py logic error in _localbase __new__

2009-03-30 Thread Tony Nelson

Tony Nelson  added the comment:

Thanks, Amaury.  The new test works here on Python2.6.1, failing without
the fix and passing with it.  (Passing MyLocal(a=1) and failing
MyLocal(1), as expected.)  With the fix, _threading_local.py supports
positional arguments to subclass __init__, as well as keyword arguments.
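
A rough sketch of the two call styles mentioned above, assuming a Python
in which the fix is present.  MyLocal here is a stand-in defined for the
example and is not the class from the actual test.

```python
import threading
from _threading_local import local  # pure-Python implementation in the stdlib

class MyLocal(local):
    # Stand-in subclass.  __init__ is re-run with the same arguments
    # in every thread that first touches the object.
    def __init__(self, a=0):
        self.a = a

by_keyword = MyLocal(a=1)    # keyword argument
by_position = MyLocal(1)     # positional argument

seen = []

def worker():
    # A new thread gets its own attribute dict, initialized via __init__.
    seen.append(by_position.a)

t = threading.Thread(target=worker)
t.start()
t.join()
```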

--

___
Python tracker 
<http://bugs.python.org/issue1522237>
___



[issue5610] email feedparser.py CRLFLF bug: $ vs \Z

2009-03-30 Thread Tony Nelson

New submission from Tony Nelson :

feedparser.py does not parse mixed newlines properly.  NLCRE_eol, which
is used to search for the various newlines at End Of Line, uses $ to
match the end of string, but $ also matches \n$, due to a wise long-ago
patch by the Effbot.  This causes feedparser to match '\r\n\n' at
'\r\n', and then to remove the last two characters, leaving '\r', thus
eating up a line.  Such mixed line endings can occur if a message with
CRLF line endings is parsed, written out, and then parsed again.

When explicitly searching for various newlines, the \Z end-of-string
marker should be used instead.  There are two improper uses of $ in
feedparser.py.  I don't see any others in the email package.

NLCRE_eol = re.compile('(\r\n|\r|\n)$')

should be:

NLCRE_eol = re.compile('(\r\n|\r|\n)\Z')

and boundary_re also needs the fix.
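
The effect can be seen in isolation with a short sketch.  The variable
names below are invented for the example, and raw strings are used so
that \Z reaches the regex engine unchanged; this only imitates the
"strip the matched line ending" step and is not feedparser.py itself.

```python
import re

eol_dollar = re.compile(r'(\r\n|\r|\n)$')    # $ also matches before a final \n
eol_strict = re.compile(r'(\r\n|\r|\n)\Z')   # \Z matches only at the very end

line = 'abc\r\n\n'   # CRLF followed by LF

m_dollar = eol_dollar.search(line)
m_strict = eol_strict.search(line)

# Imitate removing the matched line ending from the end of the string.
stripped_dollar = line[:-len(m_dollar.group())]
stripped_strict = line[:-len(m_strict.group())]

print(repr(m_dollar.group()), repr(stripped_dollar))
print(repr(m_strict.group()), repr(stripped_strict))
```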

I can write a test.  Where exactly should it be put?

--
components: Library (Lib)
files: feedparser_crlflf.patch
keywords: patch
messages: 84595
nosy: barry, tony_nelson
severity: normal
status: open
title: email feedparser.py CRLFLF bug: $ vs \Z
versions: Python 2.6
Added file: http://bugs.python.org/file13476/feedparser_crlflf.patch

___
Python tracker 
<http://bugs.python.org/issue5610>
___



[issue5610] email feedparser.py CRLFLF bug: $ vs \Z

2009-03-30 Thread Tony Nelson

Tony Nelson  added the comment:

make test still passes all tests except test_httpservers on my Python
2.6.1 build.  The network resource was not enabled and tk is not available.

The new test for CRLFLF at the end of a message body is added to
Lib/email/test_email at the end of the TestParsers class.  It passes
with the fix patch and fails without it.

What other tests do you want?

--
versions:  -Python 3.1
Added file: http://bugs.python.org/file13506/feedparser_crlflf_test.patch

___
Python tracker 
<http://bugs.python.org/issue5610>
___



[issue5638] test_httpservers fails CGI tests if --enable-shared

2009-03-31 Thread Tony Nelson

New submission from Tony Nelson :

test_httpservers fails the CGI tests if Python was built as a shared
library (./configure --enable-shared) and not yet installed.  To run such a
Python without installing it, the command line must define
LD_LIBRARY_PATH to point to the build directory.  I see that the new
environment for the child CGI process still has LD_LIBRARY_PATH set. 
The child process is not using that when the CGI is invoked.

After the new shared Python (or one like it) is installed, the test
passes, but the CGIs aren't using the correct copy of Python.

I'm doing this with Python 2.6.1, but the version probably doesn't matter.

--
components: Tests
messages: 84969
nosy: tony_nelson
severity: normal
status: open
title: test_httpservers fails CGI tests if --enable-shared
type: behavior
versions: Python 2.6

___
Python tracker 
<http://bugs.python.org/issue5638>
___



[issue1555570] email parser incorrectly breaks headers with a CRLF at 8192

2009-04-02 Thread Tony Nelson

Tony Nelson  added the comment:

The OP's diagnosis of a buffer boundary problem is correct, but
incomplete.  The problem can be reproduced by calling
feedparser.FeedParser.feed() directly or, as my patch test does, by calling
BufferedSubFile.push() directly.  The proper fix is for push() to treat
a last line ending in CR as a partial line, as it does if no part of a
line ending is present.  The OP's patch only works when FeedParser is
called through the old Parser interface.
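
The idea of holding back a trailing CR can be sketched as follows.
LineBuffer is a hypothetical stand-in written for this example; it is
not BufferedSubFile and not the code in the attached patch.

```python
class LineBuffer:
    """Hypothetical line splitter that tolerates CRLF split across pushes."""

    def __init__(self):
        self._partial = ''
        self.lines = []

    def push(self, data):
        data = self._partial + data
        self._partial = ''
        # keepends=True keeps the line endings on each piece.
        # (str.splitlines also splits on a few other separators,
        # which is acceptable for a sketch.)
        parts = data.splitlines(True)
        if parts and not parts[-1].endswith('\n'):
            # Either no line ending yet, or a bare CR that may turn out
            # to be the first half of CRLF: hold it until more data arrives.
            self._partial = parts.pop()
        self.lines.extend(parts)

    def close(self):
        # Flush whatever is left, even if it never got a line ending.
        if self._partial:
            self.lines.append(self._partial)
            self._partial = ''

buf = LineBuffer()
buf.push('Subject: x\r')     # chunk boundary falls between CR and LF
buf.push('\nbody\r\n')
buf.close()
```

Without the hold-back, the first push would emit 'Subject: x\r' as a
complete line and the second would start with a bare '\n', which a
header parser would read as a blank line.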

--
nosy: +tony_nelson
Added file: http://bugs.python.org/file13586/feedparser_pushcr_pushlf.patch

___
Python tracker 
<http://bugs.python.org/issue1555570>
___



[issue3169] email/header.py doesn't handle Base64 headers that have been insufficiently padded.

2009-04-02 Thread Tony Nelson

Tony Nelson  added the comment:

Postel's law suggests that, as bad padding can be repaired,
decode_header ought to do so.  The patch does that, adds a test for it,
and alters another test to still properly fail on really bad encoded data.

The test doesn't check a single character encoded string, as such does
not specify a complete octet and I felt that base64 decoders might
reasonably differ on what to do then.

The issue exists in Python2.6.1 (where I made it) and trunk.
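
One way to repair missing padding, shown only as a sketch: the helper
name b64decode_lenient is invented here, and this is not necessarily
how the attached patch does it.

```python
import base64

def b64decode_lenient(encoded):
    # Hypothetical helper: base64 text should be a multiple of 4
    # characters long, so append however many '=' are missing.
    missing = -len(encoded) % 4
    return base64.b64decode(encoded + '=' * missing)

print(b64decode_lenient('aGVsbG8'))    # padding '=' was dropped
print(b64decode_lenient('aGVsbG8='))   # already padded, unchanged
```

A lone leftover character (length 1 modulo 4) cannot encode a complete
octet, so padding it does not make it meaningful; the sketch makes no
attempt to handle that case specially.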

--
keywords: +patch
nosy: +tony_nelson
Added file: http://bugs.python.org/file13589/header_B_padding.patch

___
Python tracker 
<http://bugs.python.org/issue3169>
___



[issue3169] email/header.py doesn't handle Base64 headers that have been insufficiently padded.

2009-04-02 Thread Tony Nelson

Changes by Tony Nelson :


--
nosy: +barry

___
Python tracker 
<http://bugs.python.org/issue3169>
___



[issue4487] Add utf8 alias for email charsets

2009-04-03 Thread Tony Nelson

Tony Nelson  added the comment:

This seems entirely reasonable, helpful, and in accord with the mapping
of ascii to us-ascii.  I recommend accepting this patch or a slightly
fancier one that would also do "utf_8".

There are probably other encoding names with the same issue of being
accepted by Python but not understood by other email clients.

This issue also affects 2.6.1 and 2.7trunk.  I haven't checked 3.x.
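
Until such an alias ships, an application can register it itself.  The
sketch below uses email.charset.add_alias(), which changes a
module-level table for the whole process; whether "utf8" and "utf_8"
are the right spellings to cover is an assumption taken from the
comment above.

```python
from email.charset import Charset, add_alias

# Map the spellings Python's codecs accept onto the name that
# other mail clients expect to see in a charset parameter.
add_alias('utf8', 'utf-8')
add_alias('utf_8', 'utf-8')

print(Charset('utf8'))                         # canonical name after aliasing
print(Charset('utf_8').get_output_charset())
```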

--
nosy: +barry, tony_nelson

___
Python tracker 
<http://bugs.python.org/issue4487>
___



[issue4487] Add utf8 alias for email charsets

2009-04-03 Thread Tony Nelson

Changes by Tony Nelson :


--
versions: +Python 2.6, Python 2.7

___
Python tracker 
<http://bugs.python.org/issue4487>
___



[issue1079] decode_header does not follow RFC 2047

2009-04-03 Thread Tony Nelson

Tony Nelson  added the comment:

I think the problem is best viewed this way: headers are not parsed
according to RFC2822 before they are decoded, so the recognition of
encoded words should be looser and should not require whitespace around
them, since whitespace is not required in all contexts.

Patch and test, tested on 2.6.1, 2.7trunk.  The test mostly just
reverses the sense of test_rfc2047_without_whitespace().
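
To illustrate what "looser" means here, a sketch with two simplified
patterns.  Both are written for this example and are assumptions; they
are not the regular expression in email.header nor the one in the patch.

```python
import re

# Simplified, hypothetical encoded-word pattern: =?charset?q-or-b?text?=
ENCODED_WORD = (r'=\?(?P<charset>[^?]+)'
                r'\?(?P<encoding>[qb])'
                r'\?(?P<encoded>[^?]*)\?=')

# "Strict": the encoded-word must be followed by whitespace or end of string.
strict = re.compile(ENCODED_WORD + r'(?=[ \t]|$)', re.IGNORECASE)
# "Loose": no requirement on what follows.
loose = re.compile(ENCODED_WORD, re.IGNORECASE)

header = 'Re: (=?iso-8859-1?q?caf=E9?=)'   # encoded-word inside parentheses

print(strict.search(header))                    # closing ')' blocks the match
print(loose.search(header).group('encoded'))    # encoded text is found
```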

--
keywords: +patch
nosy: +barry, tony_nelson
versions: +Python 2.6, Python 2.7
Added file: http://bugs.python.org/file13608/header_encwd_nows.patch

___
Python tracker 
<http://bugs.python.org/issue1079>
___



[issue4491] email.Header.decode_header() doesn't work if encoded-word was separeted by CRLF

2009-04-03 Thread Tony Nelson

Tony Nelson  added the comment:

See patch in issue1079.  I don't think email.header can require
whitespace until it decodes parsed headers, as whitespace is not always
required.

--
nosy: +barry, tony_nelson
versions: +Python 2.7

___
Python tracker 
<http://bugs.python.org/issue4491>
___



[issue1079] decode_header does not follow RFC 2047

2009-04-04 Thread Tony Nelson

Tony Nelson  added the comment:

The email package does not follow the RFCs in anything to do with header
parsing or decoding.  This is a known deficiency.  So no, I am not
thinking of atoms at all -- and neither is email.header.decode_header()! :-(

Until email.header actually parses headers into atoms and then decodes
atoms, it doesn't matter what parsed atoms would look like.  Currently,
email.header.decode_header() just stumbles through raw text, and doesn't
know if it is looking at atoms or not, or usually even what header the
text came from.

In order to interpret the RFC correctly, email.header.decode_header()
needs either a parser and the name of the header it is decoding, or
parsed header data.  I think the latter is being considered for a
redesign of the email package for 3.1 or 3.2 (3 months to a year or so,
and not for 2.x at all), but until then, it is better to decode every
likely encoded-word than to skip encoded-words that, for example, have a
parenthesis on one side or the other.

--

___
Python tracker 
<http://bugs.python.org/issue1079>
___