New submission from bpoaugust :
The email headerregistry class MessageIDHeader is too strict when parsing
existing Message-Ids. It can truncate Message-Ids that are valid according to
the obsolete rules.
As the saying has it:
"Be liberal in what you accept, and conservative in what you
bpoaugust added the comment:
The easiest might be for me to provide some test cases, but I have not been
able to work out where the existing unit tests are.
One failure which I believe should be permitted under current rules is:
- i.e. trailing space
The space gets added AFTER the
bpoaugust added the comment:
When the library is being used to parse existing emails, I think it needs to do
the minimum validation and canonicalisation.
It may be useful in some circumstances to report where the input is not
syntactically correct, but I'm not sure it is helpful to tru
bpoaugust added the comment:
I think an id of the form
should be allowed, but it generates
obs-id-left => local-part => obs-local-part => word *("." word)
word => atom => [CFWS] 1*atext [CFWS]
'' should also be allowed
bpoaugust added the comment:
Sorry, I think '' is not valid, as spaces are not allowed between
words.
However I am not seeing the original unfolded source if there is an error,
unless I am misunderstanding the API.
For example:
--- cut here ---
import email.header
import e
New submission from bpoaugust :
The Message-ID parser can crash on truncated input.
For example:
import email.policy
message=email.message_from_string("Message-id:
message['Message-id']
File
"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/emai
bpoaugust added the comment:
subprocess.getoutput does not currently work at all on Windows.
So it's not necessary to maintain backwards compatibility.
The following fix works for me on WinXP/Python 3.2.2.
Replace
pipe = os.popen('{ ' + cmd + '; } 2>&1',
bpoaugust added the comment:
A better fix, which supports multiple windows commands:
if mswindows:
    pipe = os.popen('( ' + cmd + ' ) 2>&1', 'r')  # Windows uses () rather than { }
else:
    pipe = os.popen('{ ' + cmd
Changes by bpoaugust :
--
versions: -Python 3.3
___
Python tracker
<http://bugs.python.org/issue10197>
___
___
Python-bugs-list mailing list
Unsubscribe:
bpoaugust added the comment:
I got the () syntax from:
http://technet.microsoft.com/en-us/library/cc737438%28WS.10%29.aspx
which refers to grouping, not subshell.
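
The grouping difference can be sketched in isolation as follows. This is only an illustration under assumptions: wrap_for_popen and its windows parameter are hypothetical names, not part of subprocess, and the sketch only builds the command string without running it.

```python
import sys


def wrap_for_popen(cmd, windows=None):
    """Return cmd wrapped so that stderr is merged into stdout.

    wrap_for_popen is a hypothetical helper, not part of subprocess.
    """
    if windows is None:
        # Default to the platform the interpreter is running on.
        windows = sys.platform == "win32"
    if windows:
        # cmd.exe groups commands with parentheses.
        return "( " + cmd + " ) 2>&1"
    # POSIX shells group with braces; the command list needs a trailing ';'.
    return "{ " + cmd + "; } 2>&1"
```

Passing windows=True or windows=False explicitly makes the choice testable on any platform.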
--
___
Python tracker
<http://bugs.python.org/issue10
New submission from bpoaugust :
It looks like mailbox uses email.message_from_... for parsing emails.
However it does not allow for passing any options to the parser.
In particular the policy cannot be provided.
It would be useful if there was a way to pass such options
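
One possible workaround, sketched here under assumptions rather than offered as the intended API, is to supply a factory that does its own parsing with the desired policy. The helper name open_mbox_with_policy is hypothetical; mailbox.mbox, its factory argument, and email.message_from_binary_file are existing library calls.

```python
import email
import email.policy
import mailbox


def open_mbox_with_policy(path, policy=email.policy.default):
    """Open an mbox whose messages are parsed with the given policy.

    open_mbox_with_policy is a hypothetical helper name.
    """
    def factory(fp):
        # mailbox hands the factory a binary file-like object for one message.
        return email.message_from_binary_file(fp, policy=policy)

    # create=False: do not silently create a missing mailbox file.
    return mailbox.mbox(path, factory=factory, create=False)
```

Messages obtained this way are whatever the policy produces, not mailbox.mboxMessage instances, so mbox-specific accessors such as get_from() are not available on them.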
New submission from bpoaugust:
See:
https://github.com/python/cpython/blob/master/Lib/mailbox.py#L787
The code should be
self.get_bytes(key, from_)).as_string(unixfrom=from_)
--
components: email
messages: 302564
nosy: barry, bpoaugust, r.david.murray
priority: normal
severity: normal
bpoaugust added the comment:
https://github.com/python/cpython/blob/master/Lib/mailbox.py#L778
The code here reads the first line, but fails to save it as the unixfrom line.
Alternatively perhaps it should reset the file back to the start so the message
factory has sight of the envelope.
The
Changes by bpoaugust :
--
title: mailbox._mboxMMDF.get_message throws away From envelope ->
_mboxMMDF.get_string() fails to pass param to get_bytes()
___
Python tracker
<https://bugs.python.org/issu
New submission from bpoaugust:
https://github.com/python/cpython/blob/master/Lib/mailbox.py#L778
The code here reads the first line, but fails to save it as the unixfrom line.
Alternatively perhaps it should reset the file back to the start so the message
factory has sight of the envelope
bpoaugust added the comment:
Ignore msg302569 - that was supposed to be a new issue.
--
___
Python tracker
<https://bugs.python.org/issue31522>
___
bpoaugust added the comment:
It is not saving the unix from line.
#!/usr/bin/env python3
with open("test.mbox", 'w') as f:
    f.write("From sender@invalid Thu Nov 17 00:49:30 2016\n")
    f.write("Subject: Test\n")
    f.write("\n")
    f.w
bpoaugust added the comment:
I believe that setting the file back to the start is probably the best solution.
The message as provided by e.g. postfix will include the From header and the
parser is able to deal with that successfully, so I'm not sure why the mbox
reader removes it b
New submission from bpoaugust:
The default mailbox factory is mailbox.mboxMessage so I expect the following
two statements to work the same:
messages = mailbox.mbox("test.mbox")
messages = mailbox.mbox("test.mbox", mailbox.mboxMessage)
However they do not.
The attache
New submission from bpoaugust:
At present both _mboxMMDF#get_message and get_bytes read the file directly.
However the code in get_bytes duplicates some of the code in get_message.
get_message should be recoded to use get_bytes.
It would then be possible to override get_bytes (which is also
bpoaugust added the comment:
On further investigation it appears that overriding the get_bytes function does
not help with unmangling >From.
However it would still be worth re-using the code.
--
___
Python tracker
<http://bugs.python.org/issu
bpoaugust added the comment:
Rather than change unquote to deal with such malformed input, why not just
enhance get/set boundary? That would reduce the impact of any changes.
Also it should be easier to detect trailing rubbish in the value if you know it
is a boundary value.
--
nosy
New submission from bpoaugust:
get_boundary calls get_param('boundary') which unquotes the value.
It then calls utils.collapse_rfc2231_value which also calls unquote.
This causes problems for boundaries that have two sets of quotes.
For example, I have seen the following in the wild
bpoaugust added the comment:
I agree that strictly speaking the boundary is invalid.
However:
'Be strict in what you generate, be liberal in what you accept'
The mail package should never create such boundaries.
However it should process them if possible.
If the boundary definition
Changes by bpoaugust :
--
type: -> behavior
___
Python tracker
<http://bugs.python.org/issue29020>
___
New submission from bpoaugust:
collapse_rfc2231_value unquotes the value before returning it except here:
rawbytes = bytes(text, 'raw-unicode-escape')
return str(rawbytes, charset, errors)
Why is the text not unquoted in this case?
Actually I wonder whether the function sho
bpoaugust added the comment:
It looks like a simpler alternative is to just change
boundary = self.get_param('boundary', missing)
to
boundary = self.get_param('boundary', missing, unquote=False)
and let collapse_rfc2231_va
bpoaugust added the comment:
According to RFC822, a quoted-string should only be wrapped in double-quotes.
So I'm not sure why unquote treats <> as quotes. If it did not, then again this
issue would not arise.
However maybe utils.unquote is needed by other code that uses <&
bpoaugust added the comment:
I have just discovered the same problem with get_filename.
Not surprising as its code is basically the same as get_boundary.
Unix paths can contain anything, so it's not correct to remove special
characters. [It's up to the receiving file system to dec
bpoaugust added the comment:
Note: it's easy to create test e-mails with attachments using mutt.
echo test | mutt -s "test" -a files... -- user@domain
I did some testing with the following names:
<>
><
""
""
<"abc">
>abc<
&
New submission from bpoaugust:
The email package implements mboxo From_ mangling on output by default.
However there is no provision to unmangle >From_ on input.
This means that it's not possible to import mboxo files correctly.
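
A minimal sketch of what unmangling on input could look like, assuming the mboxo convention that only lines beginning exactly with "From " were escaped to ">From " on output. The function name unmangle_mboxo is hypothetical and is not part of the email or mailbox packages.

```python
import re

# Matches ">From " only at the very start of a line.
_MANGLED_FROM = re.compile(rb"^>From ", re.MULTILINE)


def unmangle_mboxo(body):
    """Undo mboxo From_ mangling in a message body given as bytes.

    unmangle_mboxo is a hypothetical helper. Lines such as ">>From "
    are left alone, since mboxo never produces them by mangling.
    """
    return _MANGLED_FROM.sub(b"From ", body)
```

The reversal is inherently lossy: a line that genuinely started with ">From " in the original message cannot be told apart from a mangled one.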
--
components: email
messages: 283879
nos
bpoaugust added the comment:
Is there any way to override the current behaviour?
--
___
Python tracker
<http://bugs.python.org/issue29053>
___
bpoaugust added the comment:
Attached please find patch which works for me.
To use it independently of email, do something like:
messages = mailbox.mbox(filename, MboxoFactory)
where:
class MboxoFactory(mailbox.mboxMessage):
    def __init__(self, message=None):
        super().__init__
bpoaugust added the comment:
Another case is get_filename.
The second call to unquote will only change the incoming parameter if the
original value was enclosed in <> or "". This is not a common scenario, and was
only discovered because a mailer used the form <<>>
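
The effect of a second unquote can be seen by calling email.utils.unquote directly. The value below is an illustrative assumption, not the one from the mailer in question.

```python
from email.utils import unquote

raw = '"<abc>"'        # hypothetical parameter value as written in the header
once = unquote(raw)    # first call strips the double quotes: '<abc>'
twice = unquote(once)  # second call strips the angle brackets too: 'abc'

print(once, twice)
```

A value with no surrounding <> or "" after the first call passes through the second call unchanged, which is why the problem rarely shows up.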
Changes by bpoaugust :
--
versions: +Python 3.5 -Python 3.7
___
Python tracker
<http://bugs.python.org/issue29053>
___
bpoaugust added the comment:
I have just checked and AFAICT collapse_rfc2231_value is only called by
get_filename and get_boundary in message.py.
Both of these call get_param and default to unquote=True.
So in all cases the parameter value passed to collapse_rfc2231_value will
already have
bpoaugust added the comment:
This is actually a bug in collapse_rfc2231_value: issue29020
--
___
Python tracker
<http://bugs.python.org/issue28945>
___
bpoaugust added the comment:
The patch can be simplified by just looking for b'\n' in the last 6 chars, and
caching from b'\n' if found.
This would mean more file seeking in exchange for less buffer matching.
--
bpoaugust added the comment:
If there are concerns about 3rd party code relying on the current behaviour of
the function, then just create a new function without the unquoting, and
deprecate the original function.
The existing function does not always unquote, so any code which relies on it
bpoaugust added the comment:
The patch is incomplete.
There is another unquote call:
336 return unquote(text)
--
___
Python tracker
<http://bugs.python.org/issue28