R. David Murray <rdmur...@bitdance.com> added the comment:
Here's a patch that makes the example work correctly. This is not a fix; a real fix will be more complicated. The patch just demonstrates the kind of thing that needs fixing, and where.

The existing parser produces a sub-optimal parse tree as its result: the tree is hard to inspect and manipulate because there are so many special cases. A good fix here would create a function that is passed an existing TokenList and the new token to add to that list. The function would check all the special cases and do the EWWhiteSpaceTerminal substitution when and as appropriate. It could then be used in the unstructured parser as well as in Phrase, and some thought should be given to where else it might be needed. It has been long enough since I've held the RFCs in my head that I don't remember whether there is anywhere else.

I haven't looked at the actual character string, so I don't know whether we also need to detect and post a defect about a split character, but we don't *have* to answer that question to fix this.
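As a rough illustration of the helper described above, here is a minimal, self-contained sketch. It does not use the real classes from email._header_value_parser: Terminal, TokenList and EWWhiteSpaceTerminal below are simplified stand-ins, and leaves, first_leaf and append_ew_aware are hypothetical names. It also assumes the whitespace always trails the earlier token, which is only one of the special cases a real fix would have to cover.

```python
class Terminal:
    """Stand-in leaf token: a piece of text plus a token type."""

    def __init__(self, value, token_type):
        self.value = value
        self.token_type = token_type

    def __str__(self):
        return self.value


class EWWhiteSpaceTerminal(Terminal):
    """Whitespace between two encoded words; contributes nothing when decoded."""

    def __str__(self):
        return ''


class TokenList(list):
    """Stand-in container token; its string value is that of its children."""

    token_type = None

    def __str__(self):
        return ''.join(str(t) for t in self)


def leaves(token_list):
    """Yield (parent, index, terminal) for every leaf, in document order."""
    for index, child in enumerate(token_list):
        if isinstance(child, TokenList):
            yield from leaves(child)
        else:
            yield token_list, index, child


def first_leaf(token):
    """Return the first terminal under token (or token itself if it is one)."""
    while isinstance(token, TokenList) and token:
        token = token[0]
    return token


def append_ew_aware(token_list, new_token):
    """Append new_token, blanking whitespace that separates two encoded words."""
    flat = list(leaves(token_list))
    if (len(flat) >= 2
            and first_leaf(new_token).token_type == 'encoded-word'
            and flat[-2][2].token_type == 'encoded-word'
            and flat[-1][2].token_type == 'fws'):
        parent, index, ws = flat[-1]
        # Swap the ordinary whitespace for one that stringifies to ''.
        parent[index] = EWWhiteSpaceTerminal(ws.value, 'fws')
    token_list.append(new_token)


# Two "atoms", each holding an (already decoded) encoded word; the first
# carries trailing folding whitespace inside a nested list.
phrase = TokenList()
append_ew_aware(phrase, TokenList([Terminal('Hello', 'encoded-word'),
                                   TokenList([Terminal(' ', 'fws')])]))
append_ew_aware(phrase, TokenList([Terminal('World', 'encoded-word')]))
print(str(phrase))  # HelloWorld
```

Because the check walks down to the leaves instead of asserting a fixed nesting depth, the same function could in principle be called from both the phrase parser and the unstructured parser. Whether that holds for the real parse tree is untested here.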
diff --git a/Lib/email/_header_value_parser.py b/Lib/email/_header_value_parser.py
index e805a75..d5d5986 100644
--- a/Lib/email/_header_value_parser.py
+++ b/Lib/email/_header_value_parser.py
@@ -199,6 +199,10 @@ class CFWSList(WhiteSpaceTokenList):
 
 class Atom(TokenList):
 
+    @property
+    def has_encoded_word(self):
+        return any(t.token_type=='encoded-word' for t in self)
+
     token_type = 'atom'
 
 
@@ -1382,6 +1386,12 @@ def get_phrase(value):
                         "comment found without atom"))
                 else:
                     raise
+            if token.has_encoded_word:
+                assert phrase[-1].token_type == 'atom', phrase[-1]
+                assert phrase[-1][-1].token_type == 'cfws'
+                assert phrase[-1][-1][-1].token_type == 'fws'
+                if phrase[-1].has_encoded_word:
+                    phrase[-1][-1] = EWWhiteSpaceTerminal(phrase[-1][-1][-1], 'fws')
             phrase.append(token)
     return phrase, value
 

----------

______________________________________________
Python tracker <cpyt...@roundup.psfhosted.org>
<https://bugs.python.org/issue35547>
______________________________________________