I have a custom mail filter in Python that uses the mailbox module to open a mail message and give me access to the headers.
So I have the following code to open each mail message:-

    #
    #
    # Read the message from standard input and make a message object from it
    #
    msg = mailbox.MaildirMessage(sys.stdin.buffer.read())

and then later I have (among many other bits and pieces):-

    #
    #
    # test for string in Subject:
    #
    if searchTxt in str(msg.get("subject", "unknown")):
        do various things

This works exactly as intended most of the time but occasionally a message whose subject should match the test is missed. I have just realised that when this happens, it's when the Subject: has accented characters in it (this is from a mailing list about canals in France).

So, for example, the latest case of this happening has:-

    Subject: aka Marne à la Saône (Waterways Continental Europe)

where the searchTxt in the code above is "Waterways Continental Europe".

Is there any way I can work round this issue? E.g. is there a way to strip out all extended characters from a string? Or maybe it's msg.get() that isn't managing to handle the accented string correctly?

Yes, I know that accented characters probably aren't allowed in Subject: but I'm not going to get that changed! :-)

-- 
Chris Green

-- 
https://mail.python.org/mailman/listinfo/python-list
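
P.S. A minimal sketch of one thing that may be worth checking, assuming (I have not confirmed this for the failing messages) that the Subject: actually arrives as an RFC 2047 encoded-word rather than as raw accented text. In that case str(msg.get("subject")) would return the encoded form, in which the spaces are not literal spaces, so the substring test fails. The helper name decoded_subject and the sample raw header below are made up for illustration; decode_header and make_header come from the standard email.header module.

```python
from email.header import decode_header, make_header


def decoded_subject(raw_subject):
    """Return a header value as readable text.

    Any RFC 2047 encoded-words (=?charset?Q?...?= or =?charset?B?...?=)
    are decoded; plain ASCII values pass through unchanged.
    """
    return str(make_header(decode_header(raw_subject)))


# Hypothetical example of how the failing Subject: might look on the wire.
raw = "=?UTF-8?Q?aka_Marne_=C3=A0_la_Sa=C3=B4ne_=28Waterways_Continental_Europe=29?="
searchTxt = "Waterways Continental Europe"

print(searchTxt in raw)                   # False: spaces are encoded as "_"
print(searchTxt in decoded_subject(raw))  # True once the header is decoded
```

If that assumption holds, the test in the filter could become something like searchTxt in decoded_subject(str(msg.get("subject", "unknown"))), with no need to strip the accented characters at all.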