Re: Decode email subjects into unicode

2017-04-18 Thread akomack100
On Wednesday, March 19, 2008 at 3:24:45 AM UTC-7, Laszlo Nagy wrote: > Gertjan Klein wrote: > > Laszlo Nagy wrote: > > > > > >> However, there are malformed emails and I have to put them into the > >> database. What should I do with this: > >> > > [...] > > > >> There is no encoding giv

Re: Decode email subjects into unicode

2008-03-19 Thread Laszlo Nagy
Gertjan Klein wrote: > Laszlo Nagy wrote: > > >> However, there are malformed emails and I have to put them into the >> database. What should I do with this: >> > [...] > >> There is no encoding given in the subject but it contains 0x92. When I >> try to insert this into the database,

Re: Decode email subjects into unicode

2008-03-19 Thread Gertjan Klein
Laszlo Nagy wrote: >However, there are malformed emails and I have to put them into the >database. What should I do with this: [...] >There is no encoding given in the subject but it contains 0x92. When I >try to insert this into the database, I get: This is indeed malformed email. The content
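For a subject that carries a raw 0x92 byte and no declared charset, one possible approach is to try a short list of candidate encodings in order. The sketch below is only an illustration under assumptions: the function name, the candidate list, and the guess that 0x92 comes from Windows-1252 (where it is the right single quotation mark) are not taken from the thread. It is written for Python 3, whereas the thread dates from the Python 2 era.

```python
def decode_raw_bytes(raw, candidates=("ascii", "utf-8", "cp1252")):
    """Decode undeclared header bytes by trying candidate encodings in order.

    Hypothetical helper: the candidate order is an assumption, not a rule.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a code point, so this last resort never fails.
    return raw.decode("latin-1")


# 0x92 is invalid as ASCII and as a UTF-8 start byte, so cp1252 is used here.
print(decode_raw_bytes(b"It\x92s malformed"))
```

The order matters: a strict decoder such as UTF-8 should come before permissive single-byte encodings, which accept nearly any input.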

Re: Decode email subjects into unicode

2008-03-18 Thread John Machin
On Mar 18, 9:09 pm, Laszlo Nagy <[EMAIL PROTECTED]> wrote: > Sorry, meanwhile i found that "email.Headers.decode_header" can be used > to convert the subject into unicode: > > > def decode_header(self,headervalue): > > val,encoding = decode_header(headervalue)[0] > > if encoding: > > return val.dec

Re: Decode email subjects into unicode

2008-03-18 Thread Jeffrey Froman
Laszlo Nagy wrote: > I know that "=?UTF-8?B" means UTF-8 + base64 encoding, but I wonder if > there is a standard method in the "email" package to decode these > subjects? The standard library function email.Header.decode_header will parse these headers into an encoded bytestring paired with the
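As a rough illustration of the pairs that decode_header returns, here is a minimal Python 3 sketch (the thread itself predates Python 3, where the module is spelled email.header). The wrapper name header_pairs is made up for this example; the sample subject is one of those quoted in the original question.

```python
from email.header import decode_header


def header_pairs(raw):
    """Return the (value, charset) pairs that decode_header yields for raw."""
    return decode_header(raw)


# An encoded word comes back as bytes plus the lower-cased charset name.
print(header_pairs("=?UTF-8?B?aGliw6Fr?="))
# A header without encoded words comes back as a str with charset None.
print(header_pairs("[Fwd: re:Flags Of The World, Us States, And Military]"))
```

The caller still has to decode each bytes value with its charset to obtain text.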

RE: Decode email subjects into unicode

2008-03-18 Thread Ryan Ginstrom
> On Behalf Of Laszlo Nagy > > =?koi8-r?B?4tnT1NLP19nQz8zOyc3PIMkgzcHMz9rB1NLB1M7P?= > > [Fwd: re:Flags Of The World, Us States, And Military] > > =?ISO-8859-2?Q?=E9rdekes?= =?UTF-8?B?aGliw6Fr?= Try this code: from email.header import decode_header def getheader(header_text, default="ascii"):

Re: Decode email subjects into unicode

2008-03-18 Thread Laszlo Nagy
Sorry, meanwhile I found that "email.Headers.decode_header" can be used to convert the subject into unicode: > def decode_header(self,headervalue): > val,encoding = decode_header(headervalue)[0] > if encoding: > return val.decode(encoding) > else: > return val However, there are malformed emails
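The function quoted above looks only at the first pair (index [0]), so a subject made of several encoded words would be cut short. A hedged Python 3 sketch that walks every pair might look like the following; the name subject_to_text, the "ascii" default, and the choice of errors="replace" are assumptions for illustration, not something stated in the thread.

```python
from email.header import decode_header


def subject_to_text(raw, default="ascii"):
    """Join every decoded part of a header value into one str."""
    chunks = []
    for value, charset in decode_header(raw):
        if isinstance(value, bytes):
            try:
                # Undecodable bytes become U+FFFD instead of raising.
                chunks.append(value.decode(charset or default, errors="replace"))
            except LookupError:
                # Charset name unknown to Python: fall back to the default.
                chunks.append(value.decode(default, errors="replace"))
        else:
            # Headers without encoded words arrive as str already.
            chunks.append(value)
    return "".join(chunks)


print(subject_to_text("=?ISO-8859-2?Q?=E9rdekes?="))
print(subject_to_text("=?UTF-8?B?aGliw6Fr?="))
```

With errors="replace" a malformed subject still yields a string that can be stored, at the cost of losing the undecodable bytes.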

Decode email subjects into unicode

2008-03-18 Thread Laszlo Nagy
Hi All, I'm in trouble with decoding email subjects. Here are some examples: > =?koi8-r?B?4tnT1NLP19nQz8zOyc3PIMkgzcHMz9rB1NLB1M7P?= > [Fwd: re:Flags Of The World, Us States, And Military] > =?ISO-8859-2?Q?=E9rdekes?= > =?UTF-8?B?aGliw6Fr?= I know that "=?UTF-8?B" means UTF-8 + base64 encoding,