On Wednesday, March 19, 2008 at 3:24:45 AM UTC-7, Laszlo Nagy wrote:
> Gertjan Klein wrote:
> > Laszlo Nagy wrote:
> >
> >
> >> However, there are malformed emails and I have to put them into the
> >> database. What should I do with this:
> >>
> > [...]
> >
> >> There is no encoding giv
Gertjan Klein wrote:
> Laszlo Nagy wrote:
>
>
>> However, there are malformed emails and I have to put them into the
>> database. What should I do with this:
>>
> [...]
>
>> There is no encoding given in the subject but it contains 0x92. When I
>> try to insert this into the database,
Laszlo Nagy wrote:
>However, there are malformed emails and I have to put them into the
>database. What should I do with this:
[...]
>There is no encoding given in the subject but it contains 0x92. When I
>try to insert this into the database, I get:
This is indeed malformed email. The content
On Mar 18, 9:09 pm, Laszlo Nagy <[EMAIL PROTECTED]> wrote:
> Sorry, meanwhile I found that "email.Header.decode_header" can be used
> to convert the subject into unicode:
>
> > def decode_header(self, headervalue):
> >     val, encoding = decode_header(headervalue)[0]
> >     if encoding:
> >         return val.dec
Laszlo Nagy wrote:
> I know that "=?UTF-8?B" means UTF-8 + base64 encoding, but I wonder if
> there is a standard method in the "email" package to decode these
> subjects?
The standard library function email.Header.decode_header will parse these
headers into an encoded bytestring paired with the
> On Behalf Of Laszlo Nagy
> > =?koi8-r?B?4tnT1NLP19nQz8zOyc3PIMkgzcHMz9rB1NLB1M7P?=
> > [Fwd: re:Flags Of The World, Us States, And Military]
> > =?ISO-8859-2?Q?=E9rdekes?= =?UTF-8?B?aGliw6Fr?=
Try this code:
from email.header import decode_header
def getheader(header_text, default="ascii"):
Sorry, meanwhile I found that "email.Header.decode_header" can be used
to convert the subject into unicode:
> def decode_header(self, headervalue):
>     val, encoding = decode_header(headervalue)[0]
>     if encoding:
>         return val.decode(encoding)
>     else:
>         return val
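Note that the method above only looks at the first (value, encoding) pair, so a subject made of several encoded words would lose everything after the first one. A minimal sketch of a variant that joins every chunk, assuming Python 3's email.header module; the names decode_subject and decode_bytes are hypothetical, and the cp1252 fallback is only a guess for bytes such as 0x92 that arrive with no usable charset:

```python
from email.header import decode_header


def decode_bytes(raw, charset=None, fallback="cp1252"):
    """Decode raw bytes, guessing a charset when the declared one fails."""
    try:
        return raw.decode(charset or "ascii")
    except (UnicodeDecodeError, LookupError):
        # Undeclared, unknown, or wrong charset: fall back to a guess
        # (an assumption, not a standard) instead of raising.
        return raw.decode(fallback, errors="replace")


def decode_subject(headervalue, fallback="cp1252"):
    """Decode every chunk of a str header value, not just the first."""
    parts = []
    for val, charset in decode_header(headervalue):
        # On Python 3, a header without encoded words comes back as str;
        # otherwise each chunk is bytes paired with its charset (or None).
        if isinstance(val, bytes):
            val = decode_bytes(val, charset, fallback)
        parts.append(val)
    return "".join(parts)


print(decode_subject("=?ISO-8859-2?Q?=E9rdekes?= =?UTF-8?B?aGliw6Fr?="))
print(decode_bytes(b"It\x92s"))
```

Whether cp1252 is the right guess depends on where the mail comes from; the point is only that the insert into the database no longer fails on an undeclared byte.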
However, there are malformed emails
Hi All,
I'm having trouble decoding email subjects. Here are some examples:
> =?koi8-r?B?4tnT1NLP19nQz8zOyc3PIMkgzcHMz9rB1NLB1M7P?=
> [Fwd: re:Flags Of The World, Us States, And Military]
> =?ISO-8859-2?Q?=E9rdekes?=
> =?UTF-8?B?aGliw6Fr?=
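As a rough illustration of what the standard library does with subjects like these, here is a small sketch assuming Python 3's email.header.decode_header. It only prints the raw (value, charset) pairs; turning them into text is left to the caller:

```python
from email.header import decode_header

subjects = [
    "[Fwd: re:Flags Of The World, Us States, And Military]",
    "=?ISO-8859-2?Q?=E9rdekes?=",
    "=?UTF-8?B?aGliw6Fr?=",
]

# Each result is a list of (value, charset) pairs. An unencoded subject
# comes back unchanged with charset None; an encoded word comes back as
# bytes together with its lowercased charset name.
decoded = [decode_header(s) for s in subjects]

for subject, pairs in zip(subjects, decoded):
    print(subject, "->", pairs)
```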
I know that "=?UTF-8?B" means UTF-8 + base64 encoding,