On Fri, May 01, 2015 at 04:47:55PM +1000, Erik Christiansen wrote:
> My only related experience is trying to display html emails generated
> by an unenlightened vendor when firefox can't deal with the embedded
> "=E5"-style gibberish.

that "gibberish" is called quoted-printable[1]. it isn't specific to
html email, or even primarily used for it: it's one of the common
methods for encoding messages containing 8-bit characters as 7-bit
ASCII, suitable for transmission through older MTAs that can't handle
8-bit mail.

the most likely reason firefox can't display it correctly is that the
sender's MUA didn't set the correct MIME headers (in particular
"Content-Transfer-Encoding: quoted-printable") when creating the
email...probably outlook or one of the many crappy, half-arsed MUAs on
windows that don't bother implementing standards correctly. firefox has
no problem with QP if the mime headers are set correctly...in fact, most
if not all linux browsers and mail clients can decode and display it,
and most modern mail clients on any OS should be able to read and
display QP, even if they never send it.
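
as a rough illustration of why the headers matter, here's a minimal
python sketch using only the standard library's email package. the
message, the address and the helper name body_text() are all made up for
the example; the point is just that the same body is decoded when the
Content-Transfer-Encoding header is present and left as "=E5" text when
it's missing.

```python
from email import message_from_bytes
from email.policy import default

# "sma fagel"-style sample: 0xE5 is a-with-ring in ISO-8859-1, QP-encoded as =E5
BODY = b"sm=E5 f=E5glar\r\n"


def body_text(with_cte: bool) -> str:
    """Build a tiny (hypothetical) message and return its decoded body."""
    headers = [
        b"From: vendor@example.com",
        b"MIME-Version: 1.0",
        b"Content-Type: text/plain; charset=iso-8859-1",
    ]
    if with_cte:
        # this is the header that tells the reader the body is QP
        headers.append(b"Content-Transfer-Encoding: quoted-printable")
    raw = b"\r\n".join(headers) + b"\r\n\r\n" + BODY
    msg = message_from_bytes(raw, policy=default)
    return msg.get_content().strip()


decoded_ok = body_text(with_cte=True)        # escapes turned back into characters
decoded_missing = body_text(with_cte=False)  # escapes left in place as plain text

print(ascii(decoded_ok))
print(ascii(decoded_missing))
```

other mail software may guess differently when the header is absent, so
treat this as a demonstration of the mechanism rather than a statement
about any particular client.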


[1] http://en.wikipedia.org/wiki/Quoted-printable

    "Quoted-Printable, or QP encoding, is an encoding using printable
    ASCII characters (alphanumeric and the equals sign "=") to transmit
    8-bit data over a 7-bit data path or, generally, over a medium which
    is not 8-bit clean. It is defined as a MIME content transfer
    encoding for use in e-mail.

    QP works by using the equals sign "=" as an escape character. It
    also limits line length to 76, as some software has limits on line
    length."

the rest of the article is worth reading for an understanding of what QP
is and how to encode/decode it correctly (in short, the two characters
after the = sign are hex digits, 00-FF, giving the value of the encoded
byte).
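
to make the "=XX" rule concrete, here's a small python sketch. the
function name decode_escapes() is just something i made up for the
example, and it deliberately handles only the hex escapes, nothing else
in the QP spec. python's standard quopri module is used as a cross-check.

```python
import quopri
import re


def decode_escapes(data: bytes) -> bytes:
    """Replace each =XX escape with the single byte whose value is hex XX."""
    return re.sub(
        rb"=([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        data,
    )


encoded = b"=E5 is one byte, and =3D is a literal equals sign"
decoded = decode_escapes(encoded)

# the standard library decoder agrees for this simple input
reference = quopri.decodestring(encoded)

print(decoded)
print(decoded == reference)
```

note that the result is bytes, not text: turning 0xE5 into a character
on screen still depends on the charset the message declares.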

> I just put those messages through a couple of lines of awk, and the
> problem goes away.
>
> And the question did refer to sed, which is as old-unix as awk, and so
> nullifies __any__ hint of OT-ishness, I assure you.

it would probably be better and more reliable to write a simple
perl filter using MIME::Decoder[2], which relies on
MIME::QuotedPrint[3] for QP - the MIME::Decoder docs have an example
filter in 3 lines of perl:

    use MIME::Decoder;

    # read quoted-printable on stdin, write the decoded bytes to stdout
    my $decoder = MIME::Decoder->new('quoted-printable') or die "unsupported";
    $decoder->decode(\*STDIN, \*STDOUT);


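awk and/or sed can probably handle 90+% of cases, or at least make them
less ugly to view, but they tend to trip over soft line breaks (a
trailing "=" that joins a line to the next one) and multi-byte charsets
like utf-8. a proper decoder script should handle 100%, and convert them
back to 8-bit text.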
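
here's a hedged python sketch of the difference, again using only the
standard quopri module. the sample string and the names naive_decode()
and qp_filter() are invented for the example. the naive version stands
in for a one-line substitution of the sed/awk kind; the filter is a
stream-to-stream decoder in the same spirit as the perl one above.

```python
import io
import quopri
import re

# "Zurich cafe" with accents, UTF-8 encoded, then QP-encoded and wrapped
# with a soft line break (the "=" at the end of the first line)
encoded = b"Z=C3=BCrich caf=\n=C3=A9"


def naive_decode(data: bytes) -> bytes:
    """Substitute =XX escapes only, ignoring soft line breaks."""
    return re.sub(
        rb"=([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        data,
    )


def qp_filter(src, dst) -> None:
    """Decode quoted-printable from one binary stream to another."""
    quopri.decode(src, dst)


naive = naive_decode(encoded)  # still contains the stray "=" and newline

out = io.BytesIO()
qp_filter(io.BytesIO(encoded), out)
proper = out.getvalue()        # soft line break removed

text = proper.decode("utf-8")  # two bytes per accented character

print(naive)
print(proper)
```

to use qp_filter() as an actual command-line filter you could pass it
sys.stdin.buffer and sys.stdout.buffer instead of the in-memory streams.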


[2] http://search.cpan.org/~dskoll/MIME-tools-5.505/lib/MIME/Decoder.pm
[3] http://search.cpan.org/~gaas/MIME-Base64-3.15/QuotedPrint.pm

craig

-- 
craig sanders <[email protected]>

BOFH excuse #360:

Your parity check is overdrawn and you're out of cache.