On Tue, Dec 11, 2012 at 12:06:03PM -0600, Derek Martin wrote:
> HTML provides all of the features [...]

HTML also provides all of the bugs.

"Nothin's free." (cf. "Crossfire", Stevie Ray Vaughan, 1989)

Several points (and these aren't exhaustive, merely illustrative):

1. It is very difficult to conduct a technical exploit against a plaintext
MUA.  Not impossible, of course, but very difficult.  Adding all the
complexity of an HTML engine vastly increases the attack surface.
And at a time when the Internet-wide state of security is best described
as "horrific and getting worse" one of the very last things anyone
should want/advocate is an increase in aggregate vulnerability.

This clearly gets still worse if users utilize a full-blown web browser,
and yet worse if that browser incorporates Java, Javascript, Flash, etc.
And it probably (but not certainly; jury still out) gets worse again
once HTML5 is widely deployed.

2. HTML email is a spammer/phisher's best friend:

        <a href="http://notyourbank.example.com">YourBank</a>

Add a million variations using scripting and redirectors and
shorteners and typosquatting and everything else.  It works all day,
every day, and I think by now we can safely conclude that no amount
of end-user education will ever stop it from working.  (See Marcus Ranum's
brilliant essay-rant, "The Six Dumbest Ideas in Computer Security",
and note #5.)
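
To make the mismatch concrete, here is a minimal Python sketch (mine,
not taken from any real MUA) that separates what an HTML link displays
from where it actually points.  The names LinkCollector and
link_targets are made up for illustration; only the standard library's
html.parser and urllib.parse are used.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect (visible text, href) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def link_targets(html):
    """Return (visible text, destination hostname) for each link."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return [(text, urlparse(href).hostname) for text, href in parser.links]


sample = '<a href="http://notyourbank.example.com">YourBank</a>'
for text, host in link_targets(sample):
    print("reader sees:", text, "| click goes to:", host)
```

The point of the sketch is only that the two values are independent:
nothing in HTML ties the text the reader sees to the host the link
reaches.  In plaintext the URL is the only thing there is to show.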

3. HTML enables web bugs, an excellent means to invade the privacy of
users...and worse.  Consider, among many scenarios: what happens when all
the data accrued by an entity using these is acquired by a third party,
either because they hacked it or because it was accidentally disclosed or
because it was sold over or under the table?  You might trust YourBank
with this information -- although you shouldn't -- but how do you feel
about having all of it handed over to random third parties?  Especially
ones that have already demonstrated hostile intent?  How do you feel about
The Bad People having every IP address you've ever used to read mail
from YourBank, the MUA/browser versions/extensions you've used, the
timestamps, etc.? [1]

Anyway, my point is that a huge amount of data is being gleaned
from web bugs all day, every day.  Given the size of that corpus
and the number of collectors, it's absurd to suggest that it'll
all stay put.  Of course it won't.  As you're reading this, some
junior server engineer at Example Corp. is writing all the 2008-2012
web bug logs in compressed form onto a 32G USB stick and getting ready
to walk out the door with them at 6 o'clock.  Tax-free holiday bonus
for him/her, too bad for you and a bazillion other people.
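
For anyone who hasn't looked at one: a web bug is usually nothing more
than a remote image reference that the HTML engine fetches as soon as
the message is displayed.  The rough Python sketch below lists the
remote URLs a message body would ask a renderer to load.  The
AUTO_FETCH list is my own assumption and is not exhaustive, the
tracker URL is hypothetical, and real engines differ in what they
load by default.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed, non-exhaustive list of (tag, attribute) pairs that a rendering
# engine may fetch on display, with no click from the reader.
AUTO_FETCH = {("img", "src"), ("iframe", "src"),
              ("script", "src"), ("link", "href")}


class FetchFinder(HTMLParser):
    """Record remote URLs referenced by auto-loading elements."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if (tag, name) in AUTO_FETCH and value:
                # Only http/https count as remote; cid: and data: do not.
                if urlparse(value).scheme in ("http", "https"):
                    self.urls.append(value)


def remote_fetches(html):
    """Return the remote URLs that displaying this HTML could request."""
    finder = FetchFinder()
    finder.feed(html)
    finder.close()
    return finder.urls


# Hypothetical message body with a 1x1 tracking image.
sample = ('<p>Your statement is ready.</p>'
          '<img src="http://tracker.example.com/open.gif?id=12345" '
          'width="1" height="1">')
print(remote_fetches(sample))
```

Every URL in that list is a request that carries the reader's IP
address, a timestamp and whatever headers the engine sends, which is
exactly the data described above.  A plaintext MUA makes none of them.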

4. HTML markup in email is uniformly awful.  Really.  Go look at
some of it.  Feed it to htmltidy.  Feed it to the W3C validator.
Feed it to whatever tool you like.  It's not uncommon for the ratio
of errors/line to exceed unity.

This has many consequences, security and otherwise, but one of the more
obvious things that it means is that the HTML-marked-up email message
which looks glorious to the sender may end up being unintelligible rubbish
to the receiver -- depending on which HTML engine both are using.

5. HTML markup in email bloats it.  Horribly.  I've seen messages expanded
by 2000%.  This has impacts all along the way: more bandwidth eaten
at every step.  More storage.  More CPU chewed up sending, receiving,
scanning, etc.

Do you really think a message marked up with HTML communicates 480%
or 820% or 1750% better than one in plaintext?  I don't.
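
If you'd like to measure this on your own mail, a small Python sketch
along these lines will do it.  It compares the decoded size of the
text/plain and text/html parts of a message using the standard email
package.  The function names are mine, the sample message is made up,
and "expansion" is taken here to mean percent increase over the
plaintext part, which is my assumption about how to count it.

```python
from email.message import EmailMessage


def text_part_sizes(msg):
    """Return decoded payload size in bytes per text/* content type."""
    sizes = {}
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            payload = part.get_payload(decode=True) or b""
            ctype = part.get_content_type()
            sizes[ctype] = sizes.get(ctype, 0) + len(payload)
    return sizes


def expansion_percent(plain_bytes, html_bytes):
    """Percent increase of the HTML part over the plaintext part."""
    return (html_bytes - plain_bytes) / plain_bytes * 100


# Hypothetical two-word reply with a made-up HTML alternative.
msg = EmailMessage()
msg.set_content("me too\n")
msg.add_alternative(
    '<html><head><meta http-equiv="Content-Type" '
    'content="text/html; charset=utf-8"></head>'
    '<body><div style="font-family: sans-serif">me too</div></body></html>\n',
    subtype="html",
)
sizes = text_part_sizes(msg)
print(sizes, expansion_percent(sizes["text/plain"], sizes["text/html"]))
```

Note that this counts decoded payload only.  Transfer encoding and the
extra MIME headers make the on-the-wire difference larger still.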

6. Of course point #5 gets still worse when user A responds to user B's
HTML-marked-up message, and user A's MUA tries to mark *that* up, not
recognizing that it already was, and then...   There is a military term
for this, and it rhymes with busterduck.  I've watched in mute (okay,
and not-so-mute) horror as email exchanges consisting of 1 or 2 lines
each mushroomed into 300K messages.  This makes a mere 2000% look
desirable by comparison.

Mobile carriers love all this, of course: makes the cash registers ring
and ring and ring with bandwidth charges.

But in terms of providing efficient, usable, cheap communication,
it's a disaster.  300K for a "me too" reply?  Really?!

7. There are some people (and I think it's fair to say there will
be more) who encrypt their email for various social, economic, personal,
political, etc. reasons.

What's a cryptographer's best friend when they're trying to crack
a message or messages?

Known plaintext.

What do HTML markup engines in MUAs insert?

Known plaintext.

Lots of it.

Oops.


I'm going to stop this laundry-list and move on to the following point,
but only because this message is already long enough.  So please note:
there are many more issues with HTML email than I've covered here.

8. Let's use your message, the one I'm replying to here, as an example.
It's literate.  It's articulate.  It clearly communicates.  EXACTLY how
would that message be improved by HTML markup?  And is that marginal
improvement worth the enormous cost in decreased security, increased
attack surface, message bloat, and everything else?

I don't think so.  Not even remotely close.

Now go find some other message by some other person -- 1 that reeds
lik dis u no? -- how, EXACTLY, is that illiterate tripe going to
be improved by HTML markup?  (Unless it's redacted by overstrikes,
which would be welcome relief.)

I'm not suggesting, by the way, that all human communication can
be handled by plain text.  I'm suggesting that there are numerous
compelling reasons why *email* should be so and that when we find
ourselves faced with those (relatively infrequent) occasions where
that solution is inadequate, HTML markup is not the answer, because
the cost/benefit ratio is hideous.

---rsk

[1] Let's not pretend that YourBank will adequately safeguard this
information.  Why should they?  It's not *their* data.  And the
consequences for losing it are negligible: nobody gets fired, nobody
misses a bonus.  And well, good security costs money, money that could
instead be corporate or personal profit.  Besides, if it leaks, they
can hold a press conference and use the favorite phrase of organizations
which have just leaked data they should never have had: "nobody could have
foreseen" and the press will dutifully play their role as stenographers
and report that...despite the fact that lots of people foresaw it years
ago and said so.
