Theo Van Dinter wrote:
But why waste the extra 30 bytes, when most people (especially if you know they can receive HTML mail). Most clients automatically reply in the format they receive in. Most users don't even realize the difference between formats.

It depends on your client. I've seen 30 byte text messages take 4k in HTML.
The problem I see is that HTML mail is intended to replace plain text email, and is tending to do so. As a result, more false positives will occur in 6 months, 1 year, and so on. Why focus on and rely so heavily on what we know will be causing this, rather than work to perfect spam detection so that ONLY spammers are detected? SpamAssassin should not add so many points, simply based on the format, for what can be legitimate email fully compliant with all specifications and standards. It's asking for false positives. The rule should exist, but it should not be so heavy, since it will become more of an issue as time goes on.

I don't know, being able to catch 59% of my spam with a 99.4% correct hit rate is in no way "outdated".

Does that mean you'll have the same results? No. But that's why you can submit mass-check results. ;)
Rumor is that the new version of Outlook in the works has expanded HTML mail capabilities as well. As HTML becomes more common, plain text will disappear, since there is no need for it. It doesn't take too much to strip the tags and turn images into [img]. It's very easy to go to plain text if the software isn't designed for HTML mail. I even saw some module for a mail server (don't remember which) that can do this automatically, on the server side (for businesses that run Eudora and allow WAP email checking, which isn't good with HTML).
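To give a rough idea of how little it takes, here is a minimal Python sketch of the "strip the tags and turn images into [img]" idea, using only the standard library's html.parser. The names PlainTextExtractor and html_to_text are made up for this example. It is not the mail server module mentioned above, and it skips details a real converter would need (script/style content, links, tables).

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Collects text content, replacing each <img> with the marker [img]."""

    # Tags treated as line breaks in the plain-text output (an arbitrary choice).
    BREAK_TAGS = {"br", "p", "div"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <br/> or <img/>.
        if tag == "img":
            self.parts.append("[img]")
        elif tag in self.BREAK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return a plain-text rendering of an HTML fragment."""
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


print(html_to_text('<p>Hello <b>world</b> <img src="logo.gif"></p>'))
```

All other tags are simply dropped, so only the text and the [img] markers survive.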
The rule is pushing for more good email to go to jail.
As far as the amount of spam caught by this rule, from a visual examination of about 30 HTML emails, all are caught by:
BODY: HTML link text says "click here"
MAILTO_LINK (0.2 points) BODY: Includes a URL link to send an email
MAILTO_TO_SPAM_ADDR (0.6 points) URI: Includes a link to a likely spammer email address
FORGED_MUA_OUTLOOK (1.0 points) Forged mail pretending to be from MS Outlook
CLICK_BELOW (0.3 points) Asks you to click below
HTML_IMAGE_RATIO_06 (1.0 points) BODY: HTML has a low ratio of text to image area

and other HTML-based rules. They tend to get the job done in detecting them, since spammers like HTML for a reason: images and links. And SpamAssassin knows that.
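For a sense of scale, here is a small Python sketch that tallies the five rules above whose scores are listed. It assumes SpamAssassin's default threshold of 5.0 points (the required_score setting; a given site may use a different value). The "click here" rule is left out because its score isn't shown in the listing, and the function names are hypothetical.

```python
# Scores copied from the listing above. The "click here" rule is omitted
# because its score does not appear in the listing.
RULE_SCORES = {
    "MAILTO_LINK": 0.2,
    "MAILTO_TO_SPAM_ADDR": 0.6,
    "FORGED_MUA_OUTLOOK": 1.0,
    "CLICK_BELOW": 0.3,
    "HTML_IMAGE_RATIO_06": 1.0,
}

# Assumption: the stock default threshold; local configuration may differ.
REQUIRED_SCORE = 5.0


def total_score(hits, scores=RULE_SCORES):
    """Sum the scores of the rules that hit, rounded to one decimal place."""
    return round(sum(scores[name] for name in hits), 1)


def is_spam(hits, required=REQUIRED_SCORE):
    """A message is tagged as spam once its total reaches the threshold."""
    return total_score(hits) >= required


total = total_score(RULE_SCORES)
print(f"total from listed rules: {total} (threshold {REQUIRED_SCORE})")
print("tagged as spam on these rules alone:", is_spam(RULE_SCORES))
```

Under that assumption these five rules alone come to 3.1 points, so the rest has to come from the "click here" rule and the other HTML-based rules.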
It is, but that's what it should be relying on: rules that detect spam, not a bad (or modern) email client.

I must be crazy, I thought that is how SA worked.
Good, then you'll get a lot of them as I analyze more mail.

been really analyzing spam by hand over the past few weeks to create a few bugs with some improved rule ideas. Expect to see quite a few over

Good, we like new rule ideas.
Very true, but how many heed that advice? And how many check their convicted spam more than once a day? And how many look closely? Appealing a conviction is hard. Most serve life sentences in SpamJail. Perhaps the justice system is corrupt, or the jury is too lazy to look closely, but it seems the best approach is to zero in better on the bad spam, and keep the good out.

Yes, the 97% catch rate that I have now is horrible...

That's why we tell people not to delete their messages by default.