On Tue, 2011-03-29 at 13:14 -0500, Max wrote:
> For a while we were getting spam messages that had images embedded as
> text and not an attachment. Those are marked as spam but couldn't the
> random characters of the image data increase the entropy of the database
> and cause some less than definitive scores?
>
> That aside. It seems like all my ham is below 0 so would changing the
> cut off to something like 2.0 be bad practice?
>
As the others have found, that message scores a lot higher here:

Content analysis details:   (9.5 points, 6.0 required)

 pts rule name              description
---- ---------------------- -------------------------------------------
 1.9 URIBL_JP_SURBL         Contains an URL listed in the JP SURBL blocklist
                            [URIs: dailynewdesign.com]
 1.7 URIBL_DBL_SPAM         Contains an URL listed in the DBL blocklist
                            [URIs: dailynewdesign.com]
-0.0 SPF_PASS               SPF: sender matches SPF record
 3.6 FB_THIS_ADVERT         BODY: Phrase: this advertiser
 1.0 MG_MEDPHRASE           Medication phrase
 1.3 RDNS_NONE              Delivered to internal network by a host with no rDNS

It also hit three local rules, which added a total of 0.2 to the score -
they are all low scoring as they are used to trigger rather specific
meta rules. I've edited them out and adjusted the score accordingly.

I notice you're still running SA 3.2.5 while the current version (for
Fedora 14 packages) is 3.3.2. Time to upgrade?

Apart from the Bayes training that others have commented on, I notice
you haven't had either of the URIBL hits I got. That could be for either
of two reasons:

- you're not using URI blacklists
- the URI blacklist databases got updated in the interval between your
  check and when I scanned the message.

If you're not using blacklists, might it be time to start doing so?

Martin
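
To make the arithmetic behind the report concrete, here is a minimal Python sketch that sums the per-rule points listed above and compares the total with the 6.0 threshold. The names `hits`, `total_score` and `is_spam` are made up for illustration and are not part of SpamAssassin. The sketch assumes a message counts as spam when its total is at or above the required score, and it only uses the rows shown in this report, not the rules hit in the original poster's own scan.

```python
# Rule scores copied from the report above (local rules already edited out).
hits = {
    "URIBL_JP_SURBL": 1.9,
    "URIBL_DBL_SPAM": 1.7,
    "SPF_PASS": -0.0,
    "FB_THIS_ADVERT": 3.6,
    "MG_MEDPHRASE": 1.0,
    "RDNS_NONE": 1.3,
}


def total_score(scores):
    """Sum per-rule points, rounded to one decimal as in the report."""
    return round(sum(scores.values()), 1)


def is_spam(scores, required):
    """Assumption: spam when the total is at or above the required score."""
    return total_score(scores) >= required


# Same rows with the two URI blacklist hits removed.
non_uribl = {name: pts for name, pts in hits.items()
             if not name.startswith("URIBL_")}

total = total_score(hits)
without_uribl = total_score(non_uribl)

print(total, is_spam(hits, 6.0))
print(without_uribl, is_spam(non_uribl, 6.0))
```

With these particular rows, dropping the two URIBL hits leaves the remaining rules just short of the 6.0 threshold, which shows how much the blacklist lookups contribute for this message.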