> -----Original Message-----
> From: Larry Gilson [mailto:[EMAIL PROTECTED]
> Sent: Thursday, August 28, 2003 12:15 AM
> To: 'Martin Radford'
> Cc: [EMAIL PROTECTED]
> Subject: RE: [SAtalk] Message ID
>
>
> > -----Original Message-----
> > From: Martin Radford
> >
> > > On Tue, Aug 26, 2003 at 11:21:46AM +0100, Martin Radford wrote:
> > > > From my own collections:
> > > >
> > > >          with FQDN       with hostname only
> > > > ham:     2331 (85.6%)    391 (14.4%)
> > > > spam:    1925 (76%)      608 (24%)
> > > >
> > > > While I'm not very good with statistics, this rule doesn't
> > > > look very good for distinguishing ham from spam.
> >
> > Thinking about it, we need to flip the figures around a bit
> > to get this:
> >
> >                  ham             spam
> > with FQDN:       2331 (54.8%)    1925 (45.2%)
> > hostname only:   391 (39.1%)     608 (60.9%)
> >
> > So, if a mail has an FQDN after the '@' the chances of it
> > being spam are 45.2%. If it doesn't, then the chances of it
> > being spam are 60.9%. These are both *far* too close to 0.5
> > for me to want to pay attention to it as a rule.
>
> Now that is an interesting perspective. It really is too
> close for comfort.
>
> The one thing that has worked for me in using a Message-Id and a
> Resent-Message-Id has been a rule I use to test if my gateway
> added it. Since my gateway is a relay only, I should never see a
> Message-Id or a Resent-Message-Id created by it.
>
> I think I am mostly barking up the wrong tree with Message-Id.
> However, there is a pattern that I have yet to figure out with a
> "fake" Message-Id. When you look at the Message-Id on a spam
> message, you can just tell it is not right. However, a regex just
> will not distinguish it from the real thing.
>
> --Larry
>
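For anyone who wants to re-check the flipped figures quoted above, here is a minimal Python sketch. The names `counts` and `spam_fraction` are made up for illustration; only the four counts come from the quoted message. The percentages describe this one corpus and would likely shift with a different ham/spam mix.

```python
# Message counts quoted above, grouped by what follows the '@' in Message-Id.
counts = {
    "with FQDN": {"ham": 2331, "spam": 1925},
    "hostname only": {"ham": 391, "spam": 608},
}


def spam_fraction(row):
    """Return the fraction of messages in this group that are spam."""
    return row["spam"] / (row["ham"] + row["spam"])


for name, row in counts.items():
    # Prints each group's spam share as a percentage with one decimal.
    print(f"{name}: {spam_fraction(row):.1%} spam")
```

Both fractions sit close to 0.5, which is the point made in the quoted text: the FQDN test on its own separates ham from spam only weakly.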
Larry,

I couldn't agree more that a regex can't distinguish it from the real thing. I too have been looking at the Message-Id field. I know you have been wanting to see what I have found, but I just can't share it yet. I need more info first to really nail it down.

I've found that a particular spammer is using a particular pattern in the Message-Id header. However, this pattern can be completely legit in other emails, so it alone is not the tag. I'm trying to use a bunch of meta rules with the test for this Message-Id included. I just need more time to go over it.

I also want to add a raw rule, but I need to get these guys into my corpus first. So far they haven't scored above a 7, so they haven't been added. They score a 5.5 right now, which means I only have the munged Outlook/Exchange version :(

The most recent spam had the subject "I'm back, are you?" from last night. If anyone has a copy of this, I would love the raw base64 code of it pasted in an email for me to see.

--Chris Santerre

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk