> Now I really want to do this. I'll see what I'm up to this weekend. :-)
heh, it all looks good to me. I think I'm just not quite sure what you're up to (that, and underscores in field names confuse me for some reason ;).

> What really can you track with this besides scoring and the correlation of
> current email styles and how the tests react to them? I was also thinking of
> maybe adding some data from the headers which would track where the email
> came from but then again I don't want to recreate the razor or another SA
> clone. :-)

well, by using this data to make spamassassin a much more accurate detector, you could start collecting data on the spammers who send out those messages. you could generate blacklists/filters based on information that they put into their spams (phone numbers, web pages, IPs, etc). Not to mention making it easier for those of us who bother reporting them to ISPs and (for WA residents) the attorney general.

> Offhand, how does Razor get false positives? I thought that it was MD5-based
> and the email had to be exact?

it does. but md5 doesn't generate a unique id... there's no way that a smallish number can be used to identify an infinite number of possible email combinations. so while md5 can be used to check the integrity of data (since the value will change when even one bit in the checked files changes), it becomes inaccurate when you're trying to compare DIFFERENT things, since you can have two vastly different source files that end up with the same checksum.

although this is a bit off topic, a similar system could be designed that would work around this, maybe by using the sa score and some kind of unique id generated from a part of the message headers that wouldn't change for each user...

> Yes, that is why I'm thinking of creating this database -- we can see what
> tests are consistently bad and modify/eliminate them. I have a terrible
> problem with opt-in lists being tagged, as well as financial lists.

yeah, it's not easy...
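To make the md5 point above a bit more concrete, here's a rough Python sketch. It is NOT how Razor actually works; the function names, the sample message and the normalization step are all made up for illustration. It just shows that a digest over the whole message changes as soon as one per-recipient header differs, while a digest over only the (normalized) body stays the same for every recipient.

```python
import hashlib
from email import message_from_string


def full_digest(raw: str) -> str:
    """MD5 over the entire raw message, headers included."""
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


def body_digest(raw: str) -> str:
    """MD5 over the body only, lowercased and with whitespace collapsed.

    Hypothetical normalization: a real system would have to pick its own.
    """
    msg = message_from_string(raw)
    if msg.is_multipart():
        # Join the payloads of the leaf parts; skip the container parts.
        body = "\n".join(
            str(part.get_payload())
            for part in msg.walk()
            if not part.is_multipart()
        )
    else:
        body = msg.get_payload()
    normalized = " ".join(body.lower().split())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# Made-up spam that differs only in the To: header.
SPAM_TEMPLATE = (
    "From: seller@example.com\n"
    "To: {rcpt}\n"
    "Subject: Buy now\n"
    "\n"
    "Call 555-0100   today!\n"
)

copy_a = SPAM_TEMPLATE.format(rcpt="alice@example.org")
copy_b = SPAM_TEMPLATE.format(rcpt="bob@example.org")

print(full_digest(copy_a) == full_digest(copy_b))  # whole-message digests
print(body_digest(copy_a) == body_digest(copy_b))  # body-only digests
```

The first comparison fails because the To: header differs, the second one matches. A spammer who randomizes a few words in the body would still defeat the body digest, so this only illustrates the exact-match limitation, it doesn't solve it.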
another thing I've considered (and I'm ashamed to admit that microsoft seems to have come up with the idea) is to create whitelists based on your address book, so messages from people who are already in your address book can be flagged as not-spam (or just given a hefty negative score). The problem is that this was a feature IN an email client, and it'd be a hassle to write importers for the various email clients used by *nix users (that, and evolution has a HORRID exporter for its address book).

-Chris

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk