-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I now have mass-check and hit-frequencies working on my system, and am digging into some of the hit-frequencies results. While doing so, I found three copies of the same email in three different ham files. I cleaned that up.
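A check for copies spread across several corpus files could be sketched roughly as below. This is only an illustration, not part of mass-check or hit-frequencies: it assumes the ham and spam corpora are stored as mbox files, that the Message-ID header is a good enough key for "the same email", and the function name find_duplicates is made up for this example.

```python
import mailbox
from collections import defaultdict


def find_duplicates(mbox_paths):
    """Return {message_id: [(path, key), ...]} for IDs seen more than once.

    Assumption: each path is an mbox file, and two messages with the same
    Message-ID header count as copies of one email.
    """
    seen = defaultdict(list)
    for path in mbox_paths:
        # create=False: fail on a missing file instead of creating an empty one
        box = mailbox.mbox(path, create=False)
        try:
            for key, msg in box.iteritems():
                msg_id = str(msg.get("Message-ID") or "").strip()
                if msg_id:  # messages without a Message-ID are skipped
                    seen[msg_id].append((path, key))
        finally:
            box.close()
    # Keep only the IDs that occur in more than one place
    return {mid: locs for mid, locs in seen.items() if len(locs) > 1}
```

Calling find_duplicates with a list of mbox paths returns, for each repeated Message-ID, the files and message keys where it occurs. Nothing is deleted, so the extra copies can be reviewed by hand before cleanup.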
I then ran an analysis by message ID to look for other duplicates and found a bunch, in both ham and spam, which I'm also cleaning up.

I also found some not-quite duplicates. Example:

Spam [EMAIL PROTECTED]
From: "Savvy Investor" <[EMAIL PROTECTED]>

exists 3 times in my Dec 20 spam file. Though the message is the same, and the message ID is the same, the three copies are dated

Date: Fri, 19 Dec 2003 16:24:29 -0600
Date: Fri, 19 Dec 2003 16:23:56 -0600
Date: Fri, 19 Dec 2003 16:23:07 -0600

I'm going to treat these as identical and delete the extra copies. I've also found similar copies dated a month or two apart, and deleted the extra copies of those as well.

I'm assuming that the development team has also run into this while working on the GA run. What do you do with these types of almost-duplicates?

Bob Menschel

-----BEGIN PGP SIGNATURE-----
Version: PGP 8.0.3

iQA/AwUBP/fFhpebK8E4qh1HEQIXiACfbvgE0CFzr3CEOIZGRQXjptY07+4AoPsC
mb7Eh/1K0Edty0frdA6T0pL1
=i4uM
-----END PGP SIGNATURE-----

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk