-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I have mass-check and hit-frequencies working now on my system, and am
digging into some of the hit-frequencies results.  While doing so, I
found three copies of the same email in three different ham files, and
cleaned that up.

I then did an analysis by message id of other duplicates and found a
bunch, ham and spam, which I'm also cleaning up.
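
A minimal sketch of that kind of message-ID analysis, not the exact
procedure used here. It assumes each message is available as raw text
paired with a label for the file it came from; the function name
find_duplicate_ids and the labels are made up for illustration.

```python
from collections import defaultdict
from email import message_from_string


def find_duplicate_ids(messages):
    """Group (source, raw_message) pairs by Message-ID.

    Returns a dict mapping each Message-ID seen more than once to the
    list of sources it appeared in, in input order.
    """
    seen = defaultdict(list)
    for source, raw in messages:
        msg = message_from_string(raw)
        msg_id = (msg.get("Message-ID") or "").strip()
        if msg_id:  # messages without an ID cannot be matched this way
            seen[msg_id].append(source)
    return {mid: srcs for mid, srcs in seen.items() if len(srcs) > 1}
```

If the corpus is stored as mbox files, Python's standard mailbox module
could supply the messages; that storage format is an assumption too.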

I also found some not-quite duplicates. Example:
Spam [EMAIL PROTECTED] From: "Savvy Investor"
<[EMAIL PROTECTED]> exists 3 times in my Dec 20 spam file. The message
body and the message ID are the same in all three copies, but the Date
headers differ:
Date: Fri, 19 Dec 2003 16:24:29 -0600
Date: Fri, 19 Dec 2003 16:23:56 -0600
Date: Fri, 19 Dec 2003 16:23:07 -0600

I'm going to treat these as identical and delete the extra copies.
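
One way to apply that rule automatically, as a rough sketch under my
own assumptions: two messages count as the same when both the
Message-ID and the body text match, and the Date header is never
compared. The name dedupe_ignoring_date is hypothetical, and messages
with no Message-ID are always kept.

```python
import hashlib
from email import message_from_string


def dedupe_ignoring_date(raw_messages):
    """Keep the first copy of each (Message-ID, body hash) pair."""
    kept = []
    seen = set()
    for raw in raw_messages:
        text = raw.replace("\r\n", "\n")
        # Everything after the first blank line is treated as the body.
        _headers, _sep, body = text.partition("\n\n")
        msg_id = (message_from_string(text).get("Message-ID") or "").strip()
        digest = hashlib.sha256(body.encode("utf-8", "replace")).hexdigest()
        key = (msg_id, digest)
        if msg_id and key in seen:
            continue  # same ID and body as an earlier copy; Date is ignored
        seen.add(key)
        kept.append(raw)
    return kept
```

Hashing the body guards against deleting a message that reuses a
Message-ID but carries different content.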

I've also found similar copies dated a month or two apart, and deleted
the extra copies of those as well.

I'm assuming that the development team has also run into this while
working on the GA run. What do you do with these kinds of
near-duplicates?

Bob Menschel



-----BEGIN PGP SIGNATURE-----
Version: PGP 8.0.3

iQA/AwUBP/fFhpebK8E4qh1HEQIXiACfbvgE0CFzr3CEOIZGRQXjptY07+4AoPsC
mb7Eh/1K0Edty0frdA6T0pL1
=i4uM
-----END PGP SIGNATURE-----





_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
