On 04/12, John Hardin wrote:
> >Can you remind me how far below the threshold we are for corpora? If I hand
> >qualify another couple of thousand hams or so would that be significant? Or
> >is our deficit significantly larger than that?
> The current corpora are ham=50658, spam=245341.
>
> I don't remember what the thresholds currently are, but the numbers
> used in the past have been a multiple of 50k, so 100k, 150k, 200k or
> 250k. Darxus, you're more in tune with this than I am, what are the
> current thresholds?

Thresholds for both are 150000. Graph here, updated weekly:
http://www.chaosreigns.com/dnswl/tot.svg

According to that, we're at 29003 spams. That matches the latest net
run, which it's based on:
http://ruleqa.spamassassin.org/20120407-r1310705-n

So as of Saturday, we're at 19.3% of the spam corpora we need.

Spam age limit is 2 months.

The dev list gets an alert every day (from me) if updates haven't been
generated. It says:

"SpamAssassin version 3.3.2 has not had a rule update since 2012-02-25."

It's pretty obnoxious, but I think it's a big enough problem to justify
it being posted once a day (and I'm apparently not the only one).

New contributors aren't currently allowed due to
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6694
which has restricted visibility due to being a security bug. For the
past 69 days, it has been waiting for a reply from Warren Togami to okay
declaring it not actually a security problem (which I am in favor of).

It seems this requires another member of the PMC (project management
committee) to step in and declare this not a security bug. Or for
someone with sufficient access to otherwise "fix" it, which I suspect is
a very small set of people.

Once that's cleared up, new people would be able to contribute data
(just logs of rule hits, not actual email) via
https://wiki.apache.org/spamassassin/NightlyMassCheck

-- 
"Go forth, and be excellent to one another." - http://www.jhuger.com/fredski.php
http://www.ChaosReigns.com