I've been using Justin Mason's auto-generated rule set since mid-October and am fairly happy with it. Up until Jan 11, false positives averaged about 10% of the hits, which I can live with.
I noticed a surprising change on Jan 11, 2008. Before that day, many of the hits were on low-scoring (< 20) spam, which was very helpful, and I would see many of these every day. Since Jan 10 I've seen only 4 messages that hit on low-scoring spam; the rest hit on very high-scoring spam. I don't get any more FPs, but because the spam scores for these messages are already through the roof, the usefulness of the current rule sets has diminished for the moment. I assume the methods for creating the rules are still under development, though, and I am looking forward to more improvements.

Was there a big change in the way rules were created around that time?

Thanks for the great work!

Nedry