Ok, I'll go look at what was in CVS and build the word list from there. I agree on the number-of-words issue. We can probably get around that by calculating the percentage of words in the body that are on the list, instead of having a hard threshold, i.e. more like the spam phrases stuff, where it comes up with a "porn phrase" score...
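A rough sketch of the percentage idea, in Python for illustration only (the actual eval test would be Perl inside SpamAssassin). The function name, the tokenizing regex, and the placeholder word list are all assumptions, not anything in CVS:

```python
import re

# Placeholder entries; the real list would be built from the PORN_3 rule in CVS.
WORD_LIST = {"alpha", "bravo", "charlie"}


def listed_word_ratio(body, word_list=WORD_LIST):
    """Return the fraction of words in body that appear on word_list (0.0 to 1.0)."""
    # Crude tokenizer: runs of letters and apostrophes, lowercased.
    words = re.findall(r"[a-z']+", body.lower())
    if not words:
        # An empty body has no listed words; avoid dividing by zero.
        return 0.0
    hits = sum(1 for w in words if w in word_list)
    return hits / len(words)
```

The rule would then compare this ratio against a tunable cutoff rather than a fixed count of three, so a long message with a few incidental hits scores lower than a short one made up mostly of listed words.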
C Michael Moncur wrote:

MM> > Yeah, while incorporating it then running make test I've found a
MM> > few issues :)
MM> > For starters, the \b$word\b is not right in all cases. Want to
MM> > trap -ed, -ing, -es, etc suffixes on the verbs among other things.
MM> > It's a good starting base though.
MM>
MM> While you're looking at it, here are a couple more issues - first, isn't the
MM> eval test looking for three porn-like words *in the entire body* while the
MM> current PORN_3 looks for three words separated by 0-15 characters? Wouldn't
MM> this make it more likely to trigger as a false positive? Perhaps a count
MM> higher than 3 would be better?
MM>
MM> Second, the @porn_words Daniel posted doesn't include all of the words that
MM> PORN_3 does. It's missing everything from, er, "whore" to "titties" in the
MM> current CVS.
MM>
MM> I'm sure it's better than the current PORN_3 regardless.

_______________________________________________
Spamassassin-talk mailing list
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
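On the suffix point raised in the quoted message, one possible shape for the pattern is to follow each word with an optional suffix group before the closing \b. This is a Python sketch rather than the Perl used in the rules; the helper name and the sample words are made up, and the suffix set (-s, -es, -ed, -ing) is just an assumption about what needs trapping:

```python
import re


def suffix_pattern(words):
    """Compile one regex matching any listed word, optionally followed by a common suffix."""
    # Longest words first so a short entry never shadows a longer one it prefixes.
    ordered = sorted(words, key=len, reverse=True)
    alternation = "|".join(re.escape(w) for w in ordered)
    # \b anchors both ends; the optional group traps -s, -es, -ed and -ing.
    return re.compile(r"\b(?:%s)(?:s|es|ed|ing)?\b" % alternation, re.IGNORECASE)


# Neutral stand-in words, not entries from the real list.
pattern = suffix_pattern(["click", "wish"])
matches = pattern.findall("Clicking clicked wishes clicker")
```

Here "clicker" is not matched because -er is not in the suffix group. Appending suffixes this way also misses forms that change the stem, such as a doubled consonant or a dropped final "e", so those would still need their own list entries.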