Re: [SAtalk] Porn scoring

2002-02-07 Thread dman
On Thu, Feb 07, 2002 at 01:48:46PM -0500, James Golovich wrote:
| On Thu, 7 Feb 2002, dman wrote:
| > On Thu, Feb 07, 2002 at 09:12:23AM -0600, Shane Williams wrote:
| > | I was looking at the porn expressions and scoring, and thought of an
| > | idea to shoot by everybody.
| > |
| > | If I'm rea

Re: [SAtalk] Porn scoring

2002-02-07 Thread Craig Hughes
Hmm, I just ran a test run with the so-far-accumulated new nonspam corpus and the spam corpus, and PORN_3 comes out with a negative score. So clearly the rule needs work. C On Thu, 2002-02-07 at 10:48, James Golovich wrote: > > > On Thu, 7 Feb 2002, dman wrote: > > > On Thu, Feb 07, 2002 at

Re: [SAtalk] Porn scoring

2002-02-07 Thread Craig Hughes
I was actually thinking of removing the 3-in-a-line restriction, but splitting the rule into at least 2 pieces: naughty words, and definite signs of porn spam. "fuck" falls in the former category, "cum" in the latter. I'll basically check each word's frequency in the corpus and separate the wor

Re: [SAtalk] Porn scoring

2002-02-07 Thread James Golovich
On Thu, 7 Feb 2002, dman wrote:
> On Thu, Feb 07, 2002 at 09:12:23AM -0600, Shane Williams wrote:
> | I was looking at the porn expressions and scoring, and thought of an
> | idea to shoot by everybody.
> |
> | If I'm reading the PORN_3 rule correctly,
>
> I had set all the PORN_* rules to 10

Re: [SAtalk] Porn scoring

2002-02-07 Thread dman
On Thu, Feb 07, 2002 at 09:12:23AM -0600, Shane Williams wrote:
| I was looking at the porn expressions and scoring, and thought of an
| idea to shoot by everybody.
|
| If I'm reading the PORN_3 rule correctly,
I had set all the PORN_* rules to 10.0 in my config. I kept getting a significant nu

[SAtalk] Porn scoring

2002-02-07 Thread Shane Williams
I was looking at the porn expressions and scoring, and thought of an idea to shoot by everybody. If I'm reading the PORN_3 rule correctly, you must have three of the listed strings within 15 characters of each other, and this scores .7 if caught. Two things seem strange about this. First, how o
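
For readers following along, here is a rough sketch of the rule as Shane describes it: three listed strings, each within 15 characters of the next, scoring 0.7 on a hit. This is not the actual PORN_3 expression, which is not quoted in the thread. The word list (`TERMS`), the helper name `porn3_like_score`, and the reading of "within 15 characters" as "at most 15 characters between consecutive matches" are all assumptions made for illustration.

```python
import re

# Hypothetical placeholder terms; the real rule's word list is not shown here.
TERMS = ["alpha", "beta", "gamma"]
MAX_GAP = 15   # assumed: max characters between consecutive listed strings
SCORE = 0.7    # score quoted in the message above

# One alternation matching any listed term.
_word = "(?:" + "|".join(map(re.escape, TERMS)) + ")"

# Three terms in a row, each separated by at most MAX_GAP characters.
PATTERN = re.compile(
    rf"\b{_word}\b.{{0,{MAX_GAP}}}\b{_word}\b.{{0,{MAX_GAP}}}\b{_word}\b",
    re.IGNORECASE,
)

def porn3_like_score(text: str) -> float:
    """Return SCORE if three listed terms occur close together, else 0.0."""
    return SCORE if PATTERN.search(text) else 0.0
```

Under this reading, a message with only two listed strings, or with three spread far apart, scores nothing.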