Re: Proposed rule for too many dots in From

RW Fri, 21 Dec 2018 06:29:15 -0800

On Thu, 20 Dec 2018 21:12:33 -0700
Grant Taylor wrote:

> On 12/20/18 8:34 PM, Grant Taylor wrote:
> > I'm going back through and analyzing how I'm extracting data and
> > trying to satisfactorily explain some oddities.  
> 
> Here's how the dots in the user parts of 16,528 unique addresses out
> of 244,921 messages break down:
> 
>    13,277               (no dots 80.3%)
>     2,936 .             ( 1 dot  17.7%)
>       281 ..            ( 2 dots  1.7%)
>        29 ...           ( 3 dots  0.2%)
>         3 ....          ( 4 dots  0.0%)
>         1 .....         ( 5 dots  0.0%)
>         1 ...........   (11 dots  0.0%)
> 
> So, in light of this information, I would be willing to concede that
> 3 or more dots is possibly an indicator of spam.


I think that's a bit premature. Without separate figures for spam and
ham, you can't even say whether any of these is a good spam indicator,
even in isolation.
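
To make that concrete, here is a rough Python sketch of the kind of
tally I mean: the same dot count, but kept separately for spam and ham.
The function names and the (label, address) input format are my own
assumptions for illustration, not anything from your scripts.

```python
from collections import Counter


def dots_in_local_part(address):
    """Count dots in the part of the address before the last '@'."""
    # rpartition returns an empty local part when there is no '@',
    # so a malformed address simply counts as zero dots.
    local, _, _ = address.rpartition("@")
    return local.count(".")


def tally_by_class(labelled_addresses):
    """Build one dot-count histogram per class.

    labelled_addresses: iterable of (label, address) pairs, where
    label is "spam" or "ham" (an assumed, hand-classified input).
    """
    counts = {"spam": Counter(), "ham": Counter()}
    for label, address in labelled_addresses:
        counts[label][dots_in_local_part(address)] += 1
    return counts
```

Only with the two histograms side by side can you tell whether a given
dot count leans towards spam at all.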

> My previous log methodology 

Isn't a sound method for scoring. For one thing, it assumes that more
dots are more spammy. It could be that the S/O peaks at 4.
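
As a sketch of how you could check this, the Python below computes an
S/O figure per dot count. I'm assuming S/O here means the spam hit rate
divided by the sum of the spam and ham hit rates, so that unequal
corpus sizes don't skew it; the function name and the dict inputs are
hypothetical.

```python
def s_o_by_dots(spam_counts, ham_counts, n_spam, n_ham):
    """Return {dot_count: S/O} from per-class hit counts.

    spam_counts, ham_counts: dicts mapping dot count -> number of
    messages (or addresses) in that class with that many dots.
    n_spam, n_ham: total size of each class; both must be non-zero.
    """
    ratios = {}
    for dots in sorted(set(spam_counts) | set(ham_counts)):
        spam_rate = spam_counts.get(dots, 0) / n_spam
        ham_rate = ham_counts.get(dots, 0) / n_ham
        total = spam_rate + ham_rate
        if total == 0:
            continue  # no hits in either class, S/O is undefined
        ratios[dots] = spam_rate / total
    return ratios
```

If the ratio rises up to some dot count and then falls again, a single
"N or more dots" rule with a score that grows with N would be wrong.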

For another, scoring should be about the balance of extra TPs and FPs
that the rule creates. Sometimes the spammier-looking rule mostly hits
spam that already scores high, and so warrants a lower score.
