Re: FSL_HELO_HOME: deep headers again

John Hardin Sat, 14 May 2016 10:11:43 -0700

On Sat, 14 May 2016, Reindl Harald wrote:

> On 14.05.2016 at 04:50, John Hardin wrote:
> > On Sat, 14 May 2016, Reindl Harald wrote:
> > > On 14.05.2016 at 04:04, John Hardin wrote:
> > > > How would a webservice be better? That would still be sending
> > > > customer emails to a third party for processing.
> > >
> > > uhm you missed "and only give the rules which hit and spam/ham flag
> > > out"
> >
> > Ah, OK, I misunderstood what you were suggesting.
> >
> > That wouldn't work. That tells you the rules they hit at the time they
> > were scanned, not which rules they would hit from the current testing
> > rules.
>
> on the other hand it would reflect the complete mail-flow and not just
> hand-crafted samples

It's not hand *crafted* samples, it's hand *classified* samples. The
message needs to be classified by a reliable human as ham or spam for the
analysis of the rules that it hits to have any use, or even be possible.

That's why doing something like having an SA install that's based on the
current SVN sandbox rules, and that gets a forked copy of your mail
stream, and that captures the hits, is still not useful for anything other
than gross "this rule didn't hit anything" analysis - you don't know what
a given message *should* have been, so you can't say anything about the
rules that hit it - whether they aid that result, or hinder it.

Unless your mail stream prior to SA is *guaranteed* 100% ham (which is
hugely unlikely or why would you be running SA at all?) or 100% spam
(which might be the case for a clean honeypot), you need to review and
classify the messages manually before performing the scan and reporting
the rule hits, and that means keeping copies of the pristine messages, at
least for a while.
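
To illustrate why the human classification matters, here is a minimal
Python sketch of per-rule hit rates. It is not the actual masscheck
tooling; the function name, the sample data, and HYPOTHETICAL_RULE are
made up for illustration. Without the ham/spam label on each sample,
neither rate can be computed.

```python
from collections import Counter

def rule_hit_rates(samples):
    """samples: iterable of (label, rules); label is 'ham' or 'spam' as
    assigned by a human reviewer, rules is the set of rule names hit."""
    totals = Counter()
    hits = {"ham": Counter(), "spam": Counter()}
    for label, rules in samples:
        totals[label] += 1
        hits[label].update(set(rules))
    rates = {}
    for rule in set(hits["ham"]) | set(hits["spam"]):
        # Fraction of ham (resp. spam) messages that hit this rule.
        ham_rate = hits["ham"][rule] / totals["ham"] if totals["ham"] else 0.0
        spam_rate = hits["spam"][rule] / totals["spam"] if totals["spam"] else 0.0
        rates[rule] = (ham_rate, spam_rate)
    return rates

# Invented sample data, for illustration only.
samples = [
    ("ham", {"FSL_HELO_HOME"}),
    ("ham", {"FSL_HELO_HOME", "HYPOTHETICAL_RULE"}),
    ("spam", {"HYPOTHETICAL_RULE"}),
    ("spam", set()),
]
rates = rule_hit_rates(samples)
```

In this toy data a rule that hits every ham and no spam shows up as
(1.0, 0.0), which is the kind of signal that says a positive score on
that rule is hurting.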

I don't know whether statutory requirements make this impossible for you
even if you did obtain consent from some of your clients to use their mail
stream in that manner.

> should be chained in a minimum negative score to count as ham and a minimum
> positive to count as spam - configurable because it depends on the local
> environment and adjustments which scores are clear classifications, 7.0 would
> here not be 100% spam, 12.0 would be, as an example

That's probably still not reliable enough for use in masscheck. Ham is a
bit more important; what would you recommend as a lower limit for
considering a message as ham? How many actual hams would meet that
requirement? It might be a lot of work for little final benefit. What
percentage of actual FNs would you see with that setting? Those would
damage the masscheck analysis.
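
Those questions can be made concrete with a small Python sketch of the
threshold idea. This is only an illustration under stated assumptions:
the function names are hypothetical, the 12.0 spam limit is the example
figure from the quoted proposal, and the -2.0 ham limit and the scored
messages are invented. It needs a set of messages whose true class is
already known in order to count the mislabeled ones.

```python
def auto_classify(score, ham_max, spam_min):
    """Return 'ham', 'spam', or None when the score is not a clear case."""
    if score <= ham_max:
        return "ham"
    if score >= spam_min:
        return "spam"
    return None

def threshold_report(scored, ham_max, spam_min):
    """scored: list of (score, true_label) for already-reviewed messages."""
    report = {"ham": 0, "spam": 0, "skipped": 0, "mislabeled": 0}
    for score, true_label in scored:
        guess = auto_classify(score, ham_max, spam_min)
        if guess is None:
            report["skipped"] += 1  # in the uncertain band, not submitted
            continue
        report[guess] += 1
        if guess != true_label:
            report["mislabeled"] += 1  # e.g. an FN counted as ham
    return report

# Invented scores and labels, for illustration only.
scored = [(-4.0, "ham"), (-2.5, "ham"), (-2.1, "spam"), (0.3, "ham"),
          (7.0, "spam"), (12.5, "spam"), (15.0, "spam")]
report = threshold_report(scored, ham_max=-2.0, spam_min=12.0)
```

With these made-up numbers two messages are skipped and one spam lands
in the ham pile, which is the kind of contamination I'm worried about.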

> it would at least help in the current situation and with a rule like
> FSL_HELO_HOME when it hits only clear ham and has a high spam-score and when
> it only needs to be enabled, collects the information through scanning and
> submits the results once per day, a lot of people running milter-like setups
> with reject and no access to rejected mails could help to improve the auto-QA
> without collecting whole mails

Potentially. You'd have to be willing to set up a parallel mail processing
stream using the current SVN sandbox rules as I described above.
Performing analysis on the released rules provides no benefit to
masscheck.
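
For the "rules plus flag only" report, a rough Python sketch of pulling
just that information out of an X-Spam-Status header might look like
this. It assumes the usual "Yes/No, score=... required=... tests=..."
layout of that header; the function name, the record layout, and
HYPOTHETICAL_RULE are made up. Note the flag it extracts is SA's own
verdict from the parallel scan, not a human classification.

```python
import re

STATUS_RE = re.compile(
    r"(Yes|No),\s*score=(-?\d+(?:\.\d+)?)\b.*?\btests=(\S*)"
)

def parse_spam_status(value):
    """Return {'flag', 'score', 'rules'} from an X-Spam-Status value,
    or None if the value does not look like the assumed layout."""
    # Unfold continuation lines, then rejoin a test list split after commas.
    unfolded = re.sub(r"\s+", " ", value.strip())
    unfolded = re.sub(r",\s+", ",", unfolded)
    match = STATUS_RE.match(unfolded)
    if match is None:
        return None
    verdict, score, tests = match.groups()
    # Assumption: an empty list is reported as "none".
    rules = [] if tests in ("", "none") else [t for t in tests.split(",") if t]
    return {
        "flag": "spam" if verdict == "Yes" else "ham",
        "score": float(score),
        "rules": rules,
    }

# Invented header value, for illustration only.
header = ("Yes, score=12.3 required=5.0 tests=FSL_HELO_HOME,\n"
          "\tHYPOTHETICAL_RULE autolearn=no")
record = parse_spam_status(header)
```

Nothing from the message body or the other headers ends up in the
record, so only the rule names, score, and flag would leave the site.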

> >   Corpora with headers stripped does present a problem. The masscheck
> >   corpora should be complete as received
>
> and that is not possible - samples are stripped and anonymized


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Maxim IX: Never turn your back on an enemy.
-----------------------------------------------------------------------
 144 days since the first successful real return to launch site (SpaceX)
