On Sat, 14 May 2016, Reindl Harald wrote:

On 14.05.2016 at 19:10 John Hardin wrote:
 On Sat, 14 May 2016, Reindl Harald wrote:

>  On 14.05.2016 at 04:50 John Hardin wrote:
> >  On Sat, 14 May 2016, Reindl Harald wrote:
> > >  On 14.05.2016 at 04:04 John Hardin wrote:
> > > >  How would a webservice be better? That would still be
> > > >  sending customer emails to a third party for processing.
> > >
> > >  uhm you missed "and only give the rules which hit and the
> > >  spam/ham flag out"
> >
> >  Ah, OK, I misunderstood what you were suggesting.
> >
> >  That wouldn't work. That tells you the rules they hit at the time
> >  they were scanned, not which rules they would hit from the
> >  current testing rules.
>
>  on the other hand it would reflect the complete mail-flow and not just
>  hand-crafted samples

 It's not hand *crafted* samples, it's hand *classified* samples. The
 message needs to be classified by a reliable human as ham or spam for
 the analysis of the rules that it hits to have any use, or even be
 possible.

that's just nitpicking - i could correct you the same way in German for most of what you would try to express :-)

Yes, probably.

 That's why doing something like having an SA install that's based on the
 current SVN sandbox rules, and that gets a forked copy of your mail
 stream, and that captures the hits, is still not useful for anything
 other than gross "this rule didn't hit anything" analysis - you don't
 know what a given message *should* have been, so you can't say anything
 about the rules that hit it - whether they aid that result, or hinder it.

how do you imagine such a setup *in practice*?

Somewhat stream-of-consciousness:

In addition to your normal deliver-to-the-user MTA, have another MTA that is running against an SA that is configured from SVN. Note that this wouldn't be a backup MTA; it would have to get a copy of your inbound mail stream. I'm not sure how you'd fork the mail delivery process; that's probably MTA-dependent.
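For illustration only - the thread says this step is MTA-dependent - with Postfix the stock `always_bcc` parameter would copy every message traversing the normal MTA to one extra address, which could point at the shadow masscheck host (the hostname below is a placeholder):

```
# main.cf on the normal delivery MTA (shadow.example.org is hypothetical)
always_bcc = masscheck@shadow.example.org
```

Postfix's `recipient_bcc_maps` would allow restricting the copies to specific recipients, e.g. only users who have consented.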

The masscheck MTA would deliver to SA, record the rule hits and classification in the masscheck upload format, and discard the message.
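As a sketch of that record-and-discard step: the snippet below pulls the verdict, score, and rule hits out of SpamAssassin's X-Spam-Status header and emits a one-line record. The actual mass-check upload format is more detailed than this; the `flag score path tests` layout here is a simplified stand-in, and the header value is a made-up example.

```python
import re

def record_line(header_value, msg_path):
    """Turn an X-Spam-Status header value into a simplified
    'Y|. score path tests' record (assumed layout, not the
    real mass-check format)."""
    is_spam = header_value.startswith("Yes")
    score = float(re.search(r"score=(-?[\d.]+)", header_value).group(1))
    tests_m = re.search(r"tests=([A-Z0-9_,]+)", header_value)
    tests = tests_m.group(1) if tests_m else ""
    flag = "Y" if is_spam else "."
    return f"{flag} {score:g} {msg_path} {tests}"

hdr = "Yes, score=12.3 required=5.0 tests=FSL_HELO_HOME,MISSING_DATE autolearn=no"
print(record_line(hdr, "/var/mail/queue/msg1"))
# -> Y 12.3 /var/mail/queue/msg1 FSL_HELO_HOME,MISSING_DATE
```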

Normal delivery would usually be suspended so that messages queue.

When the masscheck start time is reached, update from SVN, recompile the rules, clear the log and enable MTA delivery. The queued messages would be scanned and recorded until the upload time is reached, at which time delivery is suspended again. This may or may not be long enough to clear the queue.

The results would then be uploaded.

As you noted, there would have to be some minimum score for recording the message as spam, and some maximum score for recording it as ham. Anything in between would have to be discarded as ambiguous. There might also need to be some kind of weighting on the results when they are incorporated into masscheck, to reflect that they are not hand-classified and thus their reliability isn't as good as we'd like; however, there have been misclassifications in hand-classified corpora before, so if the thresholds are well-chosen that may not be an issue.
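A minimal sketch of that thresholding, with made-up cutoffs and weight (these are illustrative values, not project settings):

```python
# Threshold auto-classifier sketch. The 12.0 / -2.0 cutoffs and the
# 0.5 trust weight are assumptions for illustration only.
SPAM_MIN = 12.0    # below this, a local "spam" verdict is not trusted
HAM_MAX = -2.0     # above this, a local "ham" verdict is not trusted
AUTO_WEIGHT = 0.5  # vs. 1.0 for hand-classified corpora

def auto_classify(score):
    """Return ('spam'|'ham', weight), or None for ambiguous scores."""
    if score >= SPAM_MIN:
        return ("spam", AUTO_WEIGHT)
    if score <= HAM_MAX:
        return ("ham", AUTO_WEIGHT)
    return None  # ambiguous: discard rather than pollute the corpus

print(auto_classify(14.2))  # ('spam', 0.5)
print(auto_classify(-4.0))  # ('ham', 0.5)
print(auto_classify(6.9))   # None - spam locally, but not trusted here
```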

But note, this would probably not help offset a high-scoring FP rule, as the message would be auto-classified as spam or, at best, ambiguous - it might actually be self-reinforcing and make the situation worse, rather than help it be self-correcting as hand-classified corpora would. It also probably won't help much with new rules.
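The feedback loop can be seen with a little arithmetic (all scores here are made up for illustration):

```python
# Made-up scores illustrating the self-reinforcement concern.
fp_rule_score = 3.5   # a misfiring high-scored rule hitting a ham
other_hits = 9.0      # other rules this ham message happens to hit
SPAM_MIN = 12.0       # illustrative auto-classification cutoff

total = fp_rule_score + other_hits  # 12.5

# The ham crosses the spam cutoff, so the automated feed labels it
# spam - masscheck then sees the FP rule "correctly" hitting spam,
# which keeps its score high instead of correcting it.
auto_label = "spam" if total >= SPAM_MIN else "ambiguous"
print(total, auto_label)  # 12.5 spam
```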

I don't really think there's any way around having hand-classified clean and complete corpora for running masschecks.

 Unless your mail stream prior to SA is *guaranteed* 100% ham (which is
 hugely unlikely or why would you be running SA at all?) or 100% spam
 (which might be the case for a clean honeypot), you need to review and
 classify the messages manually before performing the scan and reporting
 the rule hits, and that means keeping copies of the pristine messages,
 at least for a while.

 I don't know whether statutory requirements make this impossible for you
 even if you did obtain consent from some of your clients to use their
 mail stream in that manner.

i don't have access to the whole mailflow to classify it nor is there a technical way to mirror it on a different setup

OK

nor would SA or even smtpd ever see 95% of the junk because content filters are the last resort by definition

It's not too difficult for masscheck to get spam, as there are honeypots feeding masscheck. It's harder to get ham, especially non-English ham, so contributing to masscheck from a 99% clean feed is still helpful.

>  should be tied to a minimum negative score to count as ham and a
>  minimum positive score to count as spam - configurable, because it
>  depends on the local environment and adjustments which scores are
>  clear classifications; 7.0 would not be 100% spam here, but 12.0
>  would be, for example

 That's probably still not reliable enough for use in masscheck. Ham is a
 bit more important; what would you recommend as a lower limit for
 considering a message as ham? How many actual hams would meet that
 requirement? It might be a lot of work for little final benefit. What
 percentage of actual FNs would you see with that setting? Those would
 damage the masscheck analysis.

i would agree if we could call the current masscheck results reliable

>  it would at least help in the current situation, and with a rule like
>  FSL_HELO_HOME when it hits only clear ham but has a high spam-score.
>  and when it only needs to be enabled, collects the information through
>  scanning, and submits the results once per day, a lot of people running
>  milter-like setups with reject and no access to rejected mails could
>  help to improve the auto-QA without collecting whole mails

 Potentially. You'd have to be willing to set up a parallel mail
 processing stream using the current SVN sandbox rules as I described
 above. Performing analysis on the released rules provides no benefit to
 masscheck.

why would it provide no benefit, when one part of "sa-update" - which currently doesn't get any updates most of the time - is to re-score badly scored rules? that's really not only about sandbox rules

Because the rules in question may have changed since the last update was released. The analysis needs to be of the current state of the rules in SVN - take a snapshot, masscheck it and generate scores, and those rules and their scores are released as an update if the corpora are large enough for the results to be considered reliable. (Note that "reliability" is based on the *size* of the corpora. We sadly don't have any way to judge it based on broadness of content.)

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  ...much of our country's counterterrorism security spending is not
  designed to protect us from the terrorists, but instead to protect
  our public officials from criticism when another attack occurs.
                                                    -- Bruce Schneier
-----------------------------------------------------------------------
 144 days since the first successful real return to launch site (SpaceX)
