On Sat, 14 May 2016, Reindl Harald wrote:
On 14.05.2016 at 19:10, John Hardin wrote:
On Sat, 14 May 2016, Reindl Harald wrote:
> On 14.05.2016 at 04:50, John Hardin wrote:
> > On Sat, 14 May 2016, Reindl Harald wrote:
> > > On 14.05.2016 at 04:04, John Hardin wrote:
> > > > How would a webservice be better? That would still be
> > > > sending customer emails to a third party for processing.
> > >
> > > uhm you missed "and only give the rules which hit and
> > > spam/ham flag out"
> >
> > Ah, OK, I misunderstood what you were suggesting.
> >
> > That wouldn't work. That tells you the rules they hit at the time
> > they were scanned, not which rules they would hit from the
> > current testing rules.
>
> on the other hand it would reflect the complete mail-flow and not just
> hand-crafted samples
It's not hand *crafted* samples, it's hand *classified* samples. The
message needs to be classified by a reliable human as ham or spam for
the analysis of the rules that it hits to have any use, or even be
possible.
that's just nitpicking - i can correct you the same way in german for most of
what you would try to express :-)
Yes, probably.
That's why doing something like having an SA install that's based on the
current SVN sandbox rules, and that gets a forked copy of your mail
stream, and that captures the hits, is still not useful for anything
other than gross "this rule didn't hit anything" analysis - you don't
know what a given message *should* have been, so you can't say anything
about the rules that hit it - whether they aid that result, or hinder it.
how do you imagine such a setup *in practice*
Somewhat stream-of-consciousness:
In addition to your normal deliver-to-the-user MTA, have another MTA that
is running against an SA that is configured from SVN. Note that this
wouldn't be a backup MTA, it would have to get a copy of your inbound mail
stream. Not sure how you'd fork the mail delivery process, that's probably
MTA-dependent.
The masscheck MTA would deliver to SA, record the rule hits and
classification in the masscheck upload format, and discard the message.
Normal delivery would usually be suspended so that messages queue.
When the masscheck start time is reached, update from SVN, recompile the
rules, clear the log and enable MTA delivery. The queued messages would be
scanned and recorded until the upload time is reached, at which time
delivery is suspended again. This may or may not be long enough to clear
the queue.
The results would then be uploaded.
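A rough sketch of that cycle as a toy Python model. Nothing here talks to a real MTA, to SA, or to SVN; the class and method names are made up for illustration, and the log entries are plain tuples rather than the real masscheck upload format. It only shows the ordering: queue while suspended, clear the log and enable delivery at start time, record hits and discard the message, suspend again at upload time.

```python
from dataclasses import dataclass, field


@dataclass
class MasscheckCycle:
    """Toy model of the hold / scan / upload cycle (hypothetical names)."""
    delivering: bool = False                      # delivery is normally suspended
    queue: list = field(default_factory=list)     # forked copies waiting to be scanned
    log: list = field(default_factory=list)       # recorded rule hits only

    def receive(self, message_id, rule_hits):
        # Copies of inbound mail pile up while delivery is suspended.
        self.queue.append((message_id, rule_hits))

    def start(self):
        # Masscheck start time: the SVN update and rule recompile would
        # happen here; then clear the log and enable delivery.
        self.log.clear()
        self.delivering = True

    def scan_queued(self, limit=None):
        # Record the hits for each queued message, then drop the message.
        if not self.delivering:
            return 0
        n = len(self.queue) if limit is None else min(limit, len(self.queue))
        for _ in range(n):
            message_id, rule_hits = self.queue.pop(0)
            self.log.append((message_id, sorted(rule_hits)))
        return n

    def stop_and_upload(self):
        # Upload time: suspend delivery again and return what would be uploaded.
        self.delivering = False
        return list(self.log)
```

The `limit` argument stands in for the upload time arriving before the queue is empty; whatever is left simply stays queued for the next run.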
As you noted, there would have to be some minimum score for recording the
message as spam, and some maximum score for recording it as ham. Anything
in between would have to be discarded as ambiguous. There might also need
to be some kind of weighting on the results when they are incorporated
into masscheck to reflect that they are not hand-classified and thus their
reliability isn't as good as we'd like, however there have been
misclassifications in hand-classified corpora before so if the thresholds
are well-chosen that may not be an issue.
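The two-threshold idea could look something like the following. The function name and the default values are assumptions for illustration only (12.0 echoes the example figure quoted elsewhere in this thread, -2.0 is arbitrary); real limits would have to be chosen per site.

```python
def auto_classify(score, ham_max=-2.0, spam_min=12.0):
    """Return 'ham', 'spam', or None when the score is ambiguous.

    ham_max and spam_min are hypothetical, site-specific limits.
    """
    if ham_max >= spam_min:
        raise ValueError("ham_max must be below spam_min")
    if score <= ham_max:
        return "ham"
    if score >= spam_min:
        return "spam"
    return None  # in between: discard rather than guess
```

For example, `auto_classify(7.0)` returns `None`, so a message scoring 7.0 would be dropped from the results rather than counted either way.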
But note, this would probably not help offset a high-scoring FP rule as
the message would be auto-classified as spam or, at best, ambiguous - it
might actually be self-reinforcing and make the situation worse, rather
than help it be self-correcting as hand-classified corpora would. It also
probably won't help much with new rules.
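To make the self-reinforcing problem concrete, here is a small made-up example. The rule names, scores and thresholds are all hypothetical; BAD_RULE stands in for a high-scoring rule that actually hits ham.

```python
# Hypothetical scores; BAD_RULE is a stand-in for a high-scoring FP rule.
SCORES = {"BAD_RULE": 13.0, "MINOR_RULE": 0.5}
SPAM_MIN, HAM_MAX = 12.0, -2.0  # hypothetical auto-classification limits


def label(hits):
    """Auto-classify a message purely from the scores of the rules it hit."""
    total = sum(SCORES[rule] for rule in hits)
    if total >= SPAM_MIN:
        return "spam"
    if total <= HAM_MAX:
        return "ham"
    return None


# Three messages a human would call ham, all hit by BAD_RULE.
actual_ham = [["BAD_RULE"], ["BAD_RULE", "MINOR_RULE"], ["BAD_RULE"]]
auto_labels = [label(hits) for hits in actual_ham]
spam_hits = sum(1 for lab in auto_labels if lab == "spam")
```

Every one of those hams ends up recorded as spam, so in the uploaded results BAD_RULE appears to hit only spam, which is exactly the wrong signal for rescoring it.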
I don't really think there's any way around having hand-classified clean
and complete corpora for running masschecks.
Unless your mail stream prior to SA is *guaranteed* 100% ham (which is
hugely unlikely or why would you be running SA at all?) or 100% spam
(which might be the case for a clean honeypot), you need to review and
classify the messages manually before performing the scan and reporting
the rule hits, and that means keeping copies of the pristine messages,
at least for a while.
I don't know whether statutory requirements make this impossible for you
even if you did obtain consent from some of your clients to use their
mail stream in that manner.
i don't have access to the whole mailflow to classify it nor is there a
technical way to mirror it on a different setup
OK
nor would SA or even smtpd ever see 95% of junk because content filters
are the last resort by definition
It's not too difficult for masscheck to get spam, as there are honeypots
feeding masscheck. It's harder to get ham, especially non-English ham, so
contributing to masscheck from a 99% clean feed is still helpful.
> should be chained in a minimum negative score to count as ham and a
> minimum positive to count as spam - configurable because it depends
> on the local environment and adjustments which scores are clear
> classifications, 7.0 would here not be 100% spam, 12.0 would be as
> example
That's probably still not reliable enough for use in masscheck. Ham is a
bit more important; what would you recommend as a lower limit for
considering a message as ham? How many actual hams would meet that
requirement? It might be a lot of work for little final benefit. What
percentage of actual FNs would you see with that setting? Those would
damage the masscheck analysis.
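One way to put numbers on those questions, assuming you have at least a small hand-classified sample to test a candidate ham limit against. The function name and the report keys are invented for this sketch.

```python
def ham_threshold_report(samples, ham_max):
    """Evaluate a candidate ham limit against hand-classified samples.

    samples: iterable of (score, true_label), true_label being 'ham' or 'spam'.
    Returns the share of real ham that qualifies, and the share of
    auto-accepted "ham" that is really spam (FNs leaking in).
    """
    samples = list(samples)
    total_ham = sum(1 for _, truth in samples if truth == "ham")
    accepted = [truth for score, truth in samples if score <= ham_max]
    ham_kept = accepted.count("ham")
    fn_leaked = accepted.count("spam")
    return {
        "ham_coverage": ham_kept / total_ham if total_ham else 0.0,
        "fn_rate": fn_leaked / len(accepted) if accepted else 0.0,
    }
```

A low `ham_coverage` means lots of work for little ham contributed; a non-trivial `fn_rate` means the contributed "ham" would pollute the analysis. Note that this check itself needs hand-classified mail, which is the underlying difficulty.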
i would agree if we could call the current masscheck results reliable
> it would at least help in the current situation and with a rule like
> FSL_HELO_HOME when it hits only clear ham and has a high spam-score
> and when it only needs to be enabled, collects the information through
> scanning and submit the results once per day a lot of people running
> milter like setups with reject and no access to rejected mails could
> help to improve to auto-QA without collecting whole mails
Potentially. You'd have to be willing to set up a parallel mail
processing stream using the current SVN sandbox rules as I described
above. Performing analysis on the released rules provides no benefit to
masscheck.
why would it provide no benefit when one part of the "sa-update" which
currently doesn't get any updates most of the time is to re-score badly scored
rules - that's really not only about sandbox rules
Because the rules in question may have changed since the last update was
released. The analysis needs to be of the current state of the rules in
SVN - take a snapshot, masscheck it and generate scores, and those rules
and their scores are released as an update if the corpora are large enough
for the results to be considered reliable. (Note that "reliability" is
based on the *size* of the corpora. We sadly don't have any way to judge
it based on broadness of content.)
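As a sketch of what a size-only gate amounts to (the function name and the minimum counts are placeholders, not the real masscheck limits):

```python
def corpora_large_enough(ham_count, spam_count, min_ham=1000, min_spam=1000):
    """Size-only reliability gate; min_ham and min_spam are hypothetical."""
    return ham_count >= min_ham and spam_count >= min_spam
```

The check sees only counts, so a thousand near-identical hams from a single source pass it just as easily as a thousand varied ones.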
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
...much of our country's counterterrorism security spending is not
designed to protect us from the terrorists, but instead to protect
our public officials from criticism when another attack occurs.
-- Bruce Schneier
-----------------------------------------------------------------------
144 days since the first successful real return to launch site (SpaceX)