Re: drop of score after update tonight

Reindl Harald Mon, 25 Aug 2014 13:14:12 -0700

first - thank you for your feedback
SA is a new beast to me

On 25.08.2014 at 22:00, Daniel Staal wrote:
> --As of August 25, 2014 7:49:39 PM +0200, Reindl Harald is alleged to have 
> said:
>> On 25.08.2014 at 19:35, Daniel Staal wrote:
>>> --As of August 25, 2014 7:06:32 PM +0200, Reindl Harald is alleged to
>>> have said:
>>>
>>>>> masscheck tries to ensure spams score at least 5 points, but doesn't
>>>>> care beyond that
>>>>
>>>> yes, but the intention is to flag messages above 5 with [SPAM]
>>>> and reject messages above 7, which is the point of running SA
>>>> as a milter, so the reduced score matters
>>>
>>> Who sets that policy?  Is it something you could think about
>>> changing (if it's a problem).
>>
>> ultimately i do - which values to use still needs to be found out, and
>> honestly, seeing that change, i am unsure how to set the score limits
>> for both (flag and reject) so that not too many messages pass through
>> and, at the same time, a large change like this does not introduce
>> false positives from one day to the next
> 
> Based on a quick check of my email, if you consider 'flagged' as non-spam
> (but possible), then I'd probably set flag at 3 or 4, and reject (as spam)
> at 5.  Personally I use a 'probably spam' and 'definitely spam' system
> (both are set aside), with cutoffs at 5 and 10, respectively.


that is indeed something i had in mind as a valid possibility
after looking at what happens with real mail flow on my personal
domain
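
A minimal sketch of the two-cutoff policy discussed above (flag at one score, reject at a higher one). The function name classify_score and its defaults of 5 and 7 are hypothetical, and it assumes a score equal to a cutoff already counts as reaching it:

```bash
#!/usr/bin/bash
# hypothetical helper, not part of SA or spamass-milter:
# map a score to pass / flag / reject using two cutoffs
classify_score() {
 local score="$1" flag_at="${2:-5}" reject_at="${3:-7}"
 # bash has no float arithmetic, so let awk do the comparison
 awk -v s="$score" -v f="$flag_at" -v r="$reject_at" 'BEGIN {
  if (s >= r) print "reject"
  else if (s >= f) print "flag"
  else print "pass"
 }'
}
```

With the defaults, "classify_score 7.5" and "classify_score 5.3" land in different buckets, which is why a drop like that matters for a flag-at-5 / reject-at-7 setup. "classify_score 5.3 5 10" tries the 5/10 cutoffs mentioned in the quote.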

frankly, i have been building config tools for the whole system
over the last few days, until around 5:00 AM each night, and can't
wait to see it doing something, but i still need some postfix
configurations since it will be an inbound-only setup for
multiple targets with and without rcpt-list :-)

currently i am implementing postfix "rcpt verification"
for the domains where there is no access to RCPT databases....
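
A rough sketch of how a config generator could emit the main.cf part for recipient verification. The function name gen_rcpt_verify_cf and the 550 default are assumptions (postfix itself defaults to 450 for unverified recipients), and the fragment verifies all recipients; limiting it to the domains without a rcpt-list is left out here:

```bash
#!/usr/bin/bash
# hypothetical generator: print a main.cf fragment that enables
# recipient address verification for all inbound recipients
gen_rcpt_verify_cf() {
 local reject_code="${1:-550}"
 cat <<EOF
# probe the next hop for unknown recipients and cache the result
address_verify_map = btree:\$data_directory/verify_cache
unverified_recipient_reject_code = ${reject_code}
smtpd_recipient_restrictions =
 permit_mynetworks
 reject_unauth_destination
 reject_unverified_recipient
EOF
}
```

The output is meant to be reviewed and merged by hand, for example "gen_rcpt_verify_cf 450" to keep a temporary reject code while testing.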

> But part of the point is that 7.5 to 5.3 is *not* a large change, as far as
> spamassassin is concerned.  5.1 to 4.9 would be a large change. ;)

i didn't get the joke completely but fine :-)

> I have rarely ever had a false positive with spamassassin - I get maybe
> two-three a year.  I get that in false negatives a day, when things are
> working well.  (Which amounts to about 1% of the spam I get as false
> negative.)

sounds good

>> i admit i don't have that much experience but want to avoid
>> major mistakes in the setup as far as possible before
>> going live
> 
> My advice: Don't over-think it.  Spamassassin normally does a good job,
> with base settings and things turned on.  Train your bayes well, and watch
> for new things, but in general don't try messing with a lot of settings
> unless you have problems with a live mail stream.

agreed - what i am currently trying is to implement a web interface, based
on other existing in-house solutions, to adjust params, fed with
defaults, so that later, if all goes well, anybody can adjust things
without touching the device itself in case some aeroplane
kills me from one day to the next :-)

the bayes is trained well, i think, based on a few test messages which are spam

without bayes: score around 1.4
with bayes: score around 5
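
For repeating such a with/without comparison, a small sketch that pulls the score out of a scanned message. The helper name sa_score is hypothetical, and it assumes the usual "X-Spam-Status: ..., score=N required=M ..." header layout:

```bash
#!/usr/bin/bash
# hypothetical helper: read a scanned message on stdin and print the
# score found in its X-Spam-Status header (first match only)
sa_score() {
 sed -n 's/^X-Spam-Status:.* score=\(-\{0,1\}[0-9.]*\).*/\1/p' | head -n 1
}
```

Something like "spamassassin -t < sample.eml | sa_score" should then print the score, and adding --cf='use_bayes 0' to the spamassassin call should give the value without bayes; sample.eml is a placeholder.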

with the milter there is only one user home, so i fed
the folders below with 1000 messages, ham as well as spam

[root@mail-gw:~]$ cat /scripts/sa-learn.sh
#!/usr/bin/bash
chown root:sa-milt -R /var/lib/spamass-milter/training/ham/
chown root:sa-milt -R /var/lib/spamass-milter/training/spam/
chmod 750 /var/lib/spamass-milter/training/ham/
chmod 750 /var/lib/spamass-milter/training/spam/
chmod 640 /var/lib/spamass-milter/training/ham/*.eml
chmod 640 /var/lib/spamass-milter/training/spam/*.eml
fdupes -r -f /var/lib/spamass-milter/training/ham/ | grep -v '^$' | xargs rm -v 2> /dev/null
fdupes -r -f /var/lib/spamass-milter/training/spam/ | grep -v '^$' | xargs rm -v 2> /dev/null
/usr/bin/su -c "/var/lib/spamass-milter/training/learn.sh" sa-milt

[root@mail-gw:~]$ cat /var/lib/spamass-milter/training/learn.sh
#!/usr/bin/bash
# only run as the milter user so the bayes data ends up in its home
akt_user=$(whoami)
if test "$akt_user" != "sa-milt"
then
 /bin/echo "The script 'learn.sh' must be run as user 'sa-milt'" >&2
 exit 1
fi
MY_TIME=$(/bin/date "+%d-%m-%Y %H:%M:%S")
echo "$MY_TIME: Processing SPAM samples"
/usr/bin/sa-learn --progress --spam /var/lib/spamass-milter/training/spam/*.eml
echo ""
MY_TIME=$(/bin/date "+%d-%m-%Y %H:%M:%S")
echo "$MY_TIME: Processing HAM samples"
/usr/bin/sa-learn --progress --ham /var/lib/spamass-milter/training/ham/*.eml
echo ""
MY_TIME=$(/bin/date "+%d-%m-%Y %H:%M:%S")
echo "$MY_TIME: Done"
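
To check afterwards that the training actually landed in the database, a sketch that reads the output of "sa-learn --dump magic". The helper name bayes_ready is hypothetical; the default of 200 mirrors SA's default minimum of learned ham and spam before bayes is used:

```bash
#!/usr/bin/bash
# hypothetical helper: read "sa-learn --dump magic" output on stdin,
# print both counters and succeed only if each reaches the minimum
bayes_ready() {
 awk -v min="${1:-200}" '
  /non-token data: nspam$/ { nspam = $3 }
  /non-token data: nham$/  { nham = $3 }
  END {
   printf "nspam=%d nham=%d\n", nspam, nham
   # exit status 0 when both counters are high enough, 1 otherwise
   exit !(nspam >= min && nham >= min)
  }'
}
```

Run as the sa-milt user, for example "sa-learn --dump magic | bayes_ready", so that the same bayes database is read that learn.sh writes to.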

>>> Did the percentage of spam flagged vs. rejected change overall?
>>
>> i am at early testing of SA and there is no active mail flow
>> since i am about to finish the admin backends and the generation of
>> config files for SA/ClamAV/Postfix, which is now at a nearly
>> "well, good enough for my private domain as a public test" stage
>>
>>> Every time the rules update some rules will be scored higher and
>>> some lower, so figuring out each individual case is going to be
>>> pointless, but if the overall percentages remain stable your system
>>> hasn't actually changed how it operates
>>
>> as said - i am about to implement SA, saw the message from the
>> update cronjob for the first time in some days and looked a
>> bit deeper at whether things changed
> 
> And I think you ended up over-thinking it.  It was marked as spam before,
> it's marked as spam now.  Some other emails would probably have scored
> higher than they used to.  We've actually had a long break in updates -
> usually they are multiple times a week, if not every day, but it's been
> around a month since they last updated.  Rules probably changed scores
> more than normal - but it still scored the mail as spam

i will see, and i also think that even if there are bad impacts there
will be another update soon
