On 03/09/2017 03:11 PM, mar...@mejor.pl wrote:
On 09.03.2017 at 15:05, mar...@mejor.pl wrote:
On 09.03.2017 at 14:42, Axb wrote:
On 03/09/2017 02:31 PM, mar...@mejor.pl wrote:
On 08.03.2017 at 17:30, Axb wrote:
On 03/08/2017 04:55 PM, mar...@mejor.pl wrote:
On 08.03.2017 at 16:33, Axb wrote:
On 03/08/2017 04:16 PM, mar...@mejor.pl wrote:
On 08.03.2017 at 16:06, Axb wrote:
On 03/08/2017 03:58 PM, mar...@mejor.pl wrote:
On 08.03.2017 at 15:27, Axb wrote:
As your command below shows, you're using --reqpatlength 0

Start off with something sane, for example --reqpatlength 40

you may also want to play with --maxtextread
( I use --maxtextread 8192  for FRAUD rules)

But with --reqpatlength 10, 40, 100 or 1000 I've got no hits. Reading
the help
( "--reqpatlength: required pattern length, in characters
(default: 0)"
) I understand that the pattern in a generated rule will be at least
reqpatlength characters long (shorter strings will be ignored). Am I
assuming correctly how the parameter works?

--reqpatlength 40 tells seek-phrases-in-log to ignore any phrases which
are shorter than 40 chars
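
A minimal Python sketch of that behaviour, for illustration only. It is not taken from seek-phrases-in-log; the function name `filter_phrases` and the sample phrases are assumptions, and I am assuming a phrase of exactly the required length is kept.

```python
def filter_phrases(phrases, reqpatlength=0):
    """Keep only phrases that are at least reqpatlength characters long.

    With the default of 0, nothing is filtered out.
    """
    return [p for p in phrases if len(p) >= reqpatlength]


# Hypothetical candidate phrases; the last two mirror rules shown in this thread.
candidates = [
    "Dear friend",
    "It has come to our attention that you ",
    " in order to confirm your disbursement.",
]

kept = filter_phrases(candidates, reqpatlength=37)
```

With a threshold of 37 the short greeting is dropped and the two longer phrases survive, which is the effect one would expect from a non-zero --reqpatlength.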

just checked my command line, which is using
 --reqpatlength 37

With any value > 0, no rule is generated.

body __AXB_FRAUD_LAF076  /It has come to our attention that you /
body __AXB_FRAUD_UPVTRT  / in order to confirm your disbursement\./
body __AXB_FRAUD_NOFUX2  / approval, your funds will be deposited directly into your /
body __AXB_FRAUD_Z4ZZ7D  / in order to accept your disbursement\./
body __AXB_FRAUD_CUXJ6X  / approval, your funds will be direct deposited into your /
body __AXB_FRAUD_NHWXKL  /: You Are Eligible to Receive Funds up to \$.,000\. /

hard to guess what is not working on your side without full insight

What can I do to help more? Should I share all_w.h and all_w.s files?

before we go that way pls answer these questions

how many spams/hams are you processing?

ham: ~1400
spam: ~8200

do you have a file named assemble.state ? if yes, how large?

Yes, I've got this file; it is ~9 MB.

and pls zip & send me the full script you're using to generate the
rules, OFFLIST! do NOT post to list

Ok, I'll choose tar.bz2 ;)
Thanks for the help.

replying on list as much as I can so it's archived FTR

first thing I see is that your logs do not contain a list of rules which
hit on each message.

for example my "w.s" file has lines which look like:

 53 /home/mc/Maildir/cur/1487823401.M695422P29583.ruler,S=7602,W=7747:2,
ADVANCE_FEE_2_NEW_MONEY,ADVANCE_FEE_3_NEW,ADVANCE_FEE_3_NEW_MONEY,ADVANCE_FEE_4_NEW,ADVANCE_FEE_4_NEW_MONEY,ADVANCE_FEE_5_NEW,ADVANCE_FEE_5_NEW_MONEY,AXB_XM2600,AXB_XMAILER_MIMEOLE_OL_024C2,CM_XRCVD_VOOZER4,DEAR_WINNER,FORGED_MUA_OUTLOOK,FROM_MISSPACED,FROM_MISSP_MSFT,FROM_MISSP_REPLYTO,FROM_MISSP_URI,FSL_419_FP1,FSL_CTYPE_WIN1251,FSL_MISSP_REPLYTO,FSL_NEW_HELO_USER,FSL_RCVD_USER,FSL_UA,FSL_XM_419,HK_NAME_MR_MRS,LOTS_OF_MONEY,LOTTO_DEPT,MONEY_FRAUD_3,MONEY_FRAUD_5,MONEY_FROM_MISSP,MSOE_MID_WRONG_CASE,NSL_RCVD_HELO_USER,TO_NO_BRKTS_FROM_MSSP,T_AXB_XM2600,T_BIG_HEADERS_5K,T_CM_XRCVD_VOOZER4,T_FSL_FREEMAIL_1,T_FSL_HELO_NON_FQDN_2,T_HK_MUCHMONEY,T_LOTTO_AGENT,T_SINGLE_HEADER_1K,T_TO_NO_BRKTS_MSFT,__419_FROM_SIG,__ADVANCE_FEE_2_NEW,__ADVANCE_FEE_2_NEW_MONEY,__ADVANCE_FEE_3_NEW,__ADVANCE_FEE_3_NEW_MONEY,__ADVANCE_FEE_4_NEW,__ADVANCE_FEE_4_NEW_MONEY,__ADVANCE_FEE_5_NEW,__ADVANCE_FEE_5_NEW_MONEY,__AFF_LOTTERY,__ANY_OUTLOOK_MUA,__ANY_TEXT_ATTACH,__ANY_TEXT_ATTACH_DOC,__AXB_MO_OL_024C2,__AXB_MO_OL_D8ACC,__AXB_XM_OL_024C2,__AXB_XM_OL_080C4,__AXB_XM_OL_424A6,__AXB_XM_OL_B9D6C,__BOUNCE_RPATH_NULL,__CONGRADULAT,__CT,__CTE,__CTYPE_CHARSET_QUOTED,__CT_TEXT_PLAIN,__DOS_HAS_ANY_URI,__DOS_RCVD_THU,__DOS_RCVD_WED,__DOS_RELAYED_EXT,__FB_CONGRADS,__FH_HAS_XMSMAIL,__FH_HAS_XPRIORITY,__FORGED_OE,__FRAUD_DBI,__FRAUD_FCW,__FROM_FULL_NAME,__FROM_MISSPACED,__FROM_MISSP_REPLYTO,__FROM_MISSP_URI,__FROM_RUNON,__FSL_419_1,__FSL_419_2,__FSL_419_3,__FSL_419_4,__FSL_419_5,__FSL_HELO_USER_1,__FSL_HELO_USER_3,__FSL_UA_2,__HAS_ANY_EMAIL,__HAS_ANY_URI,__HAS_DATE,__HAS_FROM,__HAS_MESSAGE_ID,__HAS_MIMEOLE,__HAS_MSGID,__HAS_MSMAIL_PRI,__HAS_RCVD,__HAS_REPLY_TO,__HAS_SUBJECT,__HAS_TO,__HAS_URI,__HAS_XMAIL,__HAS_X_MAILER,__HK_NAME_MR_MRS,__LAST_EXTERNAL_RELAY_NO_AUTH,__LAST_UNTRUSTED_RELAY_NO_AUTH,__LOTSA_MONEY_04,__LOTTO_ADMITS,__LOTTO_ADMITS_1,__LOTTO_WIN_01,__MIMEOLE_MS,__MIME_VERSION,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__MONEY_FRAUD,__MONEY_FRAUD_3,__MONEY_FRAUD_5,__MONEY_LOTTERY,__MSGID_OK_DIGITS,__MSOE_MID_WRONG_CASE,__M_NOTIFIC,__NAKED_TO,__NONEMPTY_BODY,__NO_INR_YES_REF,__OE_MUA,__RCVD_VIA_APNIC_E,__RCVD_VIA_ARIN_E,__RCVD_VIA_RIPE,__RCVD_VIA_RIPE_E,__RDNS_SHORT,__REPLYTO_EXISTS,__REPLY_FREEMAIL,__SANE_MSGID,__SARE_FRAUD_BARRISTER,__SINGLE_HEADER_1K,__SUBJ_2UPPER,__SUBJ_4LOWER,__SUBJ_HAS_WORDS,__SUBJ_NOT_SHORT,__TOCC_EXISTS,__TO_NO_ARROWS_R,__TO_NO_BRKTS_FROM_MSSP,__TO_NO_BRKTS_FROM_RUNON,__TO_NO_BRKTS_MSFT,__TO_NO_BRKTS_NOTLIST,__TVD_BODY,__TVD_MIME_ATT_TP,__URI_MAILTO,__XM_MSOE6,__XM_MS_IN_GENERAL,__XM_OUTLOOK_EXPRESS,__XPRIO,__YOU_WON,__YOU_WON_01,__YOU_WON_02,__YOU_WON_SOMTIN,__hk_million,__hk_win_1,__hk_win_5,__hk_win_6,__hk_win_b

time=0,scantime=0,format=f,reuse=no,set=0
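
To check quickly whether a log carries rule hits at all, something like the following Python sketch could help. The field layout is an assumption based on the example above (optional Y/N flag, score, message path, comma-separated rule names, then key=value metadata, all separated by whitespace); the function name `parse_log_line` and the shortened sample line are made up for illustration.

```python
def parse_log_line(line):
    """Split one mass-check style log line into its fields.

    Assumed layout, whitespace separated: optional Y/N flag, score,
    message path, comma-separated rule names, key=value metadata.
    """
    fields = line.split()
    # The Y/N flag is optional in this sketch.
    flag = fields.pop(0) if fields and fields[0] in ("Y", "N") else None
    score = float(fields.pop(0))
    # The path is taken as-is; Maildir names contain commas and '='.
    path = fields.pop(0)
    rules, meta = [], {}
    for field in fields:
        if "=" in field:
            # Metadata such as time=0,scantime=0,...
            meta.update(kv.split("=", 1) for kv in field.split(",") if "=" in kv)
        else:
            # Rule names never contain '=', so this is the rule list.
            rules.extend(r for r in field.split(",") if r)
    return {"flag": flag, "score": score, "path": path,
            "rules": rules, "meta": meta}


# Shortened, hypothetical sample line in the same shape as the one above.
sample = ("Y 53 /home/mc/Maildir/cur/msg,S=7602,W=7747:2, "
          "DEAR_WINNER,LOTS_OF_MONEY "
          "time=0,scantime=0,format=f,reuse=no,set=0")
parsed = parse_log_line(sample)
```

If `parsed["rules"]` comes back empty for every line of a log, the mass-check run did not record any rule hits.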

so apparently your masschecker is not seeing rules.

I don't use --cache & --cachedir (don't remember why) - for starters,
maybe remove them

I started without cache.

I have  --cf='use_bayes 0' (speeds up processing) and make sure you use
  --cf='required_score 5'

you'll have to play with your setup till your logs show SA rule hits.

There are no SA rules because the parameter "-C=/dev/null" is set.

I don't understand something. Why do I need to check
mails-that-i-classified-as-spam-or-ham against rules? If I understand
how creating auto rules works, masscheck only dumps strings from ham and
spam.

the routine is supposed to create rules based on msgs in your spam
folder and needs the ham folder as a counterweight against potential FPs
so that, for example, you don't start producing rules based on phrases in
disclaimers.
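
The counterweight idea can be sketched in Python as below. This is only an illustration of the principle, not how seek-phrases-in-log does it; the function name `spam_only_phrases`, the word n-gram approach and the sample messages are all assumptions.

```python
from collections import Counter


def spam_only_phrases(spam_msgs, ham_msgs, n_words=4):
    """Count word n-grams that occur in spam but never in ham."""

    def ngrams(text):
        words = text.split()
        return {" ".join(words[i:i + n_words])
                for i in range(len(words) - n_words + 1)}

    # Everything seen in ham is off limits, e.g. disclaimer boilerplate.
    ham_seen = set()
    for msg in ham_msgs:
        ham_seen |= ngrams(msg)

    # Count in how many spam messages each remaining phrase appears.
    counts = Counter()
    for msg in spam_msgs:
        counts.update(ngrams(msg) - ham_seen)
    return counts


# Hypothetical corpora: the disclaimer appears in both spam and ham.
spam = ["your funds will be deposited today",
        "This email is confidential and privileged"]
ham = ["This email is confidential and privileged thanks"]
candidates = spam_only_phrases(spam, ham)
```

The disclaimer phrases drop out because they also appear in ham, so only the fraud wording is left as rule material.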

in the log, each line starts with Y/N and a score - not sure how
necessary it is, I've always had it that way and it "works for me"

And then seek-phrases-in-log should create rules from the strings found.
I'm using the script from svn with some changes to paths. So I assumed
that it should be more or less working :)

a wise man once said: "to assume is not to know"
why not try avoiding modifications till you get some useful results and
then start doing mods, one at a time.

I only modified the "run" script; the other Perl scripts are untouched.

Btw, I removed -C=/dev/null; rule hits are now in the logs, but
seek-phrases-in-log still returns no rules if I set --reqpatlength to a
non-zero value.

I have no idea.
I'll send you a modified seek-phrases-in-log (offlist) for you to try...

I've got two pieces of news, bad and good.
The good news is that your version of the script works!
The bad news is that the script in the official repo doesn't work.
bugzilla?

I see what is going on. The variable maxreqpatlength isn't initialized
in the original script...
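
For the archive, here is a hypothetical Python sketch of how an unset upper bound could give exactly this symptom. It is not the code from seek-phrases-in-log, and the function name and the shape of the check are assumptions. It relies only on the fact that Perl treats an unassigned scalar as 0 in numeric comparisons, and on the observation in this thread that rules appear only with --reqpatlength 0.

```python
UNSET = 0  # stand-in for a Perl scalar that was never assigned (numifies to 0)


def phrase_is_kept(phrase, reqpatlength, maxreqpatlength=UNSET):
    """Hypothetical length check, NOT the real script's logic."""
    if reqpatlength <= 0:
        return True  # length filtering disabled, everything passes
    # With maxreqpatlength left at 0, no non-empty phrase can satisfy this.
    return reqpatlength <= len(phrase) <= maxreqpatlength


phrase = " in order to confirm your disbursement."
kept_without_filter = phrase_is_kept(phrase, 0)
kept_with_unset_max = phrase_is_kept(phrase, 37)
kept_with_set_max = phrase_is_kept(phrase, 37, maxreqpatlength=200)
```

In this sketch a non-zero minimum combined with an upper bound stuck at 0 rejects every phrase, while giving the upper bound a real value lets the 39-character phrase through.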



Pls open a bug to track the changes for the future.

And I've got good news :)
I'll rename the one we now have in SVN and commit my working version as a replacement.

Thx
