On 06/30/2011 01:45 PM, J4K wrote:
> On 06/30/2011 11:37 AM, J4K wrote:
>> On 06/30/2011 11:09 AM, J4K wrote:
>>> On 06/29/2011 09:55 PM, Lawrence @ Rogers wrote:
>>>> On 29/06/2011 4:58 PM, JKL wrote:
>>>>> select count(spam_count) from bayes_vars
>>>> Run this query
>>>>
>>>> SELECT username,spam_count,ham_count FROM bayes_vars
>>>>
>>>> This will give a list of usernames that have been used to learn ham
>>>> and spam into SpamAssassin's Bayes MySQL DB. For a site-wide
>>>> installation, this should only return one result.
>>>>
>>>> To answer your previous question, I meant to simply add the
>>>> bayes_sql_override_username setting to your local.cf and restart
>>>> spamassassin
>>>>
>>>> If you are using Postfix with the postfix username, set it as
>>>>
>>>> bayes_sql_override_username postfix
>>>>
>>>> This ensures that all future e-mails are labeled as being learned from
>>>> the postfix user, regardless of whether you did it manually using
>>>> sa-learn via ssh or another interface, or auto-learning is used. For
>>>> one site-wide Bayes installation, this is what you want.
>>>>
>>>> Regards,
>>>> Lawrence
>>>>
>>> Hi there,
>>>
>>>
>>> This is the table I have in mysql, and the one I intend to populate with
>>> data:-
>>>
>>> mysql> describe bayes_vars;
>>> +--------------------+--------------+------+-----+------------+----------------+
>>> | Field              | Type         | Null | Key | Default    | Extra          |
>>> +--------------------+--------------+------+-----+------------+----------------+
>>> | id                 | int(11)      | NO   | PRI | NULL       | auto_increment |
>>> | username           | varchar(200) | NO   | UNI |            |                |
>>> | spam_count         | int(11)      | NO   |     | 0          |                |
>>> | ham_count          | int(11)      | NO   |     | 0          |                |
>>> | token_count        | int(11)      | NO   |     | 0          |                |
>>> | last_expire        | int(11)      | NO   |     | 0          |                |
>>> | last_atime_delta   | int(11)      | NO   |     | 0          |                |
>>> | last_expire_reduce | int(11)      | NO   |     | 0          |                |
>>> | oldest_token_age   | int(11)      | NO   |     | 2147483647 |                |
>>> | newest_token_age   | int(11)      | NO   |     | 0          |                |
>>> +--------------------+--------------+------+-----+------------+----------------+
>>> 10 rows in set (0.00 sec)
>>>
>>>
>>> The configuration I intend to use for Bayes is:
>>>
>>> -------------------- START local.cf -------------------------------
>>> rewrite_header Subject *****SPAM*****
>>> report_safe 0
>>> report_hostname xxx.xxx.com
>>> dns_available yes
>>> use_dcc 1
>>> dcc_path /usr/local/bin/dccproc
>>> dcc_home /var/dcc
>>> use_pyzor 1
>>> pyzor_path /usr/bin/pyzor
>>> pyzor_timeout 5
>>> use_razor2 1
>>> razor_config /etc/razor/razor-agent.conf
>>> razor_timeout 5
>>>
>>> required_score 6.0
>>>
>>> use_bayes 1
>>> skip_rbl_checks 1
>>> bayes_auto_learn 0
>>> # bayes_auto_learn_threshold_nonspam 0.1
>>> # bayes_auto_learn_threshold_spam 13.0
>>>
>>> bayes_expiry_max_db_size 300000
>>> bayes_auto_expire 1
>>>
>>> bayes_sql_override_username postfix
>>> # I don't understand what this setting does, nor why it's postfix.
>>> Postfix has no interaction with SA in my set-up as postfix pipes the
>>> mail into dovecot, and dovecot handles the spamc portion before filing
>>> the email.
>>>
>>> |bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
>>> bayes_sql_dsn DBI:mysql:spamassassin:localhost
>>> bayes_sql_username |shamster_user
>>> |bayes_sql_password shamster||_password|
>>>
>>> ifplugin Mail::SpamAssassin::Plugin::Shortcircuit
>>> shortcircuit USER_IN_WHITELIST on
>>> shortcircuit SUBJECT_IN_WHITELIST on
>>> shortcircuit USER_IN_BLACKLIST on
>>> shortcircuit SUBJECT_IN_BLACKLIST on
>>>
>>> loadplugin Mail::SpamAssassin::Plugin::Rule2XSBody
>>> endif
>>>
>>> score RDNS_DYNAMIC 2.639 0.363 1.663 1.700
>>> meta __PILL_PRICE_1 (0)
>>> meta __PILL_PRICE_2 (0)
>>> meta __PILL_PRICE_3 (0)
>>> -------------------- END local.cf -------------------------------
>>>
>>> N.B. Yes, I know there are some custom rules in the local.cf and these'll
>>> be lost after an upgrade of SA, but I have reasonable backups.
>>>
>>> * Questions
>>> Does the configuration above look correct?
>>> Will SA only write into the table bayes_vars, or will it touch other tables?
>> Seems that some process butchered part of the config by discovering some
>> pipe characters.
>>
>> |bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
>> bayes_sql_dsn DBI:mysql:spamassassin:localhost
>> bayes_sql_username |shamster_user
>> |bayes_sql_password shamster||_password|
>>
>> Above should have read:
>> bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
>> bayes_sql_dsn DBI:mysql:spamassassin:localhost
>> bayes_sql_username sa_user
>> bayes_sql_password sa_user_password
>>
>> Other question: If the above looks correct, is there something else that I
>> ought to enable? e.g. plugins for mysql, or a particular perl module
>> that I might have omitted?
>>
>> Regards, S.
> Regarding local.cf
>
> Should the password be quoted such as in single quotes?
>
> The password has many strange chars in it e.g
> bayes_sql_password fg$%-)_()(Wsuisrt{^%TEST

RTFM problem... Apologies.

Jun 30 16:10:11.628 [2220] dbg: bayes: found bayes db version 3
Jun 30 16:10:11.628 [2220] dbg: bayes: Using userid: 186
Jun 30 16:10:11.628 [2220] dbg: bayes: not available for scanning, only 0 spam(s) in bayes DB < 200

Solved by feeding one piece of spam to init the database:

sa-learn --spam gtube.txt

However, I added some messages, but the detail from --dump magic shows nothing:

# sa-learn --ham cur/
Learned tokens from 25 message(s) (26 message(s) examined)
# sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0          0          0  non-token data: nspam
0.000          0          0          0  non-token data: nham
0.000          0          0          0  non-token data: ntokens
0.000          0 2147483647          0  non-token data: oldest atime
0.000          0          0          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0          0          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count

I checked if the postfix entry was created in bayes_vars;

| postfix                       |          0 |         0 |
+-------------------------------+------------+-----------+

Does this look correct?
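
In case it helps anyone checking the same thing by script, here is a rough Python sketch that reads the `sa-learn --dump magic` output pasted above and reports whether the counters have reached the 200 threshold from the debug log. The column layout is only inferred from the output shown in this mail, the function names (parse_dump_magic, bayes_ready) are made up for illustration, and applying the same minimum of 200 to ham is an assumption, since the log above only mentions spam.

```python
import re

# Assumed layout, inferred from the output pasted in this mail:
#   <prob> <number> <value> <number>  non-token data: <name>
MAGIC_LINE = re.compile(
    r"^\s*\S+\s+\d+\s+(\d+)\s+\d+\s+non-token data:\s+(.+?)\s*$"
)

# Threshold taken from the debug line "only 0 spam(s) in bayes DB < 200".
# Using the same figure for ham is an assumption.
MIN_MESSAGES = 200

# A few lines copied from the dump above, used as sample input.
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0          0          0  non-token data: nspam
0.000          0          0          0  non-token data: nham
0.000          0 2147483647          0  non-token data: oldest atime
"""


def parse_dump_magic(text):
    """Return {counter name: integer value} for each non-token data line."""
    values = {}
    for line in text.splitlines():
        match = MAGIC_LINE.match(line)
        if match:
            values[match.group(2)] = int(match.group(1))
    return values


def bayes_ready(values, minimum=MIN_MESSAGES):
    """True only if both nspam and nham have reached the minimum."""
    return (values.get("nspam", 0) >= minimum
            and values.get("nham", 0) >= minimum)


if __name__ == "__main__":
    counters = parse_dump_magic(SAMPLE)
    print(counters)
    print("usable for scanning:", bayes_ready(counters))
```

To use it against a live system, the text passed to parse_dump_magic would be the captured output of `sa-learn --dump magic` rather than SAMPLE.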