On 2022-05-06 6:56 am, Henrik K wrote: 

> On Fri, May 06, 2022 at 12:31:47PM +0200, giovanni@paclan.itwrote:
> On 5/6/22 11:08, Niels Kobschätzki wrote: Hi, I have a setup where the 
> spamassassin-servers have actually no access to the data of the mail-servers. 
> Now I was looking into having per user bayes-databases and saw that I can do 
> that with a SQL-database. I have already a small galera-cluster and I wonder 
> if spamassassin will work with it because of the limitations galera has. The 
> limitations are: * only innodb * unsupported explicit locking * a primary key 
> on all tables is necessary * no XA transactions * no reliance on 
> auto-increment Does anyone have experience with such a setup? Few things to 
> consider: bayes_expire has no primary key.

>From what I see, there's no reason why it shouldn't be.

CREATE TABLE bayes_expire (
 id int(11) NOT NULL default '0',
 runtime int(11) NOT NULL default '0',
 KEY bayes_expire_idx1 (id)
) ENGINE=InnoDB;

BayesStore/MySQL.pm has kind of a dumb insert which might insert things
multiple times

 my $sql = "INSERT INTO bayes_expire (id,runtime) VALUES (?,?)";

It should just be converted to UPSERT.

Of course this won't help until 4.0.0 is released..

> bayes_vars MySQL table has the id defined as "id int(11) NOT NULL 
> AUTO_INCREMENT".

Google implies Galera supports auto_increment just fine, it just does
something funny like incrementing them in 3 multiples or something.

It works fine with Galera - been running that for years. This is from my
ansible spamassassin role 

> 267 # http://blog.secaserver.com/2013/10/converting-data-work-galera-cluster/
> 268 - name: Set bayes_expire key to PRIMARY so galera replication works
> 269 lineinfile:
> 270 path: /usr/share/doc/spamassassin/sql/bayes_mysql.sql
> 271 regexp: 'KEY (bayes_expire_idx1.*)'
> 272 line: 'PRIMARY KEY 1'
> 273 backrefs: yes
> 274 state: present
> 275
> 276 # NOTE: As of 3.4.3 see UPGRADE file - says to add last_hit field to awl 
> table
> 277 # exactly as we're doing here
> 278 # lastupdate timestamp default CURRENT_TIMESTAMP ON UPDATE 
> CURRENT_TIMESTAMP
> 279 # http://www200.pair.com/mecham/spam/debian-spamassassin-sql.html
> 280 - name: Add last_hit field to bayes_seen and awl tables
> 281 lineinfile:
> 282 path: "/usr/share/doc/spamassassin/sql/{{ item.file }}"
> 283 insertbefore: '.*({{ item.before }}).*'
> 284 line: ' last_hit timestamp NOT NULL default CURRENT_TIMESTAMP ON UPDATE 
> CURRENT_TIMESTAMP,'
> 285 state: present
> 286 with_items:
> 287 - { before: 'id,msgid', file: bayes_mysql.sql }
> 288 - { before: 'username,email,signedby,ip', file: awl_mysql.sql }
> 289
> 290 # Only run the DB creation tasks on a single host
> 291 # This depends on mysql/galera repliction being active to propagate
> 292 # the DB across to the other nodes
> 293 - when: inventory_hostname == groups.testmail[0] or play_hosts | length 
> == 1
> 294 block:
> 295 #
> 296 - name: Create Spamassassin database
> 297 mysql_db:
> 298 name: spamassassin
> 299 state: present
> 300 register: spamassassin_database

Just setting the bayes_expire key to PRIMARY was all that was needed. 

Reply via email to