On 2022-05-06 6:56 am, Henrik K wrote:
> On Fri, May 06, 2022 at 12:31:47PM +0200, giovanni@paclan.itwrote: > On 5/6/22 11:08, Niels Kobschätzki wrote: Hi, I have a setup where the > spamassassin-servers have actually no access to the data of the mail-servers. > Now I was looking into having per user bayes-databases and saw that I can do > that with a SQL-database. I have already a small galera-cluster and I wonder > if spamassassin will work with it because of the limitations galera has. The > limitations are: * only innodb * unsupported explicit locking * a primary key > on all tables is necessary * no XA transactions * no reliance on > auto-increment Does anyone have experience with such a setup? Few things to > consider: bayes_expire has no primary key. >From what I see, there's no reason why it shouldn't be. CREATE TABLE bayes_expire ( id int(11) NOT NULL default '0', runtime int(11) NOT NULL default '0', KEY bayes_expire_idx1 (id) ) ENGINE=InnoDB; BayesStore/MySQL.pm has kind of a dumb insert which might insert things multiple times my $sql = "INSERT INTO bayes_expire (id,runtime) VALUES (?,?)"; It should just be converted to UPSERT. Of course this won't help until 4.0.0 is released.. > bayes_vars MySQL table has the id defined as "id int(11) NOT NULL > AUTO_INCREMENT". Google implies Galera supports auto_increment just fine, it just does something funny like incrementing them in 3 multiples or something. It works fine with Galera - been running that for years. This is from my ansible spamassassin role > 267 # http://blog.secaserver.com/2013/10/converting-data-work-galera-cluster/ > 268 - name: Set bayes_expire key to PRIMARY so galera replication works > 269 lineinfile: > 270 path: /usr/share/doc/spamassassin/sql/bayes_mysql.sql > 271 regexp: 'KEY (bayes_expire_idx1.*)' > 272 line: 'PRIMARY KEY 1' > 273 backrefs: yes > 274 state: present > 275 > 276 # NOTE: As of 3.4.3 see UPGRADE file - says to add last_hit field to awl > table > 277 # exactly as we're doing here > 278 # lastupdate timestamp default CURRENT_TIMESTAMP ON UPDATE > CURRENT_TIMESTAMP > 279 # http://www200.pair.com/mecham/spam/debian-spamassassin-sql.html > 280 - name: Add last_hit field to bayes_seen and awl tables > 281 lineinfile: > 282 path: "/usr/share/doc/spamassassin/sql/{{ item.file }}" > 283 insertbefore: '.*({{ item.before }}).*' > 284 line: ' last_hit timestamp NOT NULL default CURRENT_TIMESTAMP ON UPDATE > CURRENT_TIMESTAMP,' > 285 state: present > 286 with_items: > 287 - { before: 'id,msgid', file: bayes_mysql.sql } > 288 - { before: 'username,email,signedby,ip', file: awl_mysql.sql } > 289 > 290 # Only run the DB creation tasks on a single host > 291 # This depends on mysql/galera repliction being active to propagate > 292 # the DB across to the other nodes > 293 - when: inventory_hostname == groups.testmail[0] or play_hosts | length > == 1 > 294 block: > 295 # > 296 - name: Create Spamassassin database > 297 mysql_db: > 298 name: spamassassin > 299 state: present > 300 register: spamassassin_database Just setting the bayes_expire key to PRIMARY was all that was needed.