Updates:

1) I've managed to exit the backup firewall

2) Adding -d -vvv didn't print any more log_debug output concerning our
fatal() on table stats. It did print other log_debug messages, but none
from the shutdown() path.

It exited again, printing only:

pfe: check_table: cannot get table stats for dir2-lmtp@relayd/dir2-lmtp: No such file or directory

hce_notify_done: db1 (script ok)
hce_notify_done: db2 (script ok)
hce_notify_done: db3 (script ok)
pfe_statistics: table: ldap, up: 2 id: 1
pfe_statistics: table: mail, up: 2 id: 2
pfe_statistics: table: mx-smtps, up: 2 id: 3
pfe_statistics: table: mx-subm, up: 2 id: 4
pfe_statistics: table: dir-imap, up: 2 id: 5
pfe_statistics: table: dir-pop, up: 2 id: 6
pfe_statistics: table: dir-lmtp, up: 2 id: 7
pfe_statistics: table: dir-sieve, up: 2 id: 8
pfe_statistics: table: imap-smtp, up: 2 id: 9
pfe_statistics: table: sql, up: 3 id: 10
pfe_statistics: table: radius, up: 2 id: 11
pfe_statistics: table: radacct, up: 2 id: 12
pfe_statistics: table: dir2-imap, up: 0 id: 13
pfe_statistics: table: dir2-pop, up: 0 id: 14
pfe_statistics: table: dir2-lmtp, up: 0 id: 15
pfe: check_table: cannot get table stats for dir2-lmtp@relayd/dir2-lmtp: No such file or directory
hce exiting, pid 4773

In these logs, my log_debug line with the table name/up/id came before
the check for whether the table is up.

I did the following. I believe it's needed, although it doesn't explain
why the table is missing. In any case, we shouldn't get table statistics
for tables that are down.

The log_debug calls are just added for my debugging, but the

if (!rdr->table->up)
	continue;

should probably go in.

G

Index: pfe.c
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/pfe.c,v
retrieving revision 1.90
diff -u -p -r1.90 pfe.c
--- pfe.c	14 Sep 2020 11:30:25 -0000	1.90
+++ pfe.c	5 Jul 2023 12:27:37 -0000
@@ -790,6 +790,12 @@ pfe_statistics(int fd, short events, voi
 	getmonotime(&tv_now);
 
 	TAILQ_FOREACH(rdr, env->sc_rdrs, entry) {
+		if (!rdr->table->up) {
+			log_debug("%s: table: %s is down. continuing", __func__, rdr->conf.name);
+			continue;
+		}
+		//bilias
+		log_debug("%s: table: %s, up: %d id: %d", __func__, rdr->conf.name, rdr->table->up, rdr->conf.table_id);
 		cnt = check_table(env, rdr, rdr->table);
 		if (rdr->conf.backup_id != EMPTY_TABLE)
 			cnt += check_table(env, rdr, rdr->backup);

On 05/07/2023 13:42, Alexandr Nedvedicky wrote:
> Hello,
>
> On Wed, Jul 05, 2023 at 11:36:26AM +0300, Kapetanakis Giannis wrote:
>> Tried to replicate the issue today with running relayd in debug mode in
>> order to print more details.
>>
>> /usr/sbin/relayd -d -v
> I did poke to sources. try to increase verbosity by using more 'v':
>
> /usr/sbin/relayd -d -vvv
>
> single '-v' does not seem to be enough to make log_debug() to print
> anything at least '-vv' is required.
>
> please retry with '-vv' at least.
>
>> when relayd exited it only printed:
>> pfe: check_table: cannot get table stats for dir-sieve@relayd/dir-sieve: No
>> such file or directory
>>
>> nothing from:
>> kill_tables():
>> log_debug("%s: deleted %d tables", __func__, cnt);
>>
>> or
>> flush_rulesets():
>> log_debug("%s: flushed rules", __func__);
>>
>> are you sure table delete/removal is coming from there?
> I did use 'grep DIOC' on relayd sources to see which pf ioctls
> are being used there. The only place where relayd calls
> DIOCRCLRTABLES is kill_tables() function. The only way to
> get there is via
> pfe_shutdown()
> flush_rulesets()
> kill_tables()
> this is the only call stack I can think of when looking at source code.
>
> also keep in mind the log message is displayed after all tables are
> removed. so in theory if pfe_statistics() timer fires while tables
> are being flushed it may find out table just got deleted and do exit
> via fatal(). On the other hand this sounds unlikely given the
> stats collection timer runs every minute only.
>> In any case it shouldn't try to get stats for empty tables.
>> Maybe a check should be added in pfe_statistics() ?
>>
>> G
>>
> thanks and
> regards
> sashan
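
For illustration only, here is a minimal standalone sketch of what the proposed guard changes, outside of relayd. Everything in it is hypothetical: struct demo_table, demo_check_table() and demo_statistics() are stand-ins I made up for this example, not relayd's real structures or functions, and the "in_pf" flag merely simulates whether pf still knows the table. It assumes, as in the logs above, that the missing table is one with no hosts up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for relayd's table state; not the real struct. */
struct demo_table {
	const char *name;
	int up;		/* number of hosts currently up */
	int in_pf;	/* 1 if pf knows the table, 0 if it is missing */
};

/*
 * Hypothetical stand-in for check_table(): returns 0 on success and
 * -1 where the real code would exit via fatal().
 */
int
demo_check_table(const struct demo_table *t)
{
	if (!t->in_pf) {
		fprintf(stderr, "cannot get table stats for %s\n", t->name);
		return -1;
	}
	return 0;
}

/*
 * One statistics pass over all tables. Returns the number of tables
 * queried, or -1 if a lookup failed. With skip_down set, tables that
 * have no hosts up are skipped, as in the proposed diff.
 */
int
demo_statistics(const struct demo_table *tables, size_t n, int skip_down)
{
	int queried = 0;
	size_t i;

	assert(tables != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (skip_down && !tables[i].up)
			continue;
		if (demo_check_table(&tables[i]) == -1)
			return -1;
		queried++;
	}
	return queried;
}
```

Called on a list holding one healthy table and one table that is down and missing from pf, the pass fails without the guard and succeeds with it. Note that the guard only avoids the lookup: a table that is up but missing from pf would still fail, so it does not address why the table disappears.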