Hmm, weird. Isn't DBIC supposed to deal with reconnection? You could wrap the find() in a try and call find() again in the catch (for debugging purposes!), or force a new DBMS connection for each job (C4::Context->dbh({ new => 1 })), but that's not a long-term solution.
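[A minimal sketch of that retry-in-catch idea in the worker, assuming Try::Tiny is available; the placement and error handling are illustrative, not the actual Koha code, and whether the retried find() really picks up the fresh connection would need verifying:]

    use Try::Tiny;

    my $job = try {
        Koha::BackgroundJobs->find( $args->{job_id} );
    }
    catch {
        # The first attempt may die if the idle connection was dropped:
        # log it, force a brand-new DBMS connection, and try once more.
        warn "find() failed, retrying on a fresh connection: $_";
        C4::Context->dbh({ new => 1 });
        Koha::BackgroundJobs->find( $args->{job_id} );
    };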
On Thu, 19 Oct 2023 at 08:04, David Cook <dc...@prosentient.com.au> wrote:

> Just one last one….
>
> I put a Carp on to the SIGPIPE and sure enough it is the Koha::BackgroundJobs->find() causing the SIGPIPE to be raised.
>
> Of course, ignoring the SIGPIPE isn’t going to help because the job won’t be fetched.
>
> I’m not sure what the answer here is… and unfortunately I’ve run out of time today. With my work task, I’ll probably abandon using the background jobs for the moment as I need to do something more reliable. I’ll think about this one later…
>
> at /kohadevbox/koha/misc/background_jobs_worker.pl line 62.
> main::__ANON__("PIPE") called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 932
> eval {...} called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 932
> DBIx::Class::Storage::DBI::connected(DBIx::Class::Storage::DBI::mysql=HASH(0x55b81d683d78)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 850
> DBIx::Class::Storage::DBI::__ANON__(DBIx::Class::Storage::BlockRunner=HASH(0x55b81f9d8d28)) called at /usr/share/perl5/DBIx/Class/Storage/BlockRunner.pm line 190
> DBIx::Class::Storage::BlockRunner::__ANON__() called at /usr/share/perl5/Context/Preserve.pm line 38
> Context::Preserve::preserve_context(CODE(0x55b81fa01620), "replace", CODE(0x55b81fa016c8)) called at /usr/share/perl5/DBIx/Class/Storage/BlockRunner.pm line 213
> DBIx::Class::Storage::BlockRunner::_run(DBIx::Class::Storage::BlockRunner=HASH(0x55b81f9d8d28), CODE(0x55b81f9d8ce0)) called at /usr/share/perl5/DBIx/Class/Storage/BlockRunner.pm line 105
> DBIx::Class::Storage::BlockRunner::run(DBIx::Class::Storage::BlockRunner=HASH(0x55b81f9d8d28), CODE(0x55b81f9d8ce0)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 856
> DBIx::Class::Storage::DBI::dbh_do(undef, undef, "SELECT `me`.`id`, `me`.`status`, `me`.`progress`, `me`.`size`"..., ARRAY(0x55b81f453d30), ARRAY(0x55b81f81d380)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 1939
> DBIx::Class::Storage::DBI::_execute(DBIx::Class::Storage::DBI::mysql=HASH(0x55b81d683d78), "select", ARRAY(0x55b81f9d3278), ARRAY(0x55b81d7db5a8), HASH(0x55b81f16bd08), HASH(0x55b81f6cc0b8)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 2584
> DBIx::Class::Storage::DBI::_select(DBIx::Class::Storage::DBI::mysql=HASH(0x55b81d683d78), ARRAY(0x55b81f9d3278), ARRAY(0x55b81d7db5a8), HASH(0x55b81f16bd08), HASH(0x55b81f4ce6c0)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 2761
> DBIx::Class::Storage::DBI::select_single(DBIx::Class::Storage::DBI::mysql=HASH(0x55b81d683d78), ARRAY(0x55b81f9d3278), ARRAY(0x55b81d7db5a8), HASH(0x55b81f16bd08), HASH(0x55b81f4ce6c0)) called at /usr/share/perl5/DBIx/Class/ResultSet.pm line 1101
> DBIx::Class::ResultSet::single(DBIx::Class::ResultSet=HASH(0x55b81d89b180)) called at /usr/share/perl5/DBIx/Class/ResultSet.pm line 910
> DBIx::Class::ResultSet::find(DBIx::Class::ResultSet=HASH(0x55b81f9d6b68), 12) called at /kohadevbox/koha/Koha/Objects.pm line 96
> Koha::Objects::find("Koha::BackgroundJobs", 12) called at /kohadevbox/koha/misc/background_jobs_worker.pl line 132
>
> David Cook
> Senior Software Engineer
> Prosentient Systems
> Suite 7.03
> 6a Glen St
> Milsons Point NSW 2061
> Australia
>
> Office: 02 9212 0899
> Online: 02 8005 0595
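[The exact handler David used isn't shown in the thread; a sketch of this kind of SIGPIPE trap, installed near the top of misc/background_jobs_worker.pl, might look like the following. Carp::cluck and the message text are assumptions for illustration only:]

    use Carp;

    # Turn the normally fatal SIGPIPE into a warning with a full backtrace,
    # so the worker log shows which call wrote to the dead socket.
    $SIG{PIPE} = sub {
        Carp::cluck("Caught SIG$_[0]");
    };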
> From: David Cook <dc...@prosentient.com.au>
> Sent: Thursday, October 19, 2023 4:26 PM
> To: 'Koha Devel' <koha-devel@lists.koha-community.org>
> Cc: 'Jonathan Druart' <jonathan.dru...@bugs.koha-community.org>; 'Tomas Cohen Arazi' <tomasco...@gmail.com>
> Subject: RE: [Koha-devel] Background jobs quirks
>
> Success!
>
> I created the persistent database connection between the background job worker and the database, and then I restarted the database.
>
> When I then enqueued a background job, it got stuck in a “New” status with a Progress of “null/1”, and /var/log/koha/kohadev/worker-error.log says the following:
>
> 20231019 05:02:14 kohadev-koha-worker-long_tasks: client (pid 23247) killed by signal 13, respawning
>
> Starman, on the other hand, seems to have managed to re-establish a connection without any errors. It’s not clear why that might be. I suppose Starman/Plack might be suppressing SIGPIPE errors. We might have some subtle issues under the hood without even realizing it.
>
> (That suddenly reminds me of that MYSQL_OPT_RECONNECT issue from August. MySQL 8.0.34/8.1.0 deprecate automatic reconnection. I would hope that DBD::mysql would take care of that, but as we might recall that module isn’t really maintained…)
>
> I am testing on 22.11 so maybe it’s a bit different in master… but it’s interesting.
>
> David Cook
> Senior Software Engineer
> Prosentient Systems
> Suite 7.03
> 6a Glen St
> Milsons Point NSW 2061
> Australia
>
> Office: 02 9212 0899
> Online: 02 8005 0595
>
> From: David Cook <dc...@prosentient.com.au>
> Sent: Thursday, October 19, 2023 3:55 PM
> To: 'David Cook' <dc...@prosentient.com.au>; 'Koha Devel' <koha-devel@lists.koha-community.org>
> Subject: RE: [Koha-devel] Background jobs quirks
>
> On a standard Debian package install of Koha, it looks like there are 3 persistent connections to the database: 2x Starman workers and 1x Zebra Indexer.
>
> Then if you use the background jobs, the background job worker for that queue will create a persistent connection as well.
>
> I had to work out the MySQL/MariaDB connection ID through trial and error since there wasn’t enough information in the processlist in the database, but I got the ID, so I’ll check the logs in the morning to see if that’s the problem…
>
> --
>
> I’m losing some confidence in this theory as I’m looking at the koha-testing-docker database instance and I’m seeing similar timeouts, but it hasn’t triggered the “killed by signal 13, respawning”.
>
> 2023-10-19 3:06:09 44 [Warning] Aborted connection 44 to db: 'koha_kohadev' user: 'koha_kohadev' host: '172.21.0.4' (Got an error reading communication packets)
> 2023-10-19 3:06:09 32 [Warning] Aborted connection 32 to db: 'koha_kohadev' user: 'koha_kohadev' host: '172.21.0.4' (Got an error reading communication packets)
>
> --
>
> I’m thinking the best thing to do is just ignore SIGPIPE and let our regular error handling take care of it, as we should then get a more nuanced error message somewhere…
>
> Looking at https://metacpan.org/dist/Net-Stomp/source/lib/Net/Stomp.pm, it ignores SIGPIPE when it’s sending frames… so it’s probably not the $conn->ack() doing it.
>
> I do wonder if it’s “my $job = Koha::BackgroundJobs->find($args->{job_id});”
>
> I suppose we’ll see in the morning…
>
> David Cook
> Senior Software Engineer
> Prosentient Systems
> Suite 7.03
> 6a Glen St
> Milsons Point NSW 2061
> Australia
>
> Office: 02 9212 0899
> Online: 02 8005 0595
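[A minimal sketch of the "just ignore SIGPIPE" idea mentioned above; whether this belongs process-wide in the worker or only around specific calls is an open question, so treat it as illustrative rather than a recommended fix:]

    # Ignore SIGPIPE for this process: a write to a dead socket then fails
    # with EPIPE and surfaces as an ordinary Perl/DBI error instead of
    # killing the worker with signal 13.
    $SIG{PIPE} = 'IGNORE';

    # Or, more narrowly, only around a block known to touch a
    # possibly-dead connection:
    {
        local $SIG{PIPE} = 'IGNORE';
        my $job = Koha::BackgroundJobs->find( $args->{job_id} );
    }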
> From: Koha-devel <koha-devel-boun...@lists.koha-community.org> On Behalf Of David Cook via Koha-devel
> Sent: Thursday, October 19, 2023 2:41 PM
> To: 'Koha Devel' <koha-devel@lists.koha-community.org>
> Subject: [Koha-devel] Background jobs quirks
>
> Hi all,
>
> Have you seen something like the following in your worker-error.log?
>
> 20231018 07:44:06 instance-koha-worker-long_tasks: client (pid 3949888) killed by signal 13, respawning
> 20231019 07:21:26 instance-koha-worker-long_tasks: client (pid 4082852) killed by signal 13, respawning
>
> I found some discussion in a thread: https://www.mail-archive.com/koha@lists.katipo.co.nz/msg30046.html
>
> Signal 13 would be a SIGPIPE, I believe. At first, I assumed it was the connection between the background worker and RabbitMQ, but based on the RabbitMQ logs [1] that seemed unlikely.
>
> MariaDB is another possibility. Looking at its logs [2], I see some timed-out connections.
>
> I might monitor the connections between the background workers and the database tonight and see if it’s related. It might also explain why it happens on my MariaDB system but not my MySQL system…
>
> Anyway, just putting out some feelers…
>
> [1]
> 2023-10-18 07:44:06.492400+11:00 [info] <0.30487.131> accepting STOMP connection <0.30487.131> (127.0.0.1:59120 -> 127.0.0.1:61613)
> 2023-10-18 07:44:06.500534+11:00 [info] <0.30487.131> closing STOMP connection <0.30487.131> (127.0.0.1:59120 -> 127.0.0.1:61613)
> 2023-10-18 07:44:06.510426+11:00 [info] <0.20712.130> closing STOMP connection <0.20712.130> (127.0.0.1:57508 -> 127.0.0.1:61613)
> 2023-10-18 07:44:08.703380+11:00 [info] <0.30518.131> accepting STOMP connection <0.30518.131> (127.0.0.1:59882 -> 127.0.0.1:61613)
>
> 2023-10-19 07:21:26.254065+11:00 [info] <0.27344.133> accepting STOMP connection <0.27344.133> (127.0.0.1:54116 -> 127.0.0.1:61613)
> 2023-10-19 07:21:26.259129+11:00 [info] <0.27344.133> closing STOMP connection <0.27344.133> (127.0.0.1:54116 -> 127.0.0.1:61613)
> 2023-10-19 07:21:26.269652+11:00 [info] <0.30518.131> closing STOMP connection <0.30518.131> (127.0.0.1:59882 -> 127.0.0.1:61613)
> 2023-10-19 07:21:26.660604+11:00 [info] <0.27372.133> accepting STOMP connection <0.27372.133> (127.0.0.1:46178 -> 127.0.0.1:61613)
>
> [2]
> Oct 18 02:08:28 awesome-host mariadbd[959]: 2023-10-18 2:08:28 6308 [Warning] Aborted connection 6308 to db: 'koha_instance' user: 'koha_instance' host: 'localhost' (Got timeout reading communication packets)
> Oct 18 02:15:58 awesome-host mariadbd[959]: 2023-10-18 2:15:58 6304 [Warning] Aborted connection 6304 to db: 'koha_instance' user: 'koha_instance' host: 'localhost' (Got timeout reading communication packets)
> Oct 18 02:16:06 awesome-host mariadbd[959]: 2023-10-18 2:16:06 6303 [Warning] Aborted connection 6303 to db: 'koha_instance' user: 'koha_instance' host: 'localhost' (Got timeout reading communication packets)
> Oct 18 21:10:08 awesome-host mariadbd[959]: 2023-10-18 21:10:08 6380 [Warning] Aborted connection 6380 to db: 'koha_instance' user: 'koha_instance' host: 'localhost' (Got timeout reading communication packets)
> Oct 19 02:21:17 awesome-host mariadbd[959]: 2023-10-19 2:21:17 6431 [Warning] Aborted connection 6431 to db: 'koha_instance' user: 'koha_instance' host: 'localhost' (Got timeout reading communication packets)
> Oct 19 02:24:02 awesome-host mariadbd[959]: 2023-10-19 2:24:02 6432 [Warning] Aborted connection 6432 to db: 'koha_instance' user: 'koha_instance' host: 'localhost' (Got timeout reading communication packets)
>
> David Cook
> Senior Software Engineer
> Prosentient Systems
> Suite 7.03
> 6a Glen St
> Milsons Point NSW 2061
> Australia
>
> Office: 02 9212 0899
> Online: 02 8005 0595
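[Since matching the worker's connection to the "Aborted connection <id>" warnings above had to be done by trial and error, one hypothetical way to make the correlation easier is to have the worker log its own connection ID at startup; the helper below is a sketch, not existing Koha code, though C4::Context->dbh, DBI's selectrow_array and MariaDB/MySQL's CONNECTION_ID() are all standard:]

    use C4::Context;

    # Log this worker's MariaDB/MySQL connection ID at startup, so later
    # "Aborted connection <id>" warnings in the server log can be matched
    # to this process without guesswork.
    my $dbh = C4::Context->dbh;
    my ($connection_id) = $dbh->selectrow_array('SELECT CONNECTION_ID()');
    warn "background_jobs_worker.pl (pid $$) is using DB connection id $connection_id";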
_______________________________________________
Koha-devel mailing list
Koha-devel@lists.koha-community.org
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
website: https://www.koha-community.org/
git: https://git.koha-community.org/
bugs: https://bugs.koha-community.org/