Hello Fridolin,

Yes, performance is probably important for MQ; it does seem to eat a disproportionate amount of resources sometimes.  :)

But like I said, our server is a beast: every aspect of its build uses the best components available in 2022.

Let's see about the race condition:

 * An update is done on a biblio
 * An update_elastic_index job is created in the db.
 * Its ID is pushed onto MQ
 * background_jobs_worker.pl picks up the ID from MQ
     o it goes to the DB, and finds nothing with that ID.  We get a
       pointer error (yeah, I come from C)
     o This is NOT an old forgotten floating job, since we can see the
       job in the database when looking manually.
     o The job stays there forever, with status 'new'.
 * If I add a "sleep 1", this issue _mostly_ disappears.

There's no server performance issue that could explain this.  Maybe some DB caching?
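
For what it's worth, the direction I'm considering instead of the flat "sleep 1" is to retry the lookup a few times before giving up. Untested sketch, assuming the usual Koha::BackgroundJobs->find lookup (names may differ from the real worker code):

    use Modern::Perl;
    use Time::HiRes qw( usleep );
    use Koha::BackgroundJobs;

    # Poll the DB a few times instead of sleeping a flat second;
    # the row is usually visible within a fraction of a second.
    sub fetch_job_with_retry {
        my ($job_id) = @_;
        for my $attempt ( 1 .. 10 ) {
            my $job = Koha::BackgroundJobs->find($job_id);
            return $job if $job;
            usleep(100_000);    # 100 ms between attempts, ~1 s total
        }
        return;                 # still nothing: let the caller log and skip
    }

Not pretty either, but at least it only waits when the row is actually late.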


Philippe Blouin,
Directeur de la technologie

Tél.  : (833) 465-4276, poste 230
philippe.blo...@inlibro.com

inLibro | pour esprit libre | www.inLibro.com <http://www.inLibro.com>
On 2022-12-21 13:06, Fridolin SOMERS wrote:
Hi,

I think network performance is really important for RabbitMQ.
We at Biblibre run a dedicated virtual machine for it on each physical server, shared between that server's virtual machines (one Koha per machine), which keeps network performance good.
It seems to work well, but we are still on 21.11.

Best regards,

On 2022-12-20 09:13, Philippe Blouin wrote:
Howdy!

Since moving a lot of our users to 22.05.06, we've installed the worker everywhere.  But the number of issues encountered is staggering.

The first one was

Can't call method "process" on an undefined value

where the ID received from MQ is not found in the DB, so the worker goes straight to process_job and fails. I have absolutely no idea how that occurs; it seems completely counterintuitive (the ID comes from the DB after all), but there it is.  I hacked the code to add a "sleep 1", which fixes most of that one.
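
To give an idea of the failing path, here it is paraphrased from memory (untested fragment; variable and field names may differ from the real background_jobs_worker.pl):

    # Fragment of the worker loop; $frame is the STOMP frame just
    # received from RabbitMQ.
    use JSON qw( decode_json );
    use Koha::BackgroundJobs;

    my $args = decode_json( $frame->body );
    sleep 1;    # the hack: give the DB a moment to make the row visible
    my $job  = Koha::BackgroundJobs->find( $args->{job_id} );
    if ( !$job ) {
        # Without a guard we fall straight through to $job->process and die
        # with "Can't call method 'process' on an undefined value".
        warn "Job $args->{job_id} announced on MQ but not found in the DB";
    }
    else {
        $job->process($args);
    }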

Then came the fact that jobs already stored in the DB are not checked at startup when the connection to MQ succeeds.  Bug 30654 refers to it. I hacked a little "$init" flag in there to clear those up at startup.
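
The "$init" band-aid looks more or less like this (untested sketch; column and method names are from memory, the real worker code is a bit more involved):

    use JSON qw( decode_json );
    use Koha::BackgroundJobs;

    my $init = 1;
    if ($init) {
        # On startup, drain the jobs that were stored while MQ was
        # unreachable instead of leaving them 'new' forever.
        my $pending = Koha::BackgroundJobs->search( { status => 'new' } );
        while ( my $job = $pending->next ) {
            $job->process( decode_json( $job->data ) );
        }
        $init = 0;
    }
    # ...then carry on with the normal MQ receive loop...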

Then came the

malformed UTF-8 character in JSON string, at character offset 296 (before "\x{e9}serv\x{e9} au ...")

at decode_json, which crashes the whole process.  And for some reason it never gets over it: it hits the same problem at every restart, as if the event is never "eaten" from the queue.  I hacked an eval, then a try-catch, over it...
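
The try-catch itself is nothing fancy, something like this (untested sketch):

    use Try::Tiny;
    use JSON qw( decode_json );

    # Decode the frame body defensively: return undef instead of dying,
    # so one poisoned message cannot take the whole worker down.
    sub safe_decode {
        my ($body) = @_;
        return try {
            decode_json($body);
        }
        catch {
            warn "Failed to decode MQ payload: $_";
            return;
        };
    }

The worker then skips the message when safe_decode returns undef; the part I am still not sure about is acknowledging the bad frame so the broker stops redelivering it at every restart.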

After coding a monitor to alert when a background_jobs row has been "new" for over 5 minutes in the DB, I was inundated with messages. There's always one elasticsearch_update that escapes among the flurry, and they slowly add up.
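
The monitor itself boils down to one query against the DB, roughly (untested sketch; DSN and credentials are placeholders, column names taken from a stock 22.05 schema, double-check them):

    use Modern::Perl;
    use DBI;

    # Placeholders: point this at the real Koha database.
    my $dbh = DBI->connect( 'dbi:mysql:database=koha', 'koha_user', 'secret',
        { RaiseError => 1 } );

    my $stuck = $dbh->selectall_arrayref( q{
        SELECT id, type, enqueued_on
          FROM background_jobs
         WHERE status = 'new'
           AND enqueued_on < NOW() - INTERVAL 5 MINUTE
    }, { Slice => {} } );

    printf "%d job(s) stuck in 'new' for more than 5 minutes\n", scalar @$stuck
        if @$stuck;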

At this point, the only viable solution is to run the workers but disable RabbitMQ everywhere.  Are we really the only ones experiencing that?

Regards,

PS: Our servers are well-above-average Debian 11 machines with lots of firepower (RAM, CPU, I/O...).

--
Philippe Blouin,
Directeur de la technologie

Tél.  : (833) 465-4276, poste 230
philippe.blo...@inlibro.com

inLibro | pour esprit libre | www.inLibro.com <http://www.inLibro.com>

_______________________________________________
Koha-devel mailing list
Koha-devel@lists.koha-community.org
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
website : https://www.koha-community.org/
git : https://git.koha-community.org/
bugs : https://bugs.koha-community.org/
