Hello Fridolin,
Yes, performance is probably important for MQ; it does seem to eat a
disproportionate amount of resources sometimes. :)
But like I said, our server is a beast: in every aspect of its build,
it has the best components available in 2022.
Let's see about the race condition:
* An update is done on a biblio
* An update_elastic_index job is created in the db.
* Its ID is pushed onto MQ
* background_jobs_worker.pl picks up the ID from MQ
o it goes to the DB and finds nothing with that ID, so we get the
equivalent of a null-pointer error (yeah, I come from C)
o This is NOT some old forgotten floating job: the job is right there
in the database when we look manually.
o The job stays there forever, with status 'new'.
* If I add a "sleep 1", this issue _mostly_ disappear.
There's no server performance issue that could explain this. Maybe
some DB caching?
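My best guess (and it is only a guess) is that the STOMP message reaches
the worker before the transaction that inserted the background_jobs row
has committed, so the lookup sees nothing. If that's what happens, a
bounded retry would be less fragile than an unconditional sleep. A
minimal sketch, assuming the worker resolves the job with
Koha::BackgroundJobs->find (find_job_with_retry itself is a hypothetical
helper):

    use Time::HiRes qw( usleep );

    # Sketch only: retry the lookup a few times instead of a blanket "sleep 1".
    # Assumes the worker resolves the job with Koha::BackgroundJobs->find.
    sub find_job_with_retry {
        my ($job_id) = @_;
        for ( 1 .. 10 ) {
            my $job = Koha::BackgroundJobs->find($job_id);
            return $job if $job;    # the row is visible now, process it
            usleep(200_000);        # 200 ms; the enqueuing transaction may not have committed yet
        }
        return;    # still missing after ~2 s: log it and skip instead of crashing
    }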
Philippe Blouin,
Directeur de la technologie
Tél. : (833) 465-4276, poste 230
philippe.blo...@inlibro.com
inLibro | pour esprit libre | www.inLibro.com
On 2022-12-21 13:06, Fridolin SOMERS wrote:
Hi,
I think network performance is really important for RabbitMQ.
At Biblibre we run a RabbitMQ virtual machine on each physical server,
shared between that server's virtual machines (one Koha per machine),
so network performance stays good.
It seems to work well, but we are still on 21.11.
Best regards,
On 20/12/2022 at 09:13, Philippe Blouin wrote:
Howdy!
Since moving a lot of our users to 22.05.06, we've installed the
worker everywhere, but the number of issues we've encountered is
staggering.
The first one was
Can't call method "process" on an undefined value
where the ID received from MQ is not found in the DB, so the worker
goes straight to process_job and fails. I have absolutely no idea how
that happens; it seems completely counterintuitive (the ID comes from
the DB in the first place), but here it is. I hacked the code to add a
"sleep 1", which fixes most of that one.
Then came the fact that stored events are not checked at startup when
the connection to MQ is successful. Bug 30654 refers to it. I hacked a
little "$init" in there to clear that up at startup.
Then came
malformed UTF-8 character in JSON string, at character offset 296
(before "\x{e9}serv\x{e9} au ...")
from decode_json, which crashes the whole process. And for some reason
it never gets over it: the same error hits at every restart, as if the
event is never "eaten" from the queue. I hacked an eval, then a
try-catch, over it...
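The guard ended up looking roughly like this; a sketch only, where
$conn and $frame are the STOMP connection and frame the worker already
holds inside its receive loop. The two important parts are catching the
decode failure and still acking the frame, so the broker stops
redelivering the poison message:

    use Try::Tiny;
    use JSON qw( decode_json );

    # Sketch: a frame with a broken payload is logged and acknowledged, so it is
    # removed from the queue instead of crashing the worker at every restart.
    my $args = try {
        decode_json( $frame->body );    # dies on malformed UTF-8 in the payload
    }
    catch {
        warn "Failed to decode frame: $_";
        $conn->ack( { frame => $frame } );    # drop the poison message
        undef;
    };
    next unless $args;    # move on to the next frame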
After coding a monitor to alert whenever a background_jobs row has
been "new" for over 5 minutes in the DB, I was inundated with messages.
There's always one elasticsearch_update that slips through among the
flurry, and they slowly add up.
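In case it helps anyone, the monitor boils down to one query against
background_jobs. A minimal DBI sketch, with placeholder DSN and
credentials, assuming the enqueued_on column from the current schema:

    use DBI;

    # Minimal sketch of the "stuck job" monitor: list jobs still 'new' more
    # than 5 minutes after being enqueued. DSN and credentials are placeholders.
    my $dbh = DBI->connect( 'DBI:mysql:database=koha', 'koha_user', 'secret',
        { RaiseError => 1 } );
    my $stuck = $dbh->selectall_arrayref( q{
        SELECT id, type, enqueued_on
          FROM background_jobs
         WHERE status = 'new'
           AND enqueued_on < NOW() - INTERVAL 5 MINUTE
    }, { Slice => {} } );

    printf "Stuck job %s (%s), enqueued at %s\n",
        $_->{id}, $_->{type}, $_->{enqueued_on} for @$stuck;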
At this point, the only viable solution is to run the workers but
disable RabbitMQ everywhere. Are we really the only ones
experiencing that?
Regards,
PS Our servers are well-above-average Debian 11 machines with lots of
firepower (RAM, CPU, I/O...).
--
Philippe Blouin,
Directeur de la technologie
Tél. : (833) 465-4276, poste 230
philippe.blo...@inlibro.com
inLibro | pour esprit libre | www.inLibro.com
_______________________________________________
Koha-devel mailing list
Koha-devel@lists.koha-community.org
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
website : https://www.koha-community.org/
git : https://git.koha-community.org/
bugs : https://bugs.koha-community.org/