Re: [Koha] Help needed with zombie background_jobs processes

Jonathan Druart Wed, 19 Apr 2023 09:25:13 -0700

It would be interesting to revert the changes from bug 32558 that were
backported into 22.11.04 and see whether that helps.


On Wed, 19 Apr 2023 at 18:01, Cindy Murdock Ames <cmurd...@ccfls.org> wrote:
>
> Hi Jonathan,
>
> I just tried sending SIGCHLD to the parent processes, but it didn't have any
> effect.  The parents are "/usr/bin/perl
> /usr/share/koha/bin/background_jobs_worker.pl --queue default" and
> "/usr/bin/perl /usr/share/koha/bin/background_jobs_worker.pl --queue
> long_tasks".
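
For anyone following along, here is a minimal sketch (Python, Linux only) of how the zombies and their parent PIDs could be listed from /proc before signalling anything. The function names are made up for illustration and are not part of Koha. Note that SIGCHLD only helps if the parent actually handles it by calling wait(); a zombie otherwise disappears when its parent reaps it or when the parent itself exits.

```python
import os
import signal


def parse_stat(stat_line):
    """Return (pid, comm, state, ppid) from the text of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST closing parenthesis.
    head, _, tail = stat_line.rpartition(")")
    pid_str, _, comm = head.partition("(")
    fields = tail.split()
    return int(pid_str), comm, fields[0], int(fields[1])


def find_zombies(proc_root="/proc"):
    """List (pid, comm, ppid) for every process currently in state 'Z'."""
    if not os.path.isdir(proc_root):
        return []  # not a Linux-style /proc; nothing to scan
    zombies = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state, ppid = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process went away or was unreadable while scanning
        if state == "Z":
            zombies.append((pid, comm, ppid))
    return zombies


if __name__ == "__main__":
    for pid, comm, ppid in find_zombies():
        print(f"zombie pid={pid} ({comm}) parent={ppid}")
        # To nudge the parent (needs sufficient privileges), uncomment:
        # os.kill(ppid, signal.SIGCHLD)
```

If the parent PIDs printed here are the two background_jobs_worker.pl processes, that would at least confirm which queue each zombie belongs to.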
>
> worker-error.log has a few entries like these three from today:
> 20230419 08:44:53 ccfls-koha-worker-long_tasks: client (pid 12169) killed by signal 13, respawning
> 20230419 09:36:06 ccfls-koha-worker: client (pid 14398) killed by signal 13, respawning
> 20230419 09:59:35 ccfls-koha-worker: client (pid 29935) killed by signal 13, respawning
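
A side note on those entries: on Linux, signal 13 is SIGPIPE, which a process receives when it writes to a pipe or socket whose other end has already closed. Which connection was closed here is not visible from the log. The following is a small sketch (Python) that pulls the timestamp, worker name, PID and signal name out of such lines so they can be matched against the jobs table. It assumes each entry occupies a single line in worker-error.log, and the function name is hypothetical.

```python
import re
import signal

# Matches lines of the form shown in the thread, e.g.
# "20230419 09:36:06 ccfls-koha-worker: client (pid 14398) killed by signal 13, respawning"
LINE_RE = re.compile(
    r"^(?P<ts>\d{8} \d{2}:\d{2}:\d{2}) (?P<worker>\S+): "
    r"client \(pid (?P<pid>\d+)\) killed by signal (?P<sig>\d+)"
)


def parse_worker_error(line):
    """Return a dict describing a 'killed by signal' entry, or None."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    signum = int(match.group("sig"))
    try:
        # Translate the number into its symbolic name on this platform.
        signame = signal.Signals(signum).name
    except ValueError:
        signame = f"signal {signum}"
    return {
        "timestamp": match.group("ts"),
        "worker": match.group("worker"),
        "pid": int(match.group("pid")),
        "signal": signum,
        "signal_name": signame,
    }


if __name__ == "__main__":
    sample = (
        "20230419 08:44:53 ccfls-koha-worker-long_tasks: "
        "client (pid 12169) killed by signal 13, respawning"
    )
    print(parse_worker_error(sample))
```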
>
> Those timestamps correspond to three jobs in the jobs queue that didn't
> complete and show a progress of "null/n" (n being a number that I think
> corresponds to the number of items in the batch).  The first is a batch item
> record modification and the other two are holds queue updates.
>
> I cancelled these three jobs and the zombies remained.
>
> worker-output.log has a number of entries like these, but unfortunately there 
> are no timestamps so I can't link it to anything, although the timestamp on 
> the file itself is from yesterday at 13:08, which I think corresponded to a 
> successful staging and import of records.
>
> Use of uninitialized value $subfield_value in pattern match (m//) at /usr/share/koha/lib/Koha/SimpleMARC.pm line 435.
> Use of uninitialized value $subfield_value in string eq at /usr/share/koha/lib/Koha/SimpleMARC.pm line 435.
>
> I did try something else.  The parent process for the long_tasks queue had
> apparently already respawned, but the one for the default queue hadn't, so I
> killed it with -9.  The two zombies that had been there went away and the
> default queue restarted.  Before I did that I tried a MARC upload, and it was
> stuck at 0%.  I cancelled the job and retried it after killing the default
> queue, and it worked, but it spawned a new zombie which was a child of the
> long_tasks queue.  Yesterday it seemed to work if there was only one zombie,
> but not two.  No new entries in either of the worker-* files.
>
> Thanks for your help.
>
> c.
> -----------------------------------------------------------
> Cindy Murdock Ames
> IT Services Director
> Meadville Public Library | CCFLS
> https://meadvillelibrary.org | https://ccfls.org
>
> Please report tech support issues in Mantis:  https://mantis.ccfls.org
>
>
> On Wed, Apr 19, 2023 at 2:44 AM Jonathan Druart 
> <jonathan.dru...@bugs.koha-community.org> wrote:
>>
>> Did you have a look at worker-*.log? Nothing useful there?
>>
>> You can try sending SIGCHLD to the parent so that it reaps the zombie.
>>
>> On Tue, 18 Apr 2023 at 22:09, Cindy Murdock Ames <cmurd...@ccfls.org> wrote:
>> >
>> > A few other things I've noticed:
>> >
>> > - Sometimes the zombie processes will go away on their own, sometimes it 
>> > seems when you retry the MARC import or whatever it was that failed.  This 
>> > one is really weird to me as in all my years as a sysadmin I thought it 
>> > was not possible for zombie processes to go away without a reboot.  But 
>> > maybe that's changed and now zombies can rise from the dead.  Lol.
>> >
>> > - In looking at the jobs list in Koha, it seems that Holds queue updates 
>> > are especially prone to getting stuck at a progress of null/1.
>> >
>> > - If you reattempt a job that is stuck (i.e., reattempting a MARC file
>> > upload or whatnot) it will often succeed.  The original failed job
>> > remains with a progress of null.
>> >
>> > c.
>> > -----------------------------------------------------------
>> > Cindy Murdock Ames
>> > IT Services Director
>> > Meadville Public Library | CCFLS
>> > https://meadvillelibrary.org | https://ccfls.org
>> >
>> > Please report tech support issues in Mantis:  https://mantis.ccfls.org
>> >
>> >
>> > On Tue, Apr 18, 2023 at 3:55 PM Cindy Murdock Ames <cmurd...@ccfls.org> 
>> > wrote:
>> >>
>> >> Yes, it's 22.11.04, package version.
>> >>
>> >> -----------------------------------------------------------
>> >> Cindy Murdock Ames
>> >> IT Services Director
>> >> Meadville Public Library | CCFLS
>> >> https://meadvillelibrary.org | https://ccfls.org
>> >>
>> >>
>> >>
>> >>
>> >> On Tue, Apr 18, 2023 at 2:59 PM Jonathan Druart 
>> >> <jonathan.dru...@bugs.koha-community.org> wrote:
>> >>>
>> >>> Hi Cindy,
>> >>> Which exact version of Koha 22.11.xx? It should be the latest one.
>> >>> Regards,
>> >>> Jonathan
>> >>>
>> >>> On Tue, 18 Apr 2023 at 19:13, Cindy Murdock Ames <cmurd...@ccfls.org> wrote:
>> >>> >
>> >>> > Hi all,
>> >>> >
>> >>> > A couple of weekends ago I upgraded our Koha instance from 22.05 to
>> >>> > 22.11, and I'm having trouble with the background_jobs processes
>> >>> > becoming zombies after a very short amount of time, necessitating a
>> >>> > reboot.  I suspect it's a misconfiguration on my part, so if someone
>> >>> > can shed some light I'd really appreciate it!
>> >>> >
>> >>> > The first symptom was our MARC imports getting stuck at "import
>> >>> > queued", and after some digging (and thanks to the thread in this list
>> >>> > with the subject of "Background job / Staging MARC import stuck at
>> >>> > 0%") I found I was entirely missing the <message_broker> section in
>> >>> > our config, so I added this:
>> >>> >
>> >>> >  <message_broker>
>> >>> >    <hostname>localhost</hostname>
>> >>> >    <port>61613</port>
>> >>> >    <username>guest</username>
>> >>> >    <password>guest</password>
>> >>> >    <vhost></vhost>
>> >>> >  </message_broker>
>> >>> >
>> >>> > That seemed to resolve it, but now I find that the background_jobs
>> >>> > processes are going zombie after processing only a few jobs.  Here's
>> >>> > some info from the rabbitmq log after restarting the server:
>> >>> >
>> >>> > =INFO REPORT==== 18-Apr-2023::12:23:46 ===
>> >>> > node           : rabbit@ccflskoha
>> >>> > home dir       : /var/lib/rabbitmq
>> >>> > config file(s) : /etc/rabbitmq/rabbitmq.config (not found)
>> >>> > cookie hash    : ojvkUE6eUtku7kHlx3uiFg==
>> >>> > log            : /var/log/rabbitmq/rab...@ccflskoha.log
>> >>> > sasl log       : /var/log/rabbitmq/rab...@ccflskoha-sasl.log
>> >>> > database dir   : /var/lib/rabbitmq/mnesia/rabbit@ccflskoha
>> >>> >
>> >>> > Is it problematic that /etc/rabbitmq/rabbitmq.config is missing?
>> >>> > Anything else I should be looking at?  We're running on Ubuntu SE
>> >>> > 18.04 if that is helpful.
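
One thing that could be checked here is whether anything is actually listening on the host and port named in the <message_broker> section. Below is a minimal sketch (Python); the function names are hypothetical, and the fallback values are simply the ones from the snippet in this thread, not confirmed Koha defaults. A successful TCP connection only shows the port is open, not that the STOMP login with the configured username and password would succeed.

```python
import socket
import xml.etree.ElementTree as ET

# The <message_broker> section exactly as posted in this thread.
SAMPLE_CONFIG = """
<message_broker>
  <hostname>localhost</hostname>
  <port>61613</port>
  <username>guest</username>
  <password>guest</password>
  <vhost></vhost>
</message_broker>
"""


def broker_endpoint(xml_text):
    """Extract (hostname, port) from XML containing <message_broker>."""
    root = ET.fromstring(xml_text)
    if root.tag == "message_broker":
        node = root
    else:
        node = root.find(".//message_broker")
    if node is None:
        return None
    # Fallbacks are assumptions taken from the snippet in the thread.
    host = (node.findtext("hostname") or "localhost").strip()
    port = int((node.findtext("port") or "61613").strip())
    return host, port


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    host, port = broker_endpoint(SAMPLE_CONFIG)
    print(f"{host}:{port} reachable: {port_is_open(host, port)}")
```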
>> >>> >
>> >>> > Thanks much!
>> >>> > Cindy
>> >>> >
>> >>> >
>> >>> > -----------------------------------------------------------
>> >>> > Cindy Murdock Ames
>> >>> > IT Services Director
>> >>> > Meadville Public Library | CCFLS
>> >>> > https://meadvillelibrary.org | https://ccfls.org
>> >>> > _______________________________________________
>> >>> >
>> >>> > Koha mailing list  http://koha-community.org
>> >>> > Koha@lists.katipo.co.nz
>> >>> > Unsubscribe: https://lists.katipo.co.nz/mailman/listinfo/koha