On 04/12/2020 at 21:24, Peter Statham wrote:
I might have spoken too soon.
The latest release of 1.8 works flawlessly on my Debian desktop but
still crashes when I attempt the same configuration on a CentOS
virtual machine on our VMware cluster.
I'm not sure if this is down to differences in the way memory fencing
or thread scheduling work on these platforms or if it is a
library/compiler issue. Backporting the LBPRM spinlocks from 1.9's
src/lb_fwlc.c seems to help but I will continue investigating and
hopefully rule out some of the other possibilities.
Hmm, not good. Peter, is it the same crash or not? I didn't check very
deeply, but I guess you backported the commit 1b87748ff5 ("BUG/MEDIUM:
lb/threads: always properly lock LB algorithms on maintenance operations"). A
comment in the commit message says it may be required on 1.8 if some bugs
surface in this area.
However, I'm surprised, because the locked functions are called at the
rendez-vous point. That means all threads are blocked at the same point, waiting
until the updates on the servers are performed.
--
Christopher Faulet