On Mon, 19 Oct 2020 at 10:00, Christopher Faulet <[email protected]> wrote:
>
> On 16/10/2020 at 10:04, Christopher Faulet wrote:
> > On 13/10/2020 at 14:53, Peter Statham wrote:
> >> Hello,
> >>
> >> We've found an issue when using agent checks in conjunction with the
> >> weighted least connections algorithm in multithreaded mode.  It seems to
> >> me as if it is possible for next_eweight in struct server to be modified
> >> in another thread during the execution of fwlc_srv_reposition.  If
> >> next_eweight is set to zero then a division by zero occurs on line 59 in
> >> src/lb_fwlc.c in fwlc_queue_srv.
> >>
> >> I notice that in haproxy-2.0.18 this section of code is protected by
> >> HA_SPINLOCKs and I've been unable to replicate this issue in that version.
> >>
> >> I've written an agent (attached), bad_agent.py, which provokes this
> >> condition by switching randomly between 1 and 0 percent.  I also include
> >> a minimal configuration, cfg (also attached), which seems sufficient to
> >> cause the issue.  With these two running
> >> “ab -n 5000000 -c 500 http://192.168.92.1:8080/” will quickly crash the
> >> haproxy process.
> >>
> >> I include links to a coredump and the binary that generated it
> >> (unstripped).  The backtrace of thread 1 follows.
> >>
> >
> > Hi,
> >
> > Thanks for the reproducer. I'm able to crash HAProxy too using your config
> > and your agent. It seems to only crash on 1.8. I'll investigate.
> >
>
> Hi,
>
> In fact, it fails in all branches that support threads. The leastconn and
> first load-balancing algorithms are affected by this bug. In leastconn, it
> may crash because of the division by 0 when the server weight is set to 0.
> But for both algorithms, the server tree may also be corrupted, leading to
> stranger and undefined bugs.
>
> I pushed a fix (commit 26a52a) and backported it as far as 1.8. So, it
> should be fixed in all branches now.
>
> Thanks!
> --
> Christopher Faulet

Thank you for making a patch for this bug, Christopher.  I've checked
out the 1.8 master (I would have done so sooner, but I'm afraid I
didn't have access to my email last week) and I'm happy to say I can't
replicate the crash. :)
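
For anyone who wants to re-test without the attachments, the agent's
behaviour is easy to approximate.  What follows is a minimal sketch and
not the attached bad_agent.py itself: the names agent_reply,
BadAgentHandler and serve, and the port 9999, are all made up for
illustration.  It relies only on the agent-check convention that the
agent answers each connection with one ASCII line, and that a percentage
such as "0%" sets the server's weight.

```python
import random
import socketserver


def agent_reply(rng=random):
    """Return one agent-check reply line: weight 1% or 0%, chosen at random."""
    return rng.choice((b"1%\n", b"0%\n"))


class BadAgentHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler: answer every agent check with a random weight."""

    def handle(self):
        # HAProxy connects, reads a single line, and applies it to the server.
        self.request.sendall(agent_reply())


def serve(host="0.0.0.0", port=9999):
    """Run the agent until interrupted (port 9999 is an arbitrary choice)."""
    with socketserver.ThreadingTCPServer((host, port), BadAgentHandler) as srv:
        srv.serve_forever()
```

Calling serve() starts the agent; the server line in the configuration
then needs agent-check and a matching agent-port, ideally with a short
agent-inter so that the weight flips often enough to hit the race.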

--
Peter Statham
