Hi Christopher,

Christopher Faulet wrote:
On 29/07/2024 at 16:30, Jens Wahnes wrote:
Christopher Faulet wrote:
On 29/07/2024 at 09:05, Christopher Faulet wrote:

Thanks, I will investigate. It is indeed most probably an issue with splicing, as Willy said. I will try to find the bug on 2.8 and figure out whether later versions are affected too.

I'm able to reproduce the issue by hacking the code to force a connection error by hand. It occurs when an error is reported on the connection while haproxy is trying to send data using kernel splicing. But it is only an issue when a filter is attached to the applicative stream. I guess you have HTTP compression enabled. The response is not compressed, of course, otherwise kernel splicing would not be used. But the filter is still attached to the stream, and it has an effect in this case.
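For illustration, a minimal configuration in which this combination applies might look like the following; all names, addresses, paths and timeouts here are placeholders, not taken from your setup:

```
# Hypothetical example: kernel splicing enabled together with HTTP
# compression. Configuring compression attaches a filter to every HTTP
# stream, even when a particular response ends up not being compressed.
defaults
    mode http
    option splice-response        # allow kernel splicing for server responses
    option splice-auto            # let haproxy decide when splicing is worthwhile
    timeout connect 5s
    timeout client  1m
    timeout server  10m

frontend fe_https
    bind :443 ssl crt /etc/haproxy/site.pem   # placeholder certificate path
    compression algo gzip                     # this is what attaches the filter
    compression type text/html text/plain
    default_backend be_app

backend be_app
    server app1 192.0.2.10:8080               # placeholder server address
```

With a setup like this, a response that is not eligible for compression can still be forwarded via kernel splicing, yet the stream nevertheless carries the compression filter, which is the combination that triggers the bug.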

AFAIK, the older versions are not affected. For newer versions, I don't really know: there is an issue with my hack, but timeouts are still active and a true client abort is properly detected. So I'm inclined to think there is no issue on these versions. But my fix will probably be applicable to them too.

I'm working on the fix. I still need to test what happens when the error occurs on the server side, to be sure. But it should be fixed soon.


Thank you for the update.

My results so far: Everything is fine on 2.8.10 without splicing.

On 3.0.3 with splicing turned on, I have also not seen any lingering sessions, but I have only been running version 3.0.3 for a few hours now, so this could still happen. I'd rather let it run for some more time before drawing conclusions.


Thanks for the confirmation. On 3.0, I was unable to reproduce the issue. So I'm not surprised.

On version 3.0.3 with splicing turned on, I actually did end up with a backend connection in state CLOSE_WAIT that is still around after some hours. But it is different from the other cases I saw with version 2.8.10: in this case, the frontend connection is HTTPS (likely using HTTP/2), while the backend connection is HTTP/1. What is also a bit special is that the HTTP response code is 204, so there is no "real" data being transmitted.

So I'm not sure whether this one is really related to the other ones, or whether it's just a coincidence that I'm noticing it while looking for other connections that are not being closed properly. :)

The associated `show sess` output looks like this (IP addresses slightly altered):

```
0x7fc63caccc00: [30/Jul/2024:14:05:50.626096] id=123594 proto=tcpv4 source=10.80.119.118:53864 flags=0x3384a, conn_retries=0, conn_exp=<NEVER> conn_et=0x000 srv_conn=0x7fc63f0a6000, pend_pos=(nil) waiting=0 epoch=0 frontend=Loadbalancer (id=48 mode=http), listener=https (id=15) addr=10.210.18.56:443
  backend=projekt_piwik_2019 (id=94 mode=http) addr=172.16.240.53:50858
  server=counterstrike (id=1) addr=172.16.240.99:2501
task=0x7fc63eb6f260 (state=0x00 nice=400 calls=6 rate=0 exp=<NEVER> tid=1(1/1) age=2h20m) txn=0x7fc63ed00320 flags=0x40000 meth=3 status=-1 req.st=MSG_DONE rsp.st=MSG_RPBEFORE req.f=0x4d rsp.f=0x00 scf=0x7fc642deb620 flags=0x00070006 ioto=1m state=CLO endp=CONN,0x7fc63ebbdf00,0x5043d601 sub=0 rex=<NEVER> wex=<NEVER> rto=? wto=<NEVER>
    iobuf.flags=0x00000000 .pipe=0 .buf=0@(nil)+0/0
      h2s=0x7fc63ebbdf00 h2s.id=15 .st=CLO .flg=0x4109 .rxbuf=0@(nil)+0/0
.sc=0x7fc642deb620(.flg=0x00070006 .app=0x7fc63caccc00) .sd=0x7fc642d25770(.flg=0x5043d601)
       .subs=(nil)
h2c=0x7fc642d8f200 h2c.st0=FRH .err=0 .maxid=15 .lastid=-1 .flg=0x1a60600 .nbst=0 .nbsc=1, .glitches=0 .fctl_cnt=0 .send_cnt=0 .tree_cnt=1 .orph_cnt=0 .sub=0 .dsi=15 .dbuf=0@(nil)+0/0 .mbuf=[1..1|32],h=[0@(nil)+0/0],t=[0@(nil)+0/0] .task=0x7fc642c0af60 .exp=<NEVER> co0=0x7fc63cb2d320 ctrl=tcpv4 xprt=SSL mux=H2 data=STRM target=LISTENER:0x7fc641a5a400
      flags=0x801c0300 fd=237 fd.state=1922 updt=0 fd.tmask=0x2
scb=0x7fc642da50e0 flags=0x00001013 ioto=10m state=EST endp=CONN,0x7fc63ebbb200,0x50404001 sub=1 rex=<NEVER> wex=<NEVER> rto=? wto=<NEVER>
    iobuf.flags=0x00000000 .pipe=0 .buf=0@(nil)+0/0
h1s=0x7fc63ebbb200 h1s.flg=0x94010 .sd.flg=0x50404001 .req.state=MSG_DONE .res.state=MSG_DONE .meth=POST status=204 .sd.flg=0x50404001 .sc.flg=0x00001013 .sc.app=0x7fc63caccc00 .subs=0x7fc642da50f8(ev=1 tl=0x7fc642da9c40 tl.calls=4 tl.ctx=0x7fc642da50e0 tl.fct=sc_conn_io_cb) h1c=0x7fc63ca5d840 h1c.flg=0x80000000 .sub=0 .ibuf=0@(nil)+0/0 .obuf=0@(nil)+0/0 .task=0x7fc63cad2940 .exp=<NEVER> co1=0x7fc63ebb6820 ctrl=tcpv4 xprt=RAW mux=H1 data=STRM target=SERVER:0x7fc63f0a6000
      flags=0x00000300 fd=167 fd.state=11122 updt=0 fd.tmask=0x2
  filters={0x7fc642d6fb30="bandwidth limitation filter"}
  req=0x7fc63caccc28 (f=0x20840000 an=0x48000 tofwd=0 total=993)
an_exp=<NEVER> buf=0x7fc63caccc30 data=0x7fc63cb52280 o=0 p=0 i=16384 size=16384 htx=0x7fc63cb52280 flags=0x10 size=16336 data=1 used=1 wrap=NO extra=0
  res=0x7fc63caccc70 (f=0x80008000 an=0x20000000 tofwd=0 total=274)
an_exp=<NEVER> buf=0x7fc63caccc78 data=0x7fc63c6a7dc0 o=274 p=274 i=16110 size=16384 htx=0x7fc63c6a7dc0 flags=0x10 size=16336 data=274 used=11 wrap=NO extra=0

```

I pushed some patches that should fix your issue. They cannot be applied as-is on 2.8, but you can use the attached patches for 2.8 if you want to give them a try. That would help to confirm that they properly fix your issue.

Thank you. I applied the patches and compiled a version of 2.8.10 based on them. However, I'm hesitant to run it just yet, as I am uncertain whether the above example of a "stuck" session on version 3.0.3 is worth another look first. If you would like me to perform any action on that session to diagnose it further, like the "close FD" or anything else, please let me know.
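For example, I could run something like this against the stats socket; the socket path below is just my local one, and the pointer and fd numbers are taken from the dump above:

```
# Dump that single stream again, using the pointer from the output above
echo "show sess 0x7fc63caccc00" | socat stdio /var/run/haproxy/admin.sock

# Inspect the two file descriptors involved (frontend fd=237, backend fd=167)
echo "show fd 237" | socat stdio /var/run/haproxy/admin.sock
echo "show fd 167" | socat stdio /var/run/haproxy/admin.sock

# Or, if it is of no further use to you, kill the stream by its pointer
echo "shutdown session 0x7fc63caccc00" | socat stdio /var/run/haproxy/admin.sock
```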

Otherwise, I'd try the version of 2.8.10 with your patches applied.
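For reference, I prepared that build roughly as follows; the patch directory is simply where I saved your attachments, and the build options are my usual ones rather than anything specific to the fix:

```
# Apply the backported patches on top of the vanilla 2.8.10 sources
cd haproxy-2.8.10
for p in ../2.8-backport/*.patch; do patch -p1 < "$p"; done

# Build with my usual options (Linux + OpenSSL)
make -j"$(nproc)" TARGET=linux-glibc USE_OPENSSL=1 USE_PCRE2=1
```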


Jens


