Hi,I finally found some issues. Not sure it is exactly yours. But it is very similar. So I'm confident. I pushed 3 new fixes in 3.1-DEV. It may be good to check on the 3.0 if this fully fixes all your issues. And eventually, if possible, on 2.8 too, with the splicing enabled.
In attachement, the whole series for 3.0 and 2.8: My first 3 patches and the 3 new ones.
I'm going to be on vacation for 3 weeks. so there's no rush Many thanks for your help. -- Christopher Faulet
From db2e751bc8cc4bd8a96d2149c8a59022d15b9709 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <[email protected]> Date: Mon, 29 Jul 2024 17:48:16 +0200 Subject: [PATCH 1/6] BUG/MEDIUM: stconn: Report error on SC on send if a previous SE error was set When a send on a connection is performed, if a SE error (or a pending error) was already reported earlier, we leave immediately. No send is performed. However, we must be sure to report the error at the SC level if necessary. Indeed, the SE error may have been reported during the zero-copy data forwarding. So during receive on the opposite side. In that case, we may have missed the opportunity to report it at the SC level. The patch must be backported as far as 2.8. (cherry picked from commit 5dc45445ff18207dbacebf1f777e1f1abcd5065d) Signed-off-by: Christopher Faulet <[email protected]> --- src/stconn.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/stconn.c b/src/stconn.c index 607740305a..536b1a6bbc 100644 --- a/src/stconn.c +++ b/src/stconn.c @@ -1590,6 +1590,8 @@ int sc_conn_send(struct stconn *sc) if (sc->state < SC_ST_CON) return 0; BUG_ON(sc_ep_test(sc, SE_FL_EOS|SE_FL_ERROR|SE_FL_ERR_PENDING) == (SE_FL_EOS|SE_FL_ERR_PENDING)); + if (sc_ep_test(sc, SE_FL_ERROR)) + sc->flags |= SC_FL_ERROR; return 1; } -- 2.45.2
From 0c98078e8a9dedad6e75f8b23c293ebdf85150a6 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <[email protected]> Date: Tue, 30 Jul 2024 08:41:03 +0200 Subject: [PATCH 2/6] BUG/MEDIUM: mux-pt/mux-h1: Release the pipe on connection error on sending path When data are sent using the kernel splicing, if a connection error occurred, the pipe must be released. Indeed, in that case, no more data can be sent and there is no reason to not release the pipe. But it is in fact an issue for the stream because the channel will appear are not empty. This may prevent the stream to be released. This happens on 2.8 when a filter is also attached on it. On 2.9 and upper, it seems there is not issue. But it is hard to be sure and the current patch remains valid is all cases. On 2.6 and lower, the code is not the same and, AFAIK, there is no issue. This patch must be backported to 2.8. However, on 2.8, there is no zero-copy data forwarding. The patch must be adapted. There is no done_ff/resume_ff callback functions for muxes. The pipe must released in sc_conn_send() when an error flag is set on the SE, after the call to snd_pipe callback function. (cherry picked from commit 760d26a8625f3af2b6939037a40f19b5f8063be1) Signed-off-by: Christopher Faulet <[email protected]> --- src/mux_h1.c | 8 ++++++++ src/mux_pt.c | 8 ++++++++ 2 files changed, 16 insertions(+) diff --git a/src/mux_h1.c b/src/mux_h1.c index a4118dc9b3..075acb62f4 100644 --- a/src/mux_h1.c +++ b/src/mux_h1.c @@ -4829,6 +4829,10 @@ static size_t h1_done_ff(struct stconn *sc) h1c->conn->xprt->subscribe(h1c->conn, h1c->conn->xprt_ctx, SUB_RETRY_RECV, &h1c->wait_event); } se_fl_set_error(h1s->sd); + if (sd->iobuf.pipe) { + put_pipe(sd->iobuf.pipe); + sd->iobuf.pipe = NULL; + } TRACE_DEVEL("connection error", H1_EV_STRM_ERR|H1_EV_H1C_ERR|H1_EV_H1S_ERR, h1c->conn, h1s); } @@ -5084,6 +5088,10 @@ static int h1_resume_fastfwd(struct stconn *sc, unsigned int flags) h1c->conn->xprt->subscribe(h1c->conn, h1c->conn->xprt_ctx, SUB_RETRY_RECV, &h1c->wait_event); } se_fl_set_error(h1s->sd); + if (h1s->sd->iobuf.pipe) { + put_pipe(h1s->sd->iobuf.pipe); + h1s->sd->iobuf.pipe = NULL; + } TRACE_DEVEL("connection error", H1_EV_STRM_ERR|H1_EV_H1C_ERR|H1_EV_H1S_ERR, h1c->conn, h1s); } diff --git a/src/mux_pt.c b/src/mux_pt.c index 6dbbe04cd1..7af06e4962 100644 --- a/src/mux_pt.c +++ b/src/mux_pt.c @@ -631,6 +631,10 @@ static size_t mux_pt_done_ff(struct stconn *sc) if (conn_xprt_read0_pending(conn)) se_fl_set(ctx->sd, SE_FL_EOS); se_fl_set_error(ctx->sd); + if (se->iobuf.pipe) { + put_pipe(se->iobuf.pipe); + se->iobuf.pipe = NULL; + } TRACE_DEVEL("error on connection", PT_EV_TX_DATA|PT_EV_CONN_ERR, conn, sc); } @@ -746,6 +750,10 @@ static int mux_pt_resume_fastfwd(struct stconn *sc, unsigned int flags) if (conn_xprt_read0_pending(conn)) se_fl_set(ctx->sd, SE_FL_EOS); se_fl_set_error(ctx->sd); + if (se->iobuf.pipe) { + put_pipe(se->iobuf.pipe); + se->iobuf.pipe = NULL; + } TRACE_DEVEL("error on connection", PT_EV_TX_DATA|PT_EV_CONN_ERR, conn, sc); } -- 2.45.2
From f8c4c580190ef542be2121a7fdf03bb0f13c84bf Mon Sep 17 00:00:00 2001 From: Christopher Faulet <[email protected]> Date: Tue, 30 Jul 2024 10:43:55 +0200 Subject: [PATCH 3/6] BUILD: mux-pt: Use the right name for the sedesc variable A typo was introduced in 760d26a86 ("BUG/MEDIUM: mux-pt/mux-h1: Release the pipe on connection error on sending path"). The sedesc variable is 'sd', not 'se'. This patch must be backported with the commit above. (cherry picked from commit d9f41b1d6e811022372ce541e67b047bd18630a9) Signed-off-by: Christopher Faulet <[email protected]> --- src/mux_pt.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/src/mux_pt.c b/src/mux_pt.c index 7af06e4962..9de9f98b81 100644 --- a/src/mux_pt.c +++ b/src/mux_pt.c @@ -631,9 +631,9 @@ static size_t mux_pt_done_ff(struct stconn *sc) if (conn_xprt_read0_pending(conn)) se_fl_set(ctx->sd, SE_FL_EOS); se_fl_set_error(ctx->sd); - if (se->iobuf.pipe) { - put_pipe(se->iobuf.pipe); - se->iobuf.pipe = NULL; + if (sd->iobuf.pipe) { + put_pipe(sd->iobuf.pipe); + sd->iobuf.pipe = NULL; } TRACE_DEVEL("error on connection", PT_EV_TX_DATA|PT_EV_CONN_ERR, conn, sc); } @@ -750,9 +750,9 @@ static int mux_pt_resume_fastfwd(struct stconn *sc, unsigned int flags) if (conn_xprt_read0_pending(conn)) se_fl_set(ctx->sd, SE_FL_EOS); se_fl_set_error(ctx->sd); - if (se->iobuf.pipe) { - put_pipe(se->iobuf.pipe); - se->iobuf.pipe = NULL; + if (sd->iobuf.pipe) { + put_pipe(sd->iobuf.pipe); + sd->iobuf.pipe = NULL; } TRACE_DEVEL("error on connection", PT_EV_TX_DATA|PT_EV_CONN_ERR, conn, sc); } -- 2.45.2
From 65edcb5a9ffbd40a1298b234c6067fbaafed719f Mon Sep 17 00:00:00 2001 From: Christopher Faulet <[email protected]> Date: Thu, 1 Aug 2024 15:42:09 +0200 Subject: [PATCH 4/6] BUG/MEDIUM: http-ana: Report error on write error waiting for the response When we are waiting for the server response, if an error is pending on the frontend side (a write error on client), it is handled as an abort and all regular response analyzers are removed, except the one responsible to release the filters, if any. However, while it is handled as an abort, the error is not reported, as usual, via http_reply_and_close() function. It is an issue because in that, the channels buffers are not reset. Because of this bug, it is possible to block a stream infinitely. The request side is waiting for the response side and the response side is blocked because filters must be released and this cannot be done because data remain blocked in channels buffers. So, in that case, calling http_reply_and_close() with no message is enough to unblock the stream. This patch must be backported as far as 2.8. (cherry picked from commit 0ba6202796fe24099aeff89a5a4b83af99fc027b) Signed-off-by: Christopher Faulet <[email protected]> --- src/http_ana.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/http_ana.c b/src/http_ana.c index 51963412e1..aa56184a0d 100644 --- a/src/http_ana.c +++ b/src/http_ana.c @@ -1379,6 +1379,7 @@ int http_wait_for_response(struct stream *s, struct channel *rep, int an_bit) http_set_term_flags(s); /* process_stream() will take care of the error */ + http_reply_and_close(s, txn->status, NULL); DBG_TRACE_DEVEL("leaving on error", STRM_EV_STRM_ANA|STRM_EV_HTTP_ANA|STRM_EV_HTTP_ERR, s, txn); return 0; -- 2.45.2
From 9b973fa9f0cf02fc0e1ab33a293db1344cf9f293 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <[email protected]> Date: Thu, 1 Aug 2024 16:01:50 +0200 Subject: [PATCH 5/6] BUG/MEDIUM: h2: Only report early HTX EOM for tunneled streams For regular H2 messages, the HTX EOM flag is synonymous the end of input. So SE_FL_EOI flag must also be set on the stream-endpoint descriptor. However, there is an exception. For tunneled streams, the end of message is reported on the HTX message just after the headers. But in that case, no end of input is reported on the SE. But here, there is a bug. The "early" EOM is also report on the HTX messages when there is no payload (for instance a content-length set to 0). If there is no ES flag on the H2 HEADERS frame, it is an unexpected case. Because for the applicative stream and most probably for the opposite endpoint, the message is considered as finihsed. It is switched in its DONE state (or the equivalent on the endpoint). But, if an extra H2 frame with the ES flag is received, a TRAILERS frame or an emtpy DATA frame, an extra EOT HTX block is pushed to carry the HTX EOM flag. So an extra HTX block is emitted for a regular HTX message. It is totally invalid, it must never happen. Because it is an undefined behavior, it is difficult to predict the result. But it definitly prevent the applicative stream to properly handle aborts and errors because data remain blocked in the channel buffer. Indeed, the end of the message was seen, so no more data are forwarded. It seems to be an issue for 2.8 and upper. Harder to evaluate for older versions. This patch must be backported as far as 2.4. (cherry picked from commit 6743e128f34fba297f2cac836a4f11b84acd503a) Signed-off-by: Christopher Faulet <[email protected]> --- src/h2.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/h2.c b/src/h2.c index 9c60cc6b30..5a2d74caca 100644 --- a/src/h2.c +++ b/src/h2.c @@ -456,7 +456,8 @@ int h2_make_htx_request(struct http_hdr *list, struct htx *htx, unsigned int *ms (*msgf & H2_MSGF_BODY_TUNNEL)) { /* Request without body or tunnel requested */ sl_flags |= HTX_SL_F_BODYLESS; - htx->flags |= HTX_FL_EOM; + if (*msgf & H2_MSGF_BODY_TUNNEL) + htx->flags |= HTX_FL_EOM; } if (*msgf & H2_MSGF_EXT_CONNECT) { @@ -724,7 +725,8 @@ int h2_make_htx_response(struct http_hdr *list, struct htx *htx, unsigned int *m (*msgf & H2_MSGF_BODY_TUNNEL)) { /* Response without body or tunnel successfully established */ sl_flags |= HTX_SL_F_BODYLESS; - htx->flags |= HTX_FL_EOM; + if (*msgf & H2_MSGF_BODY_TUNNEL) + htx->flags |= HTX_FL_EOM; } /* update the start line with last detected header info */ -- 2.45.2
From 7c2785196884cc8e6fb1ec443dd041089def3aee Mon Sep 17 00:00:00 2001 From: Christopher Faulet <[email protected]> Date: Thu, 1 Aug 2024 16:22:41 +0200 Subject: [PATCH 6/6] BUG/MEDIUM: mux-h2: Propagate term flags to SE on error in h2s_wake_one_stream When a stream is explicitly woken up by the H2 conneciton, if an error condition is detected, the corresponding error flag is set on the SE. So SE_FL_ERROR or SE_FL_ERR_PENDING, depending if the end of stream was reported or not. However, there is no attempt to propagate other termination flags. We must be sure to properly set SE_FL_EOI and SE_FL_EOS when appropriate to be able to switch a pending error to a fatal error. Because of this bug, the SE remains with a pending error and no end of stream, preventing the applicative stream to trully abort it. It means on some abort scenario, it is possible to block a stream infinitely. This patch must be backported at least as far as 2.8. No bug was observed on older versions while the same code is inuse. (cherry picked from commit 184f16ded7a0274bffe99a4795d0a27f8be7c006) Signed-off-by: Christopher Faulet <[email protected]> --- src/mux_h2.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mux_h2.c b/src/mux_h2.c index bd1c58c995..adb0e63be5 100644 --- a/src/mux_h2.c +++ b/src/mux_h2.c @@ -2274,6 +2274,7 @@ static void h2s_wake_one_stream(struct h2s *h2s) (h2s->h2c->st0 >= H2_CS_ERROR || (h2s->h2c->flags & H2_CF_ERROR) || (h2s->h2c->last_sid > 0 && (!h2s->id || h2s->id > h2s->h2c->last_sid)))) { se_fl_set_error(h2s->sd); + h2s_propagate_term_flags(h2c, h2s); if (h2s->st < H2_SS_ERROR) h2s->st = H2_SS_ERROR; -- 2.45.2
0001-BUG-MEDIUM-stconn-Report-error-on-SC-on-send-if-a-pr.patch.2.8
Description: Unix manual page
0002-BUG-MEDIUM-mux-pt-mux-h1-Release-the-pipe-on-connect.patch.2.8
Description: Unix manual page
0003-BUG-MINOR-stconn-Request-to-send-something-to-be-wok.patch.2.8
Description: Unix manual page
0004-BUG-MEDIUM-http-ana-Report-error-on-write-error-wait.patch.2.8
Description: Unix manual page
0005-BUG-MEDIUM-h2-Only-report-early-HTX-EOM-for-tunneled.patch.2.8
Description: Unix manual page
0006-BUG-MEDIUM-mux-h2-Propagate-term-flags-to-SE-on-erro.patch.2.8
Description: Unix manual page

