On Fri, 26 Jun 2026 at 12:34, Anatoly Burakov <[email protected]> wrote:
>
> When rte_mp_request_async() fails to send requests to all peers,
> copy and param can lose ownership and leak.
>
> However, we cannot simply free them unconditionally, as "partial failure"
> means some requests were already queued and thus still reference `copy` and
> `param`, so freeing them directly on the error path can cause
> use-after-free when those requests are later handled by the async timeout.
>
> Fix this by rolling back queued requests from the current batch, and reset
> nb_sent to 0. Freeing the requests is now safe even if some requests were
> sent, as any responses or timeouts will not find the request ID in the
> queue and will safely exit without doing anything.
>
> Coverity issue: 501503
>
> Fixes: f05e26051c15 ("eal: add IPC asynchronous request")
> Cc: [email protected]
>
> Signed-off-by: Anatoly Burakov <[email protected]>
> ---
>  lib/eal/common/eal_common_proc.c | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
>
> diff --git a/lib/eal/common/eal_common_proc.c 
> b/lib/eal/common/eal_common_proc.c
> index 991bf215a3..0cffc7a127 100644
> --- a/lib/eal/common/eal_common_proc.c
> +++ b/lib/eal/common/eal_common_proc.c
> @@ -1245,6 +1245,32 @@ rte_mp_request_async(struct rte_mp_msg *req, const 
> struct timespec *ts,
>                 } else if (mp_request_async(path, copy, param, ts))
>                         ret = -1;
>         }
> +
> +       /*
> +        * On partial failure, roll back all queued requests. We hold the lock
> +        * so no one else touches the queue. All requests in this batch share
> +        * the same param pointer. Stale alarms will fire and harmlessly find
> +        * nothing via ID-based lookup.
> +        */
> +       if (ret != 0 && reply->nb_sent > 0) {
> +               struct pending_request *r, *next;
> +
> +               for (r = TAILQ_FIRST(&pending_requests.requests);
> +                               r != NULL; r = next) {
> +                       next = TAILQ_NEXT(r, next);
> +                       if (r->type == REQUEST_TYPE_ASYNC &&
> +                                       r->async.param == param) {
> +                               TAILQ_REMOVE(&pending_requests.requests,
> +                                               r, next);
> +                               free(r->reply);
> +                               /* r->request == copy, freed below after the 
> loop */
> +                               free(r);
> +                       }
> +               }
> +               /* requests on the queue were removed so keep things 
> consistent */
> +               reply->nb_sent = 0;
> +       }
> +

Please, don't reimplement the safe macro.

I plan to update this with:

@@ -1252,15 +1252,11 @@ rte_mp_request_async(struct rte_mp_msg *req,
const struct timespec *ts,
         * nothing via ID-based lookup.
         */
        if (ret != 0 && reply->nb_sent > 0) {
-               struct pending_request *r, *next;
-
-               for (r = TAILQ_FIRST(&pending_requests.requests);
-                               r != NULL; r = next) {
-                       next = TAILQ_NEXT(r, next);
-                       if (r->type == REQUEST_TYPE_ASYNC &&
-                                       r->async.param == param) {
-                               TAILQ_REMOVE(&pending_requests.requests,
-                                               r, next);
+               struct pending_request *r, *tmp;
+
+               RTE_TAILQ_FOREACH_SAFE(r, &pending_requests.requests,
next, tmp) {
+                       if (r->type == REQUEST_TYPE_ASYNC &&
r->async.param == param) {
+
TAILQ_REMOVE(&pending_requests.requests, r, next);
                                free(r->reply);
                                /* r->request == copy, freed below
after the loop */
                                free(r);

Objection?
If not, I'll update while applying.


-- 
David Marchand

Reply via email to