We can avoid another indirect call per packet by wrapping the rx
handler call with the appropriate indirect-call helper.
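
For context, the INDIRECT_CALL_$N() helpers in
include/linux/indirect_call_wrapper.h compare the function pointer
against the listed candidates and emit a direct call on a match,
falling back to the indirect call otherwise. A rough sketch of what
a two-entry expansion boils down to (paraphrased, not the exact
macro text):

	/* INDIRECT_CALL_2(f, f2, f1, rq, cqe) roughly expands to: */
	if (likely(f == f2))
		f2(rq, cqe);
	else if (likely(f == f1))
		f1(rq, cqe);
	else
		f(rq, cqe);	/* the (possibly retpolined) indirect call */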

To verify that even the last-listed direct call sees a measurable
gain, despite the additional conditionals traversed before reaching
it, I tested with the order of the listed options reversed; the
performance differences were below the noise level.

Together with the previous indirect call patch, this gives a ~6%
performance improvement in raw UDP throughput.

v1 -> v2:
 - update the direct call list and use a macro to define it,
   as per Saeed's suggestion. An additional intermediate macro
   is needed to allow argument list expansion (see the sketch below)
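
For reference, the extra macro level is needed because the
preprocessor splits an invocation into arguments before expanding
them: passed directly, MLX5_RX_INDIRECT_CALL_LIST would bind to a
single parameter of INDIRECT_CALL_4(). Going through a wrapper
expands the list while it is still an argument, so the rescanned
INDIRECT_CALL_4() invocation sees the individual handlers. A minimal
stand-alone sketch of the trick, with hypothetical names (PICK_2,
PICK_LIST, CALLEE_LIST, cb_a, cb_b):

	/* Hypothetical 2-entry variant of the indirect call helper. */
	#define PICK_2(f, f2, f1, ...)				\
		((f) == (f2) ? (f2)(__VA_ARGS__) :		\
		 (f) == (f1) ? (f1)(__VA_ARGS__) :		\
			       (f)(__VA_ARGS__))

	#define CALLEE_LIST	cb_a, cb_b

	/* PICK_2(f, CALLEE_LIST, skb) would bind the whole list to 'f2';
	 * the wrapper below expands the list first, then rescans PICK_2()
	 * with the individual names.
	 */
	#define PICK_LIST(f, list, ...)	PICK_2(f, list, __VA_ARGS__)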

Signed-off-by: Paolo Abeni <pab...@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h    | 4 ++++
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 5 ++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 3a183d690e23..52bcdc87cbe2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -148,6 +148,10 @@ struct page_pool;
 
 #define MLX5E_MSG_LEVEL                        NETIF_MSG_LINK
 
+#define MLX5_RX_INDIRECT_CALL_LIST \
+       mlx5e_handle_rx_cqe_mpwrq, mlx5e_handle_rx_cqe, mlx5i_handle_rx_cqe, \
+       mlx5e_ipsec_handle_rx_cqe
+
 #define mlx5e_dbg(mlevel, priv, format, ...)                    \
 do {                                                            \
        if (NETIF_MSG_##mlevel & (priv)->msglevel)              \
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 0fe5f13d07cc..7faf643eb1b9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1303,6 +1303,8 @@ void mlx5e_handle_rx_cqe_mpwrq(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
        mlx5_wq_ll_pop(wq, cqe->wqe_id, &wqe->next.next_wqe_index);
 }
 
+#define INDIRECT_CALL_LIST(f, list, ...) INDIRECT_CALL_4(f, list, __VA_ARGS__)
+
 int mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget)
 {
        struct mlx5e_rq *rq = container_of(cq, struct mlx5e_rq, cq);
@@ -1333,7 +1335,8 @@ int mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget)
 
                mlx5_cqwq_pop(cqwq);
 
-               rq->handle_rx_cqe(rq, cqe);
+               INDIRECT_CALL_LIST(rq->handle_rx_cqe,
+                                  MLX5_RX_INDIRECT_CALL_LIST, rq, cqe);
        } while ((++work_done < budget) && (cqe = mlx5_cqwq_get_cqe(cqwq)));
 
 out:
-- 
2.20.1
