[PATCH v4 15/15] vdpa/mlx5: prepare virtqueue resource creation

2022-06-18 Thread Li Zhang
Split the virtqs virt-queue resource between the configuration threads. Also need pre-created virt-queue resource after virtq destruction. This accelerates the LM process and reduces its time by 30%. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- doc/guides/rel_notes/release_22_07.rst | 1

[PATCH v4 14/15] vdpa/mlx5: add virtq sub-resources creation

2022-06-18 Thread Li Zhang
pre-created virt-queue sub-resource in device probe stage and then modify virtqueue in device config stage. The steer table also needs to support a dummy virt-queue. This accelerates the LM process and reduces its time by 40%. Signed-off-by: Li Zhang Signed-off-by: Yajun Wu Acked-by: Matan Azrad

[PATCH v4 13/15] vdpa/mlx5: add device close task

2022-06-18 Thread Li Zhang
Split the virtqs device close tasks after stopping virt-queue between the configuration threads. This accelerates the LM process and reduces its time by 50%. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers/vdpa/mlx5/mlx5_vdpa.c | 56 +-- drivers/vdpa

[PATCH v4 12/15] vdpa/mlx5: add virtq LM log task

2022-06-18 Thread Li Zhang
Split the virtqs LM log between the configuration threads. This accelerates the LM process and reduces its time by 20%. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers/vdpa/mlx5/mlx5_vdpa.h | 3 + drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 34 +++ drivers/vdpa/mlx5

[PATCH v4 11/15] vdpa/mlx5: add virtq creation task for MT management

2022-06-18 Thread Li Zhang
The virtq object and all its sub-resources use a lot of FW commands and can be accelerated by the MT management. Split the virtqs creation between the configuration threads. This accelerates the LM process and reduces its time by 20%. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers

[PATCH v4 10/15] vdpa/mlx5: add MT task for VM memory registration

2022-06-18 Thread Li Zhang
direct MRs in parallel using the MT mechanism. After completion, the primary thread creates the indirect MR needed for the following virtqs configurations. This optimization accelerates the LM process and reduces its time by 5%. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers/vdpa/mlx5

[PATCH v4 09/15] vdpa/mlx5: add task ring for MT management

2022-06-18 Thread Li Zhang
polls its ring and dequeues tasks. That’s why the ring should be in multi-producer, single-consumer mode. An atomic counter manages the task completion notification. The threads report errors to the caller through a dedicated error counter per task. Signed-off-by: Li Zhang Acked-by: Matan Azrad
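The completion and error reporting described in this snippet can be sketched roughly as below. This is only an illustration under stated assumptions, not the driver's code: `struct task`, `task_run`, `batch_done` and `demo_batch` are hypothetical names, the ring itself is omitted, the tasks run inline instead of on worker threads, and one error counter is shared by the whole batch for brevity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical task descriptor; field names are illustrative only. */
struct task {
    int (*fn)(void *arg);    /* work item, returns 0 on success */
    void *arg;
    atomic_uint *remaining;  /* completion counter shared with the caller */
    atomic_uint *errors;     /* error counter the caller inspects afterwards */
};

/* What a worker would do for each task it dequeues from its ring. */
static void task_run(struct task *t)
{
    if (t->fn(t->arg) != 0)
        atomic_fetch_add_explicit(t->errors, 1, memory_order_relaxed);
    /* Release ordering publishes the task's side effects to the caller. */
    atomic_fetch_sub_explicit(t->remaining, 1, memory_order_release);
}

/* Caller side: nonzero once every task in the batch has completed. */
static int batch_done(atomic_uint *remaining)
{
    return atomic_load_explicit(remaining, memory_order_acquire) == 0;
}

static int ok_task(void *arg) { (void)arg; return 0; }
static int bad_task(void *arg) { (void)arg; return -1; }

/* Runs three tasks inline (one fails) and returns the error count. */
static unsigned int demo_batch(void)
{
    atomic_uint remaining;
    atomic_uint errors;
    atomic_init(&remaining, 3);
    atomic_init(&errors, 0);

    struct task tasks[3] = {
        { ok_task, NULL, &remaining, &errors },
        { bad_task, NULL, &remaining, &errors },
        { ok_task, NULL, &remaining, &errors },
    };
    for (int i = 0; i < 3; i++)
        task_run(&tasks[i]);

    assert(batch_done(&remaining));
    return atomic_load(&errors);
}
```

In a real multi-threaded setup the caller would poll `batch_done` while the workers drain their rings, then read the error counter once the batch has finished.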

[PATCH v4 08/15] vdpa/mlx5: add multi-thread management for configuration

2022-06-18 Thread Li Zhang
. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- doc/guides/vdpadevs/mlx5.rst | 11 +++ drivers/vdpa/mlx5/meson.build | 1 + drivers/vdpa/mlx5/mlx5_vdpa.c | 41 drivers/vdpa/mlx5/mlx5_vdpa.h | 36 +++ drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 129

[PATCH v4 07/15] vdpa/mlx5: optimize datapath-control synchronization

2022-06-18 Thread Li Zhang
polling and parallel configurations on the same virtq. 2. A doorbell lock synchronizes updates of the doorbell, which is shared by all the virtqs in the device. 3. A steering lock for updates to the shared steering objects. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers/vdpa/mlx5/mlx5_vdpa.c

[PATCH v4 06/15] vdpa/mlx5: pre-create virtq at probe time

2022-06-18 Thread Li Zhang
accelerates the LM process and reduces its time by 70%. Signed-off-by: Li Zhang Acked-by: Matan Azrad Reviewed-by: Maxime Coquelin --- doc/guides/rel_notes/release_22_07.rst | 4 + drivers/vdpa/mlx5/mlx5_vdpa.h | 4 + drivers/vdpa/mlx5/mlx5_vdpa_lm.c | 19 +- drivers/vdpa/mlx5

[PATCH v4 05/15] common/mlx5: extend virtq modifiable fields

2022-06-18 Thread Li Zhang
mode: event_mode/event_qpn_or_msix Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers/common/mlx5/mlx5_devx_cmds.c | 70 +++- drivers/common/mlx5/mlx5_devx_cmds.h | 6 ++- drivers/common/mlx5/mlx5_prm.h | 13 +- 3 files changed, 76 insertions(+), 13

[PATCH v4 04/15] vdpa/mlx5: support event qp reuse

2022-06-18 Thread Li Zhang
From: Yajun Wu To speed up queue creation, the event QP and CQ are created only once. Each virtq creation reuses the same event QP and CQ. Because FW sets the event QP to the error state during virtq destroy, the event QP must be modified to the RESET state and then to the RTS state as usual. This can save abo

[PATCH v4 03/15] common/mlx5: add DevX API to move QP to reset state

2022-06-18 Thread Li Zhang
From: Yajun Wu Support set QP to RESET state. Signed-off-by: Yajun Wu Acked-by: Matan Azrad Reviewed-by: Maxime Coquelin --- drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++ drivers/common/mlx5/mlx5_prm.h | 17 + 2 files changed, 24 insertions(+) diff --git a/drivers

[PATCH v4 02/15] vdpa/mlx5: support pre create virtq resource

2022-06-18 Thread Li Zhang
From: Yajun Wu The motivation of this change is to reduce vDPA device queue creation time by creating some queue resources in the vDPA device probe stage. In the VM live migration scenario, this can save 0.8 ms per queue creation and thus reduce LM network downtime. To create queue resource(umem/coun

[PATCH v4 01/15] vdpa/mlx5: fix usage of capability for max number of virtqs

2022-06-18 Thread Li Zhang
The driver wrongly takes the capability value for the number of virtq pairs instead of just the number of virtqs. Adjust all the usages of it to be the number of virtqs. Fixes: c2eb33a ("vdpa/mlx5: manage virtqs by array") Cc: sta...@dpdk.org Signed-off-by: Li Zhang Acked-by: M

[PATCH v4 00/15] mlx5/vdpa: optimize live migration time

2022-06-18 Thread Li Zhang
. V4: * Fix coding style issue Li Zhang (12): vdpa/mlx5: fix usage of capability for max number of virtqs common/mlx5: extend virtq modifiable fields vdpa/mlx5: pre-create virtq at probe time vdpa/mlx5: optimize datapath-control synchronization vdpa/mlx5: add multi-thread management for

[PATCH v3 15/15] vdpa/mlx5: prepare virtqueue resource creation

2022-06-18 Thread Li Zhang
Split the virtqs virt-queue resource between the configuration threads. Also need pre-created virt-queue resource after virtq destruction. This accelerates the LM process and reduces its time by 30%. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- doc/guides/rel_notes/release_22_07.rst | 1

[PATCH v3 14/15] vdpa/mlx5: add virtq sub-resources creation

2022-06-18 Thread Li Zhang
pre-created virt-queue sub-resource in device probe stage and then modify virtqueue in device config stage. The steer table also needs to support a dummy virt-queue. This accelerates the LM process and reduces its time by 40%. Signed-off-by: Li Zhang Signed-off-by: Yajun Wu Acked-by: Matan Azrad

[PATCH v3 13/15] vdpa/mlx5: add device close task

2022-06-18 Thread Li Zhang
Split the virtqs device close tasks after stopping virt-queue between the configuration threads. This accelerates the LM process and reduces its time by 50%. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers/vdpa/mlx5/mlx5_vdpa.c | 56 +-- drivers/vdpa

[PATCH v3 12/15] vdpa/mlx5: add virtq LM log task

2022-06-18 Thread Li Zhang
Split the virtqs LM log between the configuration threads. This accelerates the LM process and reduces its time by 20%. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers/vdpa/mlx5/mlx5_vdpa.h | 3 + drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 34 +++ drivers/vdpa/mlx5

[PATCH v3 11/15] vdpa/mlx5: add virtq creation task for MT management

2022-06-18 Thread Li Zhang
The virtq object and all its sub-resources use a lot of FW commands and can be accelerated by the MT management. Split the virtqs creation between the configuration threads. This accelerates the LM process and reduces its time by 20%. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers

[PATCH v3 10/15] vdpa/mlx5: add MT task for VM memory registration

2022-06-18 Thread Li Zhang
direct MRs in parallel using the MT mechanism. After completion, the primary thread creates the indirect MR needed for the following virtqs configurations. This optimization accelerates the LM process and reduces its time by 5%. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers/vdpa/mlx5

[PATCH v3 07/15] vdpa/mlx5: optimize datapath-control synchronization

2022-06-18 Thread Li Zhang
polling and parallel configurations on the same virtq. 2. A doorbell lock synchronizes updates of the doorbell, which is shared by all the virtqs in the device. 3. A steering lock for updates to the shared steering objects. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers/vdpa/mlx5/mlx5_vdpa.c

[PATCH v3 08/15] vdpa/mlx5: add multi-thread management for configuration

2022-06-18 Thread Li Zhang
. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- doc/guides/vdpadevs/mlx5.rst | 11 +++ drivers/vdpa/mlx5/meson.build | 1 + drivers/vdpa/mlx5/mlx5_vdpa.c | 41 drivers/vdpa/mlx5/mlx5_vdpa.h | 36 +++ drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 129

[PATCH v3 09/15] vdpa/mlx5: add task ring for MT management

2022-06-18 Thread Li Zhang
polls its ring and dequeues tasks. That’s why the ring should be in multi-producer, single-consumer mode. An atomic counter manages the task completion notification. The threads report errors to the caller through a dedicated error counter per task. Signed-off-by: Li Zhang Acked-by: Matan Azrad

[PATCH v3 06/15] vdpa/mlx5: pre-create virtq at probe time

2022-06-18 Thread Li Zhang
accelerates the LM process and reduces its time by 70%. Signed-off-by: Li Zhang Acked-by: Matan Azrad Reviewed-by: Maxime Coquelin --- doc/guides/rel_notes/release_22_07.rst | 4 + drivers/vdpa/mlx5/mlx5_vdpa.h | 4 + drivers/vdpa/mlx5/mlx5_vdpa_lm.c | 19 +- drivers/vdpa/mlx5

[PATCH v3 05/15] common/mlx5: extend virtq modifiable fields

2022-06-18 Thread Li Zhang
mode: event_mode/event_qpn_or_msix Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers/common/mlx5/mlx5_devx_cmds.c | 70 +++- drivers/common/mlx5/mlx5_devx_cmds.h | 6 ++- drivers/common/mlx5/mlx5_prm.h | 13 +- 3 files changed, 76 insertions(+), 13

[PATCH v3 03/15] common/mlx5: add DevX API to move QP to reset state

2022-06-18 Thread Li Zhang
From: Yajun Wu Support set QP to RESET state. Signed-off-by: Yajun Wu Acked-by: Matan Azrad Reviewed-by: Maxime Coquelin --- drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++ drivers/common/mlx5/mlx5_prm.h | 17 + 2 files changed, 24 insertions(+) diff --git a/drivers

[PATCH v3 04/15] vdpa/mlx5: support event qp reuse

2022-06-18 Thread Li Zhang
From: Yajun Wu To speed up queue creation, the event QP and CQ are created only once. Each virtq creation reuses the same event QP and CQ. Because FW sets the event QP to the error state during virtq destroy, the event QP must be modified to the RESET state and then to the RTS state as usual. This can save abo

[PATCH v3 02/15] vdpa/mlx5: support pre create virtq resource

2022-06-18 Thread Li Zhang
From: Yajun Wu The motivation of this change is to reduce vDPA device queue creation time by creating some queue resources in the vDPA device probe stage. In the VM live migration scenario, this can save 0.8 ms per queue creation and thus reduce LM network downtime. To create queue resource(umem/coun

[PATCH v3 01/15] vdpa/mlx5: fix usage of capability for max number of virtqs

2022-06-18 Thread Li Zhang
The driver wrongly takes the capability value for the number of virtq pairs instead of just the number of virtqs. Adjust all the usages of it to be the number of virtqs. Fixes: c2eb33a ("vdpa/mlx5: manage virtqs by array") Cc: sta...@dpdk.org Signed-off-by: Li Zhang Acked-by: M

[PATCH v3 00/15] mlx5/vdpa: optimize live migration time

2022-06-18 Thread Li Zhang
1868 RFC ("Add vDPA multi-threads optiomization") https://patchwork.dpdk.org/project/dpdk/cover/20220408075606.33056-1-l...@nvidia.com/ V2: * Drop eal device removal patch in series. * Add release note in release_22_07.rst. V3: * Fix comments about commit log issue. * Avoid cutting log

RE: [PATCH v2 02/15] vdpa/mlx5: support pre create virtq resource

2022-06-18 Thread Li Zhang
Thanks for your comment and will fix it on V3. Regards, Li Zhang > -Original Message- > From: Maxime Coquelin > Sent: Friday, June 17, 2022 11:37 PM > To: Li Zhang ; Ori Kam ; Slava > Ovsiienko ; Matan Azrad ; > Shahaf Shuler > Cc: dev@dpdk.org; NBU-Contact-Thom

RE: [PATCH v2 06/15] vdpa/mlx5: pre-create virtq in the prob

2022-06-18 Thread Li Zhang
Thanks for your comments and will fix it on V3. Regards, Li Zhang > -Original Message- > From: Maxime Coquelin > Sent: Friday, June 17, 2022 11:54 PM > To: Li Zhang ; Ori Kam ; Slava > Ovsiienko ; Matan Azrad ; > Shahaf Shuler > Cc: dev@dpdk.org; NBU-Contact-Thom

RE: [PATCH v2 00/15] mlx5/vdpa: optimize live migration time

2022-06-16 Thread Li Zhang
Hi Maxime, Are there any comments about the patch? Please let me know, and thanks for helping to review it. Regards, Li Zhang > -Original Message- > From: Maxime Coquelin > Sent: Thursday, June 16, 2022 5:02 PM > To: Li Zhang ; Ori Kam ; Slava > Ovsiienko ; Matan Azrad ; > S

[PATCH v2 13/15] vdpa/mlx5: add device close task

2022-06-15 Thread Li Zhang
Split the virtqs device close tasks after stopping virt-queue between the configuration threads. This accelerates the LM process and reduces its time by 50%. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers/vdpa/mlx5/mlx5_vdpa.c | 56 +-- drivers/vdpa

[PATCH v2 09/15] vdpa/mlx5: add task ring for MT management

2022-06-15 Thread Li Zhang
polls its ring and dequeues tasks. That’s why the ring should be in multi-producer, single-consumer mode. An atomic counter manages the task completion notification. The threads report errors to the caller through a dedicated error counter per task. Signed-off-by: Li Zhang Acked-by: Matan Azrad

[PATCH v2 15/15] vdpa/mlx5: prepare virtqueue resource creation

2022-06-15 Thread Li Zhang
Split the virtqs virt-queue resource between the configuration threads. Also need pre-created virt-queue resource after virtq destruction. This accelerates the LM process and reduces its time by 30%. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- doc/guides/rel_notes/release_22_07.rst | 1

[PATCH v2 01/15] vdpa/mlx5: fix usage of capability for max number of virtqs

2022-06-15 Thread Li Zhang
The driver wrongly takes the capability value for the number of virtq pairs instead of just the number of virtqs. Adjust all the usages of it to be the number of virtqs. Fixes: c2eb33a ("vdpa/mlx5: manage virtqs by array") Cc: sta...@dpdk.org Signed-off-by: Li Zhang Acked-by: M

[PATCH v2 05/15] common/mlx5: extend virtq modifiable fields

2022-06-15 Thread Li Zhang
mode: event_mode/event_qpn_or_msix Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers/common/mlx5/mlx5_devx_cmds.c | 70 +++- drivers/common/mlx5/mlx5_devx_cmds.h | 6 ++- drivers/common/mlx5/mlx5_prm.h | 13 +- 3 files changed, 76 insertions(+), 13

[PATCH v2 14/15] vdpa/mlx5: add virtq sub-resources creation

2022-06-15 Thread Li Zhang
pre-created virt-queue sub-resource in device probe stage and then modify virtqueue in device config stage. The steer table also needs to support a dummy virt-queue. This accelerates the LM process and reduces its time by 40%. Signed-off-by: Li Zhang Signed-off-by: Yajun Wu Acked-by: Matan Azrad

[PATCH v2 04/15] vdpa/mlx5: support event qp reuse

2022-06-15 Thread Li Zhang
From: Yajun Wu To speed up queue creation, the event QP and CQ are created only once. Each virtq creation reuses the same event QP and CQ. Because FW sets the event QP to the error state during virtq destroy, the event QP must be modified to the RESET state and then to the RTS state as usual. This can save abo

[PATCH v2 11/15] vdpa/mlx5: add virtq creation task for MT management

2022-06-15 Thread Li Zhang
The virtq object and all its sub-resources use a lot of FW commands and can be accelerated by the MT management. Split the virtqs creation between the configuration threads. This accelerates the LM process and reduces its time by 20%. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers

[PATCH v2 12/15] vdpa/mlx5: add virtq LM log task

2022-06-15 Thread Li Zhang
Split the virtqs LM log between the configuration threads. This accelerates the LM process and reduces its time by 20%. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers/vdpa/mlx5/mlx5_vdpa.h | 3 + drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 34 +++ drivers/vdpa/mlx5

[PATCH v2 08/15] vdpa/mlx5: add multi-thread management for configuration

2022-06-15 Thread Li Zhang
. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- doc/guides/vdpadevs/mlx5.rst | 11 +++ drivers/vdpa/mlx5/meson.build | 1 + drivers/vdpa/mlx5/mlx5_vdpa.c | 41 drivers/vdpa/mlx5/mlx5_vdpa.h | 36 +++ drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 129

[PATCH v2 07/15] vdpa/mlx5: optimize datapath-control synchronization

2022-06-15 Thread Li Zhang
polling and parallel configurations on the same virtq. 2. A doorbell lock synchronizes updates of the doorbell, which is shared by all the virtqs in the device. 3. A steering lock for updates to the shared steering objects. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers/vdpa/mlx5/mlx5_vdpa.c

[PATCH v2 03/15] common/mlx5: add DevX API to move QP to reset state

2022-06-15 Thread Li Zhang
From: Yajun Wu Support set QP to RESET state. Signed-off-by: Yajun Wu Acked-by: Matan Azrad --- drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++ drivers/common/mlx5/mlx5_prm.h | 17 + 2 files changed, 24 insertions(+) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c

[PATCH v2 10/15] vdpa/mlx5: add MT task for VM memory registration

2022-06-15 Thread Li Zhang
direct MRs in parallel using the MT mechanism. After completion, the primary thread creates the indirect MR needed for the following virtqs configurations. This optimization accelerates the LM process and reduces its time by 5%. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- drivers/vdpa/mlx5

[PATCH v2 00/15] mlx5/vdpa: optimize live migration time

2022-06-15 Thread Li Zhang
1868 RFC ("Add vDPA multi-threads optiomization") https://patchwork.dpdk.org/project/dpdk/cover/20220408075606.33056-1-l...@nvidia.com/ V2: * Drop eal device removal patch in series. * Add release note in release_22_07.rst. Li Zhang (12): vdpa/mlx5: fix usage of capability for max numbe

[PATCH v2 06/15] vdpa/mlx5: pre-create virtq in the prob

2022-06-15 Thread Li Zhang
accelerates the LM process and reduces its time by 70%. Signed-off-by: Li Zhang Acked-by: Matan Azrad --- doc/guides/rel_notes/release_22_07.rst | 4 + drivers/vdpa/mlx5/mlx5_vdpa.h | 4 + drivers/vdpa/mlx5/mlx5_vdpa_lm.c | 13 +- drivers/vdpa/mlx5/mlx5_vdpa_virtq.c| 257

[PATCH v2 02/15] vdpa/mlx5: support pre create virtq resource

2022-06-15 Thread Li Zhang
From: Yajun Wu The motivation of this change is to reduce vDPA device queue creation time by creating some queue resources in the vDPA device probe stage. In the VM live migration scenario, this can save 0.8 ms per queue creation and thus reduce LM network downtime. To create queue resource(umem/counte

[PATCH v1 17/17] vdpa/mlx5: prepare virtqueue resource creation

2022-06-06 Thread Li Zhang
Split the virtqs virt-queue resource between the configuration threads. Also need pre-created virt-queue resource after virtq destruction. This accelerates the LM process and reduces its time by 30%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.c | 115

[PATCH v1 16/17] vdpa/mlx5: add virtq sub-resources creation

2022-06-06 Thread Li Zhang
pre-created virt-queue sub-resource in device probe stage and then modify virtqueue in device config stage. The steer table also needs to support a dummy virt-queue. This accelerates the LM process and reduces its time by 40%. Signed-off-by: Li Zhang Signed-off-by: Yajun Wu --- drivers/vdpa/mlx5

[PATCH v1 15/17] vdpa/mlx5: add device close task

2022-06-06 Thread Li Zhang
Split the virtqs device close tasks after stopping virt-queue between the configuration threads. This accelerates the LM process and reduces its time by 50%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.c | 56 +-- drivers/vdpa/mlx5/mlx5_vdpa.h

[PATCH v1 14/17] vdpa/mlx5: add virtq LM log task

2022-06-06 Thread Li Zhang
Split the virtqs LM log between the configuration threads. This accelerates the LM process and reduces its time by 20%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.h | 3 + drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 34 +++ drivers/vdpa/mlx5/mlx5_vdpa_lm.c | 85

[PATCH v1 13/17] vdpa/mlx5: add virtq creation task for MT management

2022-06-06 Thread Li Zhang
The virtq object and all its sub-resources use a lot of FW commands and can be accelerated by the MT management. Split the virtqs creation between the configuration threads. This accelerates the LM process and reduces its time by 20%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.h

[PATCH v1 12/17] vdpa/mlx5: add MT task for VM memory registration

2022-06-06 Thread Li Zhang
direct MRs in parallel using the MT mechanism. After completion, the primary thread creates the indirect MR needed for the following virtqs configurations. This optimization accelerates the LM process and reduces its time by 5%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.c

[PATCH v1 11/17] vdpa/mlx5: add task ring for MT management

2022-06-06 Thread Li Zhang
polls its ring and dequeues tasks. That’s why the ring should be in multi-producer, single-consumer mode. An atomic counter manages the task completion notification. The threads report errors to the caller through a dedicated error counter per task. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5

[PATCH v1 10/17] vdpa/mlx5: add multi-thread management for configuration

2022-06-06 Thread Li Zhang
. Signed-off-by: Li Zhang --- doc/guides/vdpadevs/mlx5.rst | 11 +++ drivers/vdpa/mlx5/meson.build | 1 + drivers/vdpa/mlx5/mlx5_vdpa.c | 41 drivers/vdpa/mlx5/mlx5_vdpa.h | 36 +++ drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 129

[PATCH v1 09/17] vdpa/mlx5: optimize datapath-control synchronization

2022-06-06 Thread Li Zhang
polling and parallel configurations on the same virtq. 2. A doorbell lock synchronizes updates of the doorbell, which is shared by all the virtqs in the device. 3. A steering lock for updates to the shared steering objects. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.c | 24 --- drivers

[PATCH v1 08/17] vdpa/mlx5: pre-create virtq in the prob

2022-06-06 Thread Li Zhang
accelerates the LM process and reduces its time by 70%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.h | 4 + drivers/vdpa/mlx5/mlx5_vdpa_lm.c| 13 +- drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 257 +--- 3 files changed, 170 insertions(+), 104 deletions

[PATCH v1 07/17] common/mlx5: extend virtq modifiable fields

2022-06-06 Thread Li Zhang
mode: event_mode/event_qpn_or_msix Signed-off-by: Li Zhang --- drivers/common/mlx5/mlx5_devx_cmds.c | 70 +++- drivers/common/mlx5/mlx5_devx_cmds.h | 6 ++- drivers/common/mlx5/mlx5_prm.h | 13 +- 3 files changed, 76 insertions(+), 13 deletions(-) diff --git a

[PATCH v1 06/17] vdpa/mlx5: support event qp reuse

2022-06-06 Thread Li Zhang
From: Yajun Wu To speed up queue creation, the event QP and CQ are created only once. Each virtq creation reuses the same event QP and CQ. Because FW sets the event QP to the error state during virtq destroy, the event QP must be modified to the RESET state and then to the RTS state as usual. This can save abo

[PATCH v1 05/17] common/mlx5: add DevX API to move QP to reset state

2022-06-06 Thread Li Zhang
From: Yajun Wu Support set QP to RESET state. Signed-off-by: Yajun Wu --- drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++ drivers/common/mlx5/mlx5_prm.h | 17 + 2 files changed, 24 insertions(+) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/

[PATCH v1 04/17] vdpa/mlx5: support pre create virtq resource

2022-06-06 Thread Li Zhang
From: Yajun Wu The motivation of this change is to reduce vDPA device queue creation time by creating some queue resources in the vDPA device probe stage. In the VM live migration scenario, this can save 0.8 ms per queue creation and thus reduce LM network downtime. To create queue resource(umem/counte

[PATCH v1 03/17] examples/vdpa: fix devices cleanup

2022-06-06 Thread Li Zhang
From: Yajun Wu Move rte_eal_cleanup to the function vdpa_sample_quit, which handles all example app quit paths. Otherwise rte_eal_cleanup won't be called on receiving a signal such as SIGINT (Ctrl+C). Fixes: 10aa3757 ("examples: add eal cleanup to examples") Cc: sta...@dpdk.org Signed-off-by: Yajun Wu --

[PATCH v1 02/17] eal: add device removal in rte cleanup

2022-06-06 Thread Li Zhang
From: Yajun Wu Add device removal in function rte_eal_cleanup. This is the last chance device remove get called for sanity. Loop vdev bus first and then all bus for all device, calling rte_dev_remove. Cc: sta...@dpdk.org Signed-off-by: Yajun Wu --- lib/eal/freebsd/eal.c | 33 +

[PATCH v1 01/17] vdpa/mlx5: fix usage of capability for max number of virtqs

2022-06-06 Thread Li Zhang
The driver wrongly takes the capability value for the number of virtq pairs instead of just the number of virtqs. Adjust all the usages of it to be the number of virtqs. Fixes: c2eb33a ("vdpa/mlx5: manage virtqs by array") Cc: sta...@dpdk.org Signed-off-by: Li Zhang --- drivers

[PATCH v1 00/17] Add vDPA multi-threads optiomization

2022-06-06 Thread Li Zhang
1868 RFC ("Add vDPA multi-threads optiomization") https://patchwork.dpdk.org/project/dpdk/cover/20220408075606.33056-1-l...@nvidia.com/ Li Zhang (12): vdpa/mlx5: fix usage of capability for max number of virtqs common/mlx5: extend virtq modifiable fields vdpa/mlx5: pre-create virt

[PATCH v1 17/17] vdpa/mlx5: prepare virtqueue resource creation

2022-06-06 Thread Li Zhang
Split the virtqs virt-queue resource between the configuration threads. Also need pre-created virt-queue resource after virtq destruction. This accelerates the LM process and reduces its time by 30%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.c | 115

[PATCH v1 16/17] vdpa/mlx5: add virtq sub-resources creation

2022-06-06 Thread Li Zhang
pre-created virt-queue sub-resource in device probe stage and then modify virtqueue in device config stage. The steer table also needs to support a dummy virt-queue. This accelerates the LM process and reduces its time by 40%. Signed-off-by: Li Zhang Signed-off-by: Yajun Wu --- drivers/vdpa/mlx5

[PATCH 16/16] vdpa/mlx5: prepare virtqueue resource creation

2022-06-06 Thread Li Zhang
Split the virtqs virt-queue resource between the configuration threads. Also need pre-created virt-queue resource after virtq destruction. This accelerates the LM process and reduces its time by 30%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.c | 115

[PATCH 15/16] vdpa/mlx5: add virtq sub-resources creation

2022-06-06 Thread Li Zhang
pre-created virt-queue sub-resource in device probe stage and then modify virtqueue in device config stage. The steer table also needs to support a dummy virt-queue. This accelerates the LM process and reduces its time by 40%. Signed-off-by: Li Zhang Signed-off-by: Yajun Wu --- drivers/vdpa/mlx5

[PATCH v1 15/17] vdpa/mlx5: add device close task

2022-06-06 Thread Li Zhang
Split the virtqs device close tasks after stopping virt-queue between the configuration threads. This accelerates the LM process and reduces its time by 50%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.c | 56 +-- drivers/vdpa/mlx5/mlx5_vdpa.h

[PATCH v1 14/17] vdpa/mlx5: add virtq LM log task

2022-06-06 Thread Li Zhang
Split the virtqs LM log between the configuration threads. This accelerates the LM process and reduces its time by 20%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.h | 3 + drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 34 +++ drivers/vdpa/mlx5/mlx5_vdpa_lm.c | 85

[PATCH v1 13/17] vdpa/mlx5: add virtq creation task for MT management

2022-06-06 Thread Li Zhang
The virtq object and all its sub-resources use a lot of FW commands and can be accelerated by the MT management. Split the virtqs creation between the configuration threads. This accelerates the LM process and reduces its time by 20%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.h

[PATCH 14/16] vdpa/mlx5: add device close task

2022-06-06 Thread Li Zhang
Split the virtqs device close tasks after stopping virt-queue between the configuration threads. This accelerates the LM process and reduces its time by 50%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.c | 56 +-- drivers/vdpa/mlx5/mlx5_vdpa.h

[PATCH v1 12/17] vdpa/mlx5: add MT task for VM memory registration

2022-06-06 Thread Li Zhang
direct MRs in parallel using the MT mechanism. After completion, the primary thread creates the indirect MR needed for the following virtqs configurations. This optimization accelerates the LM process and reduces its time by 5%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.c

[PATCH 13/16] vdpa/mlx5: add virtq LM log task

2022-06-06 Thread Li Zhang
Split the virtqs LM log between the configuration threads. This accelerates the LM process and reduces its time by 20%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.h | 3 + drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 34 +++ drivers/vdpa/mlx5/mlx5_vdpa_lm.c | 85

[PATCH 12/16] vdpa/mlx5: add virtq creation task for MT management

2022-06-06 Thread Li Zhang
The virtq object and all its sub-resources use a lot of FW commands and can be accelerated by the MT management. Split the virtqs creation between the configuration threads. This accelerates the LM process and reduces its time by 20%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.h

[PATCH v1 11/17] vdpa/mlx5: add task ring for MT management

2022-06-06 Thread Li Zhang
polls its ring and dequeues tasks. That’s why the ring should be in multi-producer, single-consumer mode. An atomic counter manages the task completion notification. The threads report errors to the caller through a dedicated error counter per task. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5

[PATCH 11/16] vdpa/mlx5: add MT task for VM memory registration

2022-06-06 Thread Li Zhang
direct MRs in parallel using the MT mechanism. After completion, the primary thread creates the indirect MR needed for the following virtqs configurations. This optimization accelerates the LM process and reduces its time by 5%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.c

[PATCH v1 10/17] vdpa/mlx5: add multi-thread management for configuration

2022-06-06 Thread Li Zhang
. Signed-off-by: Li Zhang --- doc/guides/vdpadevs/mlx5.rst | 11 +++ drivers/vdpa/mlx5/meson.build | 1 + drivers/vdpa/mlx5/mlx5_vdpa.c | 41 drivers/vdpa/mlx5/mlx5_vdpa.h | 36 +++ drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 129

[PATCH 10/16] vdpa/mlx5: add task ring for MT management

2022-06-06 Thread Li Zhang
polls its ring and dequeues tasks. That’s why the ring should be in multi-producer, single-consumer mode. An atomic counter manages the task completion notification. The threads report errors to the caller through a dedicated error counter per task. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5

[PATCH v1 09/17] vdpa/mlx5: optimize datapath-control synchronization

2022-06-06 Thread Li Zhang
polling and parallel configurations on the same virtq. 2. A doorbell lock synchronizes updates of the doorbell, which is shared by all the virtqs in the device. 3. A steering lock for updates to the shared steering objects. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.c | 24 --- drivers

[PATCH 08/16] vdpa/mlx5: optimize datapath-control synchronization

2022-06-06 Thread Li Zhang
polling and parallel configurations on the same virtq. 2. A doorbell lock synchronizes updates of the doorbell, which is shared by all the virtqs in the device. 3. A steering lock for updates to the shared steering objects. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.c | 24 --- drivers

[PATCH 09/16] vdpa/mlx5: add multi-thread management for configuration

2022-06-06 Thread Li Zhang
. Signed-off-by: Li Zhang --- doc/guides/vdpadevs/mlx5.rst | 11 +++ drivers/vdpa/mlx5/meson.build | 1 + drivers/vdpa/mlx5/mlx5_vdpa.c | 41 drivers/vdpa/mlx5/mlx5_vdpa.h | 36 +++ drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 129

[PATCH v1 08/17] vdpa/mlx5: pre-create virtq in the prob

2022-06-06 Thread Li Zhang
accelerates the LM process and reduces its time by 70%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.h | 4 + drivers/vdpa/mlx5/mlx5_vdpa_lm.c| 13 +- drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 257 +--- 3 files changed, 170 insertions(+), 104 deletions

[PATCH v1 07/17] common/mlx5: extend virtq modifiable fields

2022-06-06 Thread Li Zhang
mode: event_mode/event_qpn_or_msix Signed-off-by: Li Zhang --- drivers/common/mlx5/mlx5_devx_cmds.c | 70 +++- drivers/common/mlx5/mlx5_devx_cmds.h | 6 ++- drivers/common/mlx5/mlx5_prm.h | 13 +- 3 files changed, 76 insertions(+), 13 deletions(-) diff --git a

[PATCH 07/16] vdpa/mlx5: pre-create virtq in the prob

2022-06-06 Thread Li Zhang
accelerates the LM process and reduces its time by 70%. Signed-off-by: Li Zhang --- drivers/vdpa/mlx5/mlx5_vdpa.h | 4 + drivers/vdpa/mlx5/mlx5_vdpa_lm.c| 13 +- drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 257 +--- 3 files changed, 170 insertions(+), 104 deletions

[PATCH v1 06/17] vdpa/mlx5: support event qp reuse

2022-06-06 Thread Li Zhang
From: Yajun Wu To speed up queue creation, the event qp and cq are created only once. Each virtq creation reuses the same event qp and cq. Because FW sets the event qp to the error state during virtq destroy, the event qp must be modified to the RESET state and then to the RTS state as usual. This can save abo

[PATCH 06/16] common/mlx5: extend virtq modifiable fields

2022-06-06 Thread Li Zhang
mode: event_mode/event_qpn_or_msix Signed-off-by: Li Zhang --- drivers/common/mlx5/mlx5_devx_cmds.c | 70 +++- drivers/common/mlx5/mlx5_devx_cmds.h | 6 ++- drivers/common/mlx5/mlx5_prm.h | 13 +- 3 files changed, 76 insertions(+), 13 deletions(-) diff --git a

[PATCH 05/16] vdpa/mlx5: support event qp reuse

2022-06-06 Thread Li Zhang
From: Yajun Wu To speed up queue creation, the event qp and cq are created only once. Each virtq creation reuses the same event qp and cq. Because FW sets the event qp to the error state during virtq destroy, the event qp must be modified to the RESET state and then to the RTS state as usual. This can save abo

[PATCH v1 05/17] common/mlx5: add DevX API to move QP to reset state

2022-06-06 Thread Li Zhang
From: Yajun Wu Support set QP to RESET state. Signed-off-by: Yajun Wu --- drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++ drivers/common/mlx5/mlx5_prm.h | 17 + 2 files changed, 24 insertions(+) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/

[PATCH 03/16] vdpa/mlx5: support pre create virtq resource

2022-06-06 Thread Li Zhang
From: Yajun Wu The motivation of this change is to reduce vDPA device queue creation time by creating some queue resources in the vDPA device probe stage. In the VM live migration scenario, this can save 0.8ms for each queue creation, thus reducing LM network downtime. To create queue resource(umem/counte

[PATCH v1 04/17] vdpa/mlx5: support pre create virtq resource

2022-06-06 Thread Li Zhang
From: Yajun Wu The motivation of this change is to reduce vDPA device queue creation time by creating some queue resources in the vDPA device probe stage. In the VM live migration scenario, this can save 0.8ms for each queue creation, thus reducing LM network downtime. To create queue resource(umem/counte

[PATCH 04/16] common/mlx5: add DevX API to move QP to reset state

2022-06-06 Thread Li Zhang
From: Yajun Wu Support set QP to RESET state. Signed-off-by: Yajun Wu --- drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++ drivers/common/mlx5/mlx5_prm.h | 17 + 2 files changed, 24 insertions(+) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/

[PATCH v1 03/17] examples/vdpa: fix devices cleanup

2022-06-06 Thread Li Zhang
From: Yajun Wu Move rte_eal_cleanup to function vdpa_sample_quit, which handles every exit path of the example app. Otherwise rte_eal_cleanup won't be called on receiving a signal like SIGINT (control + c). Fixes: 10aa3757 ("examples: add eal cleanup to examples") Cc: sta...@dpdk.org Signed-off-by: Yajun Wu --

[PATCH 02/16] examples/vdpa: fix vDPA device remove

2022-06-06 Thread Li Zhang
From: Yajun Wu Add a call to rte_dev_remove on vDPA example application exit. Otherwise rte_dev_remove never gets called. Fixes: edbed86d1cc ("examples/vdpa: introduce a new sample for vDPA") Cc: sta...@dpdk.org Signed-off-by: Yajun Wu --- examples/vdpa/main.c | 4 1 file changed, 4 inserti

[PATCH v1 02/17] eal: add device removal in rte cleanup

2022-06-06 Thread Li Zhang
From: Yajun Wu Add device removal in function rte_eal_cleanup. This is the last chance for device removal to be called, as a sanity measure. Loop over the vdev bus first and then over all buses for all devices, calling rte_dev_remove. Cc: sta...@dpdk.org Signed-off-by: Yajun Wu --- lib/eal/freebsd/eal.c | 33 +
