Split the virtqs virt-queue resource between
the configuration threads.
The pre-created virt-queue resource is also needed
after virtq destruction.
This accelerates the LM process and reduces its time by 30%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
doc/guides/rel_notes/release_22_07.rst | 1
pre-created virt-queue sub-resource in device probe stage
and then modify virtqueue in device config stage.
The steering table also needs to support a dummy virt-queue.
This accelerates the LM process and reduces its time by 40%.
Signed-off-by: Li Zhang
Signed-off-by: Yajun Wu
Acked-by: Matan Azrad
Split the virtqs device close tasks after
stopping virt-queue between the configuration threads.
This accelerates the LM process and
reduces its time by 50%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers/vdpa/mlx5/mlx5_vdpa.c | 56 +--
drivers/vdpa
Split the virtqs LM log between the configuration threads.
This accelerates the LM process and reduces its time by 20%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers/vdpa/mlx5/mlx5_vdpa.h | 3 +
drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 34 +++
drivers/vdpa/mlx5
The virtq object and all its sub-resources use a lot of
FW commands and can be accelerated by the MT management.
Split the virtqs creation between the configuration threads.
This accelerates the LM process and reduces its time by 20%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers
direct MRs in parallel using the MT mechanism.
After completion, the primary thread creates the indirect MR
needed for the following virtqs configurations.
This optimization accelerates the LM process and
reduces its time by 5%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers/vdpa/mlx5
polls its ring and dequeues tasks.
That’s why the ring should be in multi-producer,
single-consumer mode.
An atomic counter manages the task completion notification.
The threads report errors to the caller through
a dedicated error counter per task.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
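The scheme described above (several producers enqueue tasks, one worker polls and dequeues, and atomic counters report completion and errors) can be sketched as follows. This is a minimal illustration in plain C11 atomics, not the driver's code: the ring layout, `struct task`, `ring_enqueue`, `ring_dequeue` and `worker_poll` are all hypothetical names, and a negative `arg` is just a stand-in for a task that fails.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u /* Must be a power of two. */

/* Hypothetical task; every name here is illustrative, not the driver's. */
struct task {
	int arg;                /* A negative value simulates a failing task. */
	atomic_uint *remaining; /* Completion counter shared with the caller. */
	atomic_uint *errors;    /* Error counter shared with the caller. */
};

struct cell {
	atomic_size_t seq; /* Sequence number that hands the cell over. */
	struct task *task;
};

struct ring {
	struct cell cells[RING_SIZE];
	atomic_size_t enq_pos; /* Shared by all producers. */
	size_t deq_pos;        /* Touched by the single consumer only. */
};

static void
ring_init(struct ring *r)
{
	for (size_t i = 0; i < RING_SIZE; i++)
		atomic_init(&r->cells[i].seq, i);
	atomic_init(&r->enq_pos, 0);
	r->deq_pos = 0;
}

/* Multi-producer enqueue: producers claim a slot with a CAS on enq_pos. */
static bool
ring_enqueue(struct ring *r, struct task *t)
{
	size_t pos = atomic_load_explicit(&r->enq_pos, memory_order_relaxed);
	struct cell *c;

	for (;;) {
		c = &r->cells[pos & (RING_SIZE - 1)];
		size_t seq = atomic_load_explicit(&c->seq, memory_order_acquire);
		intptr_t diff = (intptr_t)seq - (intptr_t)pos;

		if (diff == 0) {
			/* On failure the CAS reloads pos and the loop retries. */
			if (atomic_compare_exchange_weak_explicit(&r->enq_pos,
			    &pos, pos + 1, memory_order_relaxed,
			    memory_order_relaxed))
				break;
		} else if (diff < 0) {
			return false; /* Ring is full. */
		} else {
			pos = atomic_load_explicit(&r->enq_pos,
						   memory_order_relaxed);
		}
	}
	c->task = t;
	/* Publish the cell to the consumer. */
	atomic_store_explicit(&c->seq, pos + 1, memory_order_release);
	return true;
}

/* Single-consumer dequeue: no CAS needed, only one thread moves deq_pos. */
static bool
ring_dequeue(struct ring *r, struct task **t)
{
	struct cell *c = &r->cells[r->deq_pos & (RING_SIZE - 1)];
	size_t seq = atomic_load_explicit(&c->seq, memory_order_acquire);

	if (seq != r->deq_pos + 1)
		return false; /* Empty, or the producer has not published yet. */
	*t = c->task;
	/* Hand the cell back to the producers for the next lap. */
	atomic_store_explicit(&c->seq, r->deq_pos + RING_SIZE,
			      memory_order_release);
	r->deq_pos++;
	return true;
}

/* Worker body: drain the ring, then report completion and errors. */
static unsigned int
worker_poll(struct ring *r)
{
	struct task *t;
	unsigned int done = 0;

	while (ring_dequeue(r, &t)) {
		if (t->arg < 0)
			atomic_fetch_add_explicit(t->errors, 1,
						  memory_order_relaxed);
		atomic_fetch_sub_explicit(t->remaining, 1,
					  memory_order_release);
		done++;
	}
	return done;
}
```

A caller would set `remaining` to the number of tasks it enqueues, wait until it reads zero, and then check `errors`. Only the enqueue side needs a compare-and-swap, which is the practical benefit of the multi-producer, single-consumer mode.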
.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
doc/guides/vdpadevs/mlx5.rst | 11 +++
drivers/vdpa/mlx5/meson.build | 1 +
drivers/vdpa/mlx5/mlx5_vdpa.c | 41
drivers/vdpa/mlx5/mlx5_vdpa.h | 36 +++
drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 129
polling and
parallel configurations on the same virtq.
2. A doorbell lock synchronizes updates of the doorbell,
which is shared by all the virtqs in the device.
3. A steering lock protects updates of the shared steering objects.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers/vdpa/mlx5/mlx5_vdpa.c
accelerates the LM process and
reduces its time by 70%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
Reviewed-by: Maxime Coquelin
---
doc/guides/rel_notes/release_22_07.rst | 4 +
drivers/vdpa/mlx5/mlx5_vdpa.h | 4 +
drivers/vdpa/mlx5/mlx5_vdpa_lm.c | 19 +-
drivers/vdpa/mlx5
mode: event_mode/event_qpn_or_msix
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers/common/mlx5/mlx5_devx_cmds.c | 70 +++-
drivers/common/mlx5/mlx5_devx_cmds.h | 6 ++-
drivers/common/mlx5/mlx5_prm.h | 13 +-
3 files changed, 76 insertions(+), 13
From: Yajun Wu
To speed up queue creation, the event QP and CQ are created only once.
Each virtq creation reuses the same event QP and CQ.
Because FW sets the event QP to the error state during virtq destroy,
the event QP must be modified to the RESET state and then to the RTS state as
usual. This can save abo
From: Yajun Wu
Support setting the QP to the RESET state.
Signed-off-by: Yajun Wu
Acked-by: Matan Azrad
Reviewed-by: Maxime Coquelin
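
The reuse path that needs this RESET support (a QP left in the error state is moved to RESET and then brought back up to RTS) can be sketched as a small state machine. The sketch assumes the usual InfiniBand-style ladder RESET, INIT, RTR, RTS for the "as usual" part; the enum, `qp_modify` and `qp_reuse` are hypothetical names for illustration and are not the mlx5 DevX commands.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative QP states; hypothetical names, not the mlx5 PRM opcodes. */
enum qp_state { QP_RESET, QP_INIT, QP_RTR, QP_RTS, QP_ERR };

struct qp {
	enum qp_state state;
};

/* Apply one transition; reject moves that skip a step of the ladder. */
static bool
qp_modify(struct qp *qp, enum qp_state next)
{
	bool ok = false;

	switch (next) {
	case QP_RESET: /* Any state, including QP_ERR, may go to RESET. */
	case QP_ERR:   /* Any state may fall into the error state. */
		ok = true;
		break;
	case QP_INIT:
		ok = qp->state == QP_RESET;
		break;
	case QP_RTR:
		ok = qp->state == QP_INIT;
		break;
	case QP_RTS:
		ok = qp->state == QP_RTR;
		break;
	}
	if (ok)
		qp->state = next;
	return ok;
}

/* Bring a QP of any state (typically QP_ERR) back to RTS for reuse. */
static bool
qp_reuse(struct qp *qp)
{
	static const enum qp_state path[] = {
		QP_RESET, QP_INIT, QP_RTR, QP_RTS,
	};

	for (size_t i = 0; i < sizeof(path) / sizeof(path[0]); i++)
		if (!qp_modify(qp, path[i]))
			return false;
	return true;
}
```

Without the RESET step, a QP in the error state has no valid way back to RTS in this model, which is why the extra transition is needed before the QP can be reused.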
---
drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++
drivers/common/mlx5/mlx5_prm.h | 17 +
2 files changed, 24 insertions(+)
diff --git a/drivers
From: Yajun Wu
The motivation of this change is to reduce vDPA device queue creation
time by creating some queue resources in the vDPA device probe stage.
In the VM live migration scenario, this saves 0.8ms for each queue
creation and thus reduces LM network downtime.
To create queue resource(umem/coun
The driver wrongly takes the capability value for
the number of virtq pairs instead of just the number of virtqs.
Adjust all the usages of it to be the number of virtqs.
Fixes: c2eb33a ("vdpa/mlx5: manage virtqs by array")
Cc: sta...@dpdk.org
Signed-off-by: Li Zhang
Acked-by: M
.
V4:
* Fix coding style issue
Li Zhang (12):
vdpa/mlx5: fix usage of capability for max number of virtqs
common/mlx5: extend virtq modifiable fields
vdpa/mlx5: pre-create virtq at probe time
vdpa/mlx5: optimize datapath-control synchronization
vdpa/mlx5: add multi-thread management for
Split the virtqs virt-queue resource between
the configuration threads.
The pre-created virt-queue resource is also needed
after virtq destruction.
This accelerates the LM process and reduces its time by 30%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
doc/guides/rel_notes/release_22_07.rst | 1
pre-created virt-queue sub-resource in device probe stage
and then modify virtqueue in device config stage.
The steering table also needs to support a dummy virt-queue.
This accelerates the LM process and reduces its time by 40%.
Signed-off-by: Li Zhang
Signed-off-by: Yajun Wu
Acked-by: Matan Azrad
Split the virtqs device close tasks after
stopping virt-queue between the configuration threads.
This accelerates the LM process and
reduces its time by 50%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers/vdpa/mlx5/mlx5_vdpa.c | 56 +--
drivers/vdpa
Split the virtqs LM log between the configuration threads.
This accelerates the LM process and reduces its time by 20%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers/vdpa/mlx5/mlx5_vdpa.h | 3 +
drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 34 +++
drivers/vdpa/mlx5
The virtq object and all its sub-resources use a lot of
FW commands and can be accelerated by the MT management.
Split the virtqs creation between the configuration threads.
This accelerates the LM process and reduces its time by 20%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers
direct MRs in parallel using the MT mechanism.
After completion, the primary thread creates the indirect MR
needed for the following virtqs configurations.
This optimization accelerates the LM process and
reduces its time by 5%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers/vdpa/mlx5
polling and
parallel configurations on the same virtq.
2. A doorbell lock synchronizes updates of the doorbell,
which is shared by all the virtqs in the device.
3. A steering lock protects updates of the shared steering objects.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers/vdpa/mlx5/mlx5_vdpa.c
.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
doc/guides/vdpadevs/mlx5.rst | 11 +++
drivers/vdpa/mlx5/meson.build | 1 +
drivers/vdpa/mlx5/mlx5_vdpa.c | 41
drivers/vdpa/mlx5/mlx5_vdpa.h | 36 +++
drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 129
polls its ring and dequeues tasks.
That’s why the ring should be in multi-producer,
single-consumer mode.
An atomic counter manages the task completion notification.
The threads report errors to the caller through
a dedicated error counter per task.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
accelerates the LM process and
reduces its time by 70%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
Reviewed-by: Maxime Coquelin
---
doc/guides/rel_notes/release_22_07.rst | 4 +
drivers/vdpa/mlx5/mlx5_vdpa.h | 4 +
drivers/vdpa/mlx5/mlx5_vdpa_lm.c | 19 +-
drivers/vdpa/mlx5
mode: event_mode/event_qpn_or_msix
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers/common/mlx5/mlx5_devx_cmds.c | 70 +++-
drivers/common/mlx5/mlx5_devx_cmds.h | 6 ++-
drivers/common/mlx5/mlx5_prm.h | 13 +-
3 files changed, 76 insertions(+), 13
From: Yajun Wu
Support setting the QP to the RESET state.
Signed-off-by: Yajun Wu
Acked-by: Matan Azrad
Reviewed-by: Maxime Coquelin
---
drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++
drivers/common/mlx5/mlx5_prm.h | 17 +
2 files changed, 24 insertions(+)
diff --git a/drivers
From: Yajun Wu
To speed up queue creation, the event QP and CQ are created only once.
Each virtq creation reuses the same event QP and CQ.
Because FW sets the event QP to the error state during virtq destroy,
the event QP must be modified to the RESET state and then to the RTS state as
usual. This can save abo
From: Yajun Wu
The motivation of this change is to reduce vDPA device queue creation
time by creating some queue resources in the vDPA device probe stage.
In the VM live migration scenario, this saves 0.8ms for each queue
creation and thus reduces LM network downtime.
To create queue resource(umem/coun
The driver wrongly takes the capability value for
the number of virtq pairs instead of just the number of virtqs.
Adjust all the usages of it to be the number of virtqs.
Fixes: c2eb33a ("vdpa/mlx5: manage virtqs by array")
Cc: sta...@dpdk.org
Signed-off-by: Li Zhang
Acked-by: M
1868
RFC ("Add vDPA multi-threads optiomization")
https://patchwork.dpdk.org/project/dpdk/cover/20220408075606.33056-1-l...@nvidia.com/
V2:
* Drop eal device removal patch in series.
* Add release note in release_22_07.rst.
V3:
* Fix comments about commit log issue.
* Avoid cutting log
Thanks for your comment; I will fix it in V3.
Regards,
Li Zhang
> -Original Message-
> From: Maxime Coquelin
> Sent: Friday, June 17, 2022 11:37 PM
> To: Li Zhang ; Ori Kam ; Slava
> Ovsiienko ; Matan Azrad ;
> Shahaf Shuler
> Cc: dev@dpdk.org; NBU-Contact-Thom
Thanks for your comments; I will fix them in V3.
Regards,
Li Zhang
> -Original Message-
> From: Maxime Coquelin
> Sent: Friday, June 17, 2022 11:54 PM
> To: Li Zhang ; Ori Kam ; Slava
> Ovsiienko ; Matan Azrad ;
> Shahaf Shuler
> Cc: dev@dpdk.org; NBU-Contact-Thom
Hi Maxime,
Do you have any comments on the patch?
Please let me know, and thanks for helping to review it.
Regards,
Li Zhang
> -Original Message-
> From: Maxime Coquelin
> Sent: Thursday, June 16, 2022 5:02 PM
> To: Li Zhang ; Ori Kam ; Slava
> Ovsiienko ; Matan Azrad ;
> S
Split the virtqs device close tasks after
stopping virt-queue between the configuration threads.
This accelerates the LM process and
reduces its time by 50%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers/vdpa/mlx5/mlx5_vdpa.c | 56 +--
drivers/vdpa
polls its ring and dequeues tasks.
That’s why the ring should be in multi-producer,
single-consumer mode.
An atomic counter manages the task completion notification.
The threads report errors to the caller through
a dedicated error counter per task.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
Split the virtqs virt-queue resource between
the configuration threads.
The pre-created virt-queue resource is also needed
after virtq destruction.
This accelerates the LM process and reduces its time by 30%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
doc/guides/rel_notes/release_22_07.rst | 1
The driver wrongly takes the capability value for
the number of virtq pairs instead of just the number of virtqs.
Adjust all the usages of it to be the number of virtqs.
Fixes: c2eb33a ("vdpa/mlx5: manage virtqs by array")
Cc: sta...@dpdk.org
Signed-off-by: Li Zhang
Acked-by: M
mode: event_mode/event_qpn_or_msix
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers/common/mlx5/mlx5_devx_cmds.c | 70 +++-
drivers/common/mlx5/mlx5_devx_cmds.h | 6 ++-
drivers/common/mlx5/mlx5_prm.h | 13 +-
3 files changed, 76 insertions(+), 13
pre-created virt-queue sub-resource in device probe stage
and then modify virtqueue in device config stage.
The steering table also needs to support a dummy virt-queue.
This accelerates the LM process and reduces its time by 40%.
Signed-off-by: Li Zhang
Signed-off-by: Yajun Wu
Acked-by: Matan Azrad
From: Yajun Wu
To speed up queue creation, the event QP and CQ are created only once.
Each virtq creation reuses the same event QP and CQ.
Because FW sets the event QP to the error state during virtq destroy,
the event QP must be modified to the RESET state and then to the RTS state as
usual. This can save abo
The virtq object and all its sub-resources use a lot of
FW commands and can be accelerated by the MT management.
Split the virtqs creation between the configuration threads.
This accelerates the LM process and reduces its time by 20%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers
Split the virtqs LM log between the configuration threads.
This accelerates the LM process and reduces its time by 20%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers/vdpa/mlx5/mlx5_vdpa.h | 3 +
drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 34 +++
drivers/vdpa/mlx5
.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
doc/guides/vdpadevs/mlx5.rst | 11 +++
drivers/vdpa/mlx5/meson.build | 1 +
drivers/vdpa/mlx5/mlx5_vdpa.c | 41
drivers/vdpa/mlx5/mlx5_vdpa.h | 36 +++
drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 129
polling and
parallel configurations on the same virtq.
2. A doorbell lock synchronizes updates of the doorbell,
which is shared by all the virtqs in the device.
3. A steering lock protects updates of the shared steering objects.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers/vdpa/mlx5/mlx5_vdpa.c
From: Yajun Wu
Support setting the QP to the RESET state.
Signed-off-by: Yajun Wu
Acked-by: Matan Azrad
---
drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++
drivers/common/mlx5/mlx5_prm.h | 17 +
2 files changed, 24 insertions(+)
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c
direct MRs in parallel using the MT mechanism.
After completion, the primary thread creates the indirect MR
needed for the following virtqs configurations.
This optimization accelerates the LM process and
reduces its time by 5%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
drivers/vdpa/mlx5
1868
RFC ("Add vDPA multi-threads optiomization")
https://patchwork.dpdk.org/project/dpdk/cover/20220408075606.33056-1-l...@nvidia.com/
V2:
* Drop eal device removal patch in series.
* Add release note in release_22_07.rst.
Li Zhang (12):
vdpa/mlx5: fix usage of capability for max numbe
accelerates the LM process and
reduces its time by 70%.
Signed-off-by: Li Zhang
Acked-by: Matan Azrad
---
doc/guides/rel_notes/release_22_07.rst | 4 +
drivers/vdpa/mlx5/mlx5_vdpa.h | 4 +
drivers/vdpa/mlx5/mlx5_vdpa_lm.c | 13 +-
drivers/vdpa/mlx5/mlx5_vdpa_virtq.c| 257
From: Yajun Wu
The motivation of this change is to reduce vDPA device queue creation
time by creating some queue resources in the vDPA device probe stage.
In the VM live migration scenario, this saves 0.8ms for each queue
creation and thus reduces LM network downtime.
To create queue resource(umem/counte
Split the virtqs virt-queue resource between
the configuration threads.
The pre-created virt-queue resource is also needed
after virtq destruction.
This accelerates the LM process and reduces its time by 30%.
Signed-off-by: Li Zhang
---
drivers/vdpa/mlx5/mlx5_vdpa.c | 115
pre-created virt-queue sub-resource in device probe stage
and then modify virtqueue in device config stage.
The steering table also needs to support a dummy virt-queue.
This accelerates the LM process and reduces its time by 40%.
Signed-off-by: Li Zhang
Signed-off-by: Yajun Wu
---
drivers/vdpa/mlx5
Split the virtqs device close tasks after
stopping virt-queue between the configuration threads.
This accelerates the LM process and
reduces its time by 50%.
Signed-off-by: Li Zhang
---
drivers/vdpa/mlx5/mlx5_vdpa.c | 56 +--
drivers/vdpa/mlx5/mlx5_vdpa.h
Split the virtqs LM log between the configuration threads.
This accelerates the LM process and reduces its time by 20%.
Signed-off-by: Li Zhang
---
drivers/vdpa/mlx5/mlx5_vdpa.h | 3 +
drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 34 +++
drivers/vdpa/mlx5/mlx5_vdpa_lm.c | 85
The virtq object and all its sub-resources use a lot of
FW commands and can be accelerated by the MT management.
Split the virtqs creation between the configuration threads.
This accelerates the LM process and reduces its time by 20%.
Signed-off-by: Li Zhang
---
drivers/vdpa/mlx5/mlx5_vdpa.h
direct MRs in parallel using the MT mechanism.
After completion, the primary thread creates the indirect MR
needed for the following virtqs configurations.
This optimization accelerates the LM process and
reduces its time by 5%.
Signed-off-by: Li Zhang
---
drivers/vdpa/mlx5/mlx5_vdpa.c
polls its ring and dequeues tasks.
That’s why the ring should be in multi-producer,
single-consumer mode.
An atomic counter manages the task completion notification.
The threads report errors to the caller through
a dedicated error counter per task.
Signed-off-by: Li Zhang
---
drivers/vdpa/mlx5
.
Signed-off-by: Li Zhang
---
doc/guides/vdpadevs/mlx5.rst | 11 +++
drivers/vdpa/mlx5/meson.build | 1 +
drivers/vdpa/mlx5/mlx5_vdpa.c | 41
drivers/vdpa/mlx5/mlx5_vdpa.h | 36 +++
drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 129
polling and
parallel configurations on the same virtq.
2. A doorbell lock synchronizes updates of the doorbell,
which is shared by all the virtqs in the device.
3. A steering lock protects updates of the shared steering objects.
Signed-off-by: Li Zhang
---
drivers/vdpa/mlx5/mlx5_vdpa.c | 24 ---
drivers
accelerates the LM process and
reduces its time by 70%.
Signed-off-by: Li Zhang
---
drivers/vdpa/mlx5/mlx5_vdpa.h | 4 +
drivers/vdpa/mlx5/mlx5_vdpa_lm.c| 13 +-
drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 257 +---
3 files changed, 170 insertions(+), 104 deletions
mode: event_mode/event_qpn_or_msix
Signed-off-by: Li Zhang
---
drivers/common/mlx5/mlx5_devx_cmds.c | 70 +++-
drivers/common/mlx5/mlx5_devx_cmds.h | 6 ++-
drivers/common/mlx5/mlx5_prm.h | 13 +-
3 files changed, 76 insertions(+), 13 deletions(-)
diff --git a
From: Yajun Wu
To speed up queue creation, the event QP and CQ are created only once.
Each virtq creation reuses the same event QP and CQ.
Because FW sets the event QP to the error state during virtq destroy,
the event QP must be modified to the RESET state and then to the RTS state as
usual. This can save abo
From: Yajun Wu
Support setting the QP to the RESET state.
Signed-off-by: Yajun Wu
---
drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++
drivers/common/mlx5/mlx5_prm.h | 17 +
2 files changed, 24 insertions(+)
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c
b/drivers/common/mlx5/
From: Yajun Wu
The motivation of this change is to reduce vDPA device queue creation
time by creating some queue resources in the vDPA device probe stage.
In the VM live migration scenario, this saves 0.8ms for each queue
creation and thus reduces LM network downtime.
To create queue resource(umem/counte
From: Yajun Wu
Move rte_eal_cleanup to the function vdpa_sample_quit, which
handles every exit path of the example app.
Otherwise rte_eal_cleanup won't be called on receiving a signal
such as SIGINT (Ctrl+C).
Fixes: 10aa3757 ("examples: add eal cleanup to examples")
Cc: sta...@dpdk.org
Signed-off-by: Yajun Wu
--
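
The idea in the patch above (route every exit, including SIGINT, through one quit routine that performs the cleanup) can be sketched in plain C. This is only an illustration under stated assumptions: `on_signal`, `install_handlers`, `main_loop_step` and `app_quit` are hypothetical names, the counter stands in for the real teardown, and the handler here merely sets a flag instead of calling the quit routine directly.

```c
#include <signal.h>
#include <stdbool.h>

/* Set from the handler; sig_atomic_t is safe to write in that context. */
static volatile sig_atomic_t quit_requested;

/* Counts cleanup runs; stands in for the real teardown side effects. */
static int cleanup_calls;

static void
on_signal(int signum)
{
	(void)signum;
	quit_requested = 1;
}

static void
install_handlers(void)
{
	signal(SIGINT, on_signal);
	signal(SIGTERM, on_signal);
}

/* One iteration of the application loop; false means "time to quit". */
static bool
main_loop_step(void)
{
	return !quit_requested;
}

/*
 * Hypothetical common quit routine. In a real application this is where
 * device teardown and rte_eal_cleanup() would be called, so that both a
 * normal exit and a signal end up on the same path.
 */
static void
app_quit(void)
{
	cleanup_calls++;
}
```

A `main` built on this would call `install_handlers()`, loop while `main_loop_step()` returns true, and then call `app_quit()` exactly once, whether the loop ended because of user input or because of a signal.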
From: Yajun Wu
Add device removal to the function rte_eal_cleanup. This is the last chance
for device removal to be called, as a safety net. Loop over the vdev bus first
and then over all buses, calling rte_dev_remove for every device.
Cc: sta...@dpdk.org
Signed-off-by: Yajun Wu
---
lib/eal/freebsd/eal.c | 33 +
The driver wrongly takes the capability value for
the number of virtq pairs instead of just the number of virtqs.
Adjust all the usages of it to be the number of virtqs.
Fixes: c2eb33a ("vdpa/mlx5: manage virtqs by array")
Cc: sta...@dpdk.org
Signed-off-by: Li Zhang
---
drivers
1868
RFC ("Add vDPA multi-threads optiomization")
https://patchwork.dpdk.org/project/dpdk/cover/20220408075606.33056-1-l...@nvidia.com/
Li Zhang (12):
vdpa/mlx5: fix usage of capability for max number of virtqs
common/mlx5: extend virtq modifiable fields
vdpa/mlx5: pre-create virt
Split the virtqs virt-queue resource between
the configuration threads.
The pre-created virt-queue resource is also needed
after virtq destruction.
This accelerates the LM process and reduces its time by 30%.
Signed-off-by: Li Zhang
---
drivers/vdpa/mlx5/mlx5_vdpa.c | 115
pre-created virt-queue sub-resource in device probe stage
and then modify virtqueue in device config stage.
The steering table also needs to support a dummy virt-queue.
This accelerates the LM process and reduces its time by 40%.
Signed-off-by: Li Zhang
Signed-off-by: Yajun Wu
---
drivers/vdpa/mlx5
Split the virtqs device close tasks after
stopping virt-queue between the configuration threads.
This accelerates the LM process and
reduces its time by 50%.
Signed-off-by: Li Zhang
---
drivers/vdpa/mlx5/mlx5_vdpa.c | 56 +--
drivers/vdpa/mlx5/mlx5_vdpa.h
Split the virtqs LM log between the configuration threads.
This accelerates the LM process and reduces its time by 20%.
Signed-off-by: Li Zhang
---
drivers/vdpa/mlx5/mlx5_vdpa.h | 3 +
drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 34 +++
drivers/vdpa/mlx5/mlx5_vdpa_lm.c | 85
The virtq object and all its sub-resources use a lot of
FW commands and can be accelerated by the MT management.
Split the virtqs creation between the configuration threads.
This accelerates the LM process and reduces its time by 20%.
Signed-off-by: Li Zhang
---
drivers/vdpa/mlx5/mlx5_vdpa.h
direct MRs in parallel using the MT mechanism.
After completion, the primary thread creates the indirect MR
needed for the following virtqs configurations.
This optimization accelerates the LM process and
reduces its time by 5%.
Signed-off-by: Li Zhang
---
drivers/vdpa/mlx5/mlx5_vdpa.c
polls its ring and dequeues tasks.
That’s why the ring should be in multi-producer,
single-consumer mode.
An atomic counter manages the task completion notification.
The threads report errors to the caller through
a dedicated error counter per task.
Signed-off-by: Li Zhang
---
drivers/vdpa/mlx5
.
Signed-off-by: Li Zhang
---
doc/guides/vdpadevs/mlx5.rst | 11 +++
drivers/vdpa/mlx5/meson.build | 1 +
drivers/vdpa/mlx5/mlx5_vdpa.c | 41
drivers/vdpa/mlx5/mlx5_vdpa.h | 36 +++
drivers/vdpa/mlx5/mlx5_vdpa_cthread.c | 129
polling and
parallel configurations on the same virtq.
2. A doorbell lock synchronizes updates of the doorbell,
which is shared by all the virtqs in the device.
3. A steering lock protects updates of the shared steering objects.
Signed-off-by: Li Zhang
---
drivers/vdpa/mlx5/mlx5_vdpa.c | 24 ---
drivers
accelerates the LM process and
reduces its time by 70%.
Signed-off-by: Li Zhang
---
drivers/vdpa/mlx5/mlx5_vdpa.h | 4 +
drivers/vdpa/mlx5/mlx5_vdpa_lm.c| 13 +-
drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 257 +---
3 files changed, 170 insertions(+), 104 deletions
mode: event_mode/event_qpn_or_msix
Signed-off-by: Li Zhang
---
drivers/common/mlx5/mlx5_devx_cmds.c | 70 +++-
drivers/common/mlx5/mlx5_devx_cmds.h | 6 ++-
drivers/common/mlx5/mlx5_prm.h | 13 +-
3 files changed, 76 insertions(+), 13 deletions(-)
diff --git a
From: Yajun Wu
To speed up queue creation, the event QP and CQ are created only once.
Each virtq creation reuses the same event QP and CQ.
Because FW sets the event QP to the error state during virtq destroy,
the event QP must be modified to the RESET state and then to the RTS state as
usual. This can save abo
From: Yajun Wu
Support setting the QP to the RESET state.
Signed-off-by: Yajun Wu
---
drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++
drivers/common/mlx5/mlx5_prm.h | 17 +
2 files changed, 24 insertions(+)
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c
b/drivers/common/mlx5/
From: Yajun Wu
The motivation of this change is to reduce vDPA device queue creation
time by creating some queue resources in the vDPA device probe stage.
In the VM live migration scenario, this saves 0.8ms for each queue
creation and thus reduces LM network downtime.
To create queue resource(umem/counte
From: Yajun Wu
Move rte_eal_cleanup to the function vdpa_sample_quit, which
handles every exit path of the example app.
Otherwise rte_eal_cleanup won't be called on receiving a signal
such as SIGINT (Ctrl+C).
Fixes: 10aa3757 ("examples: add eal cleanup to examples")
Cc: sta...@dpdk.org
Signed-off-by: Yajun Wu
--
From: Yajun Wu
Call rte_dev_remove when the vDPA example application exits. Otherwise
rte_dev_remove never gets called.
Fixes: edbed86d1cc ("examples/vdpa: introduce a new sample for vDPA")
Cc: sta...@dpdk.org
Signed-off-by: Yajun Wu
---
examples/vdpa/main.c | 4
1 file changed, 4 inserti
From: Yajun Wu
Add device removal to the function rte_eal_cleanup. This is the last chance
for device removal to be called, as a safety net. Loop over the vdev bus first
and then over all buses, calling rte_dev_remove for every device.
Cc: sta...@dpdk.org
Signed-off-by: Yajun Wu
---
lib/eal/freebsd/eal.c | 33 +