'tx_db_nc' is used to differentiate two mapping types, WC and non-WC, both of which are actually non-cacheable.
Write-Combining on x86 is non-cacheable. Normal-NC, its counterpart on aarch64, is non-cacheable too, as its name suggests: the cache hierarchy is bypassed for accesses to memory of either type. Interpreting 'nc' as 'non-cacheable' therefore makes no distinction and is confusing. Reinterpreting it as 'non-combining' makes the distinction clear.

Also explain why the aarch64 default setting for this is different.

Signed-off-by: Gavin Hu <gavin...@arm.com>
---
 doc/guides/nics/mlx5.rst | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index afd11cd83..addec9f12 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -610,9 +610,9 @@ Run-time configuration
   The rdma core library can map doorbell register in two ways, depending on the
   environment variable "MLX5_SHUT_UP_BF":
 
-  - As regular cached memory (usually with write combining attribute), if the
+  - As regular memory (usually with write combining attribute), if the
     variable is either missing or set to zero.
-  - As non-cached memory, if the variable is present and set to not "0" value.
+  - As non-combining memory, if the variable is present and set to not "0" value.
 
   The type of mapping may slightly affect the Tx performance, the optimal choice
   is strongly relied on the host architecture and should be deduced practically.
@@ -638,6 +638,8 @@ Run-time configuration
   If ``tx_db_nc`` is omitted or set to zero, the preset (if any) environment
   variable "MLX5_SHUT_UP_BF" value is used. If there is no "MLX5_SHUT_UP_BF",
   the default ``tx_db_nc`` value is zero for ARM64 hosts and one for others.
+  ARM64 is different because it has a gathering buffer, the enforced barrier
+  can evict the doorbell ring immediately.
 
 - ``tx_vec_en`` parameter [int]
 
-- 
2.17.1