Hello community experts,

I am testing the performance of a passthrough GPU by measuring
device-to-host and host-to-device memory copy bandwidth. The GPU under
test is an NVIDIA T4. The benchmark I am using is the one from
https://developer.nvidia.com/blog/how-optimize-data-transfers-cuda-cc/.
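For clarity on what is being compared: as far as I understand, the
benchmark reports bandwidth as bytes transferred divided by elapsed time.
Below is a minimal sketch of that arithmetic, plus a helper to express the
gap between two runs as a percentage. The function names, the 16 MiB
transfer size and the timings are assumptions for illustration only; they
are not my measured results.

```python
def bandwidth_gb_per_s(num_bytes: int, elapsed_ms: float) -> float:
    """Return transfer bandwidth in GB/s (1 GB = 1e9 bytes)."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return (num_bytes / 1e9) / (elapsed_ms / 1e3)


def degradation_percent(baseline: float, measured: float) -> float:
    """Return how far `measured` falls below `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - measured) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical values, NOT real measurements from my machines.
    transfer_bytes = 16 * 1024 * 1024   # assumed transfer size (16 MiB)
    bare_metal_ms = 1.5                 # assumed copy time on bare metal
    vm_ms = 3.0                         # assumed copy time in the VM

    bare_metal_bw = bandwidth_gb_per_s(transfer_bytes, bare_metal_ms)
    vm_bw = bandwidth_gb_per_s(transfer_bytes, vm_ms)

    print(f"bare metal: {bare_metal_bw:.2f} GB/s")
    print(f"VM:         {vm_bw:.2f} GB/s")
    print(f"degradation: {degradation_percent(bare_metal_bw, vm_bw):.1f}%")
```

Applying the same calculation to the host-to-device and device-to-host
numbers separately should show whether one direction is affected more than
the other.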

On the bare-metal machine, the result is:
[image: image.png]

In the virtual machine, the result is:

[image: image.png]

My question is: what could be causing the degradation, and is there
anything I can do to improve it? Thank you very much for your help.
--

Best Regards,

Jiatong Shen
