On Fri, 2020-06-26 at 12:23 -0300, Leonardo Bras wrote:
> On Wed, 2020-06-24 at 03:24 -0300, Leonardo Bras wrote:
> > As of today, if a DDW is created and can't map the whole partition, it's
> > removed and the default DMA window "ibm,dma-window" is used instead.
> >
> > Usually this DDW is bigger than the default DMA window, so it would be
> > better to make use of it instead.
> >
> > Signed-off-by: Leonardo Bras <leobra...@gmail.com>
> > ---
>
> I tested this change with a 256GB DDW which did not map the whole
> partition, with a MT27700 Family [ConnectX-4 Virtual Function].
>
> I noticed the performance improvement is about the same as using DDW
> with IOMMU bypass.
>
> 64 thread write throughput: +203.0%
> 64 thread read throughput: +17.5%
> 1 thread write throughput: +20.5%
> 1 thread read throughput: +3.43%
> Average write latency: -23.0%
> Average read latency: -2.26%

The above improvements are relative to the default DMA window, which is
currently used if the DDW can't map the whole partition. Those values are
the average of 20 tests for each environment, 30 seconds per test.

I also did some intense testing, 5 hours each:
- 64 thread write throughput
- 64 thread read throughput

The throughput values were stable for the whole test, and I noticed no
errors in dmesg / journalctl.