On Wed, Jan 11, 2017 at 03:51:22PM +0100, Thomas Monjalon wrote:
> 2017-01-11 12:27, Yuanhan Liu:
> > The fact that virtio net header is initiated to zero in PMD driver
> > init stage means that these costly writes are unnecessary and could
> > be avoided:
> >
> >     if (hdr->csum_start != 0)
> >         hdr->csum_start = 0;
> >
> > And that's what the macro ASSIGN_UNLESS_EQUAL does. With this, the
> > performance drop introduced by TSO enabling is recovered: it could
> > be up to 20% in micro benchmarking.
>
> This patch is adding a condition to assignments.
> We need a benchmark on other architectures like ARM. Please anyone?

I think the cost of the condition should be far lower than the penalty
introduced by the cache issue, so I don't expect it to perform badly
on other platforms. But, of course, testing is always welcome!

	--yliu

>
> > [...]
> > +/* avoid write operation when necessary, to lessen cache issues */
> > +#define ASSIGN_UNLESS_EQUAL(var, val) do {	\
> > +	if ((var) != (val))			\
> > +		(var) = (val);			\
> > +} while (0)
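
For illustration, here is a minimal sketch of how the macro quoted above
could be applied when clearing the offload fields of a header for a packet
that requests no offload. The struct demo_net_hdr and the function
demo_clear_offload are hypothetical names made up for this example, not the
actual driver code; only the field names follow the standard virtio net
header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same macro as in the quoted patch: skip the store if the value matches. */
#define ASSIGN_UNLESS_EQUAL(var, val) do {	\
	if ((var) != (val))			\
		(var) = (val);			\
} while (0)

/* Hypothetical stand-in for the virtio net header (illustration only). */
struct demo_net_hdr {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/*
 * Hypothetical helper: reset the offload fields for a packet that needs
 * no checksum or segmentation offload. If the header is already zero
 * (as after driver init), only loads and compares are executed, so the
 * cache line holding the header is not dirtied.
 */
void demo_clear_offload(struct demo_net_hdr *hdr)
{
	assert(hdr != NULL);

	ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
	ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
	ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
	ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
	ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
	ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0);
}
```

The end state is the same as with plain assignments; the only difference
is that fields which already hold the target value are read but not
written. Whether the extra branch pays off on a given architecture is
exactly the question the benchmark request above is about.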