On Thu, Oct 22, 2020 at 9:41 AM Benjamin Herrenschmidt
<b...@kernel.crashing.org> wrote:
> On Wed, 2020-10-21 at 14:11 +0200, Arnd Bergmann wrote:
>
> > > At the moment, the only chips that need the heavy barrier are
> > > omap4 and mstar_v7, and early l2 cache controllers (not the one
> > > on Cortex-A7) have another synchronization callback that IIRC
> > > is used for streaming mappings.
>
> .../...
>
> > > Obviously, adding one of these for ast2600 would slow down every
> > > mb() and writel() a lot, but if it is a chip-wide problem rather than
> > > one isolated to the network device, it would be the correct solution,
> > > provided that a correct code sequence can be found.
>
> I'm surprised that problem doesn't already exist on the ast2400 and
> 2500 and I thus worry about the performance impact of such a workaround
> applied generally to every MMIO writes....
>
> But we did kill mmiowb so ... ;-)

The real cost would have to be measured of course, and it depends a
lot on how it's done. The read from uncached memory, as in the 1/4
patch here, seems fairly expensive, and the mstarv7_mb() method
(spinning on an MMIO read) seems worse. The omap4 method (a posted
write to an MMIO address in the memory controller to enforce a barrier
between the two ports) doesn't seem that bad, and would correspond to
what the chip should be doing in the first place.

      Arnd
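
For illustration only, the three barrier styles compared above could be
modelled roughly as below. This is a userspace sketch, not the kernel
code: the function names, the FLUSH_TRIGGER and FLUSH_DONE bit values
and the register pointers are all hypothetical stand-ins. Real code
would use ioremap()ed registers with readl_relaxed()/writel_relaxed()
and pair the access with the CPU's own barrier instruction.

```c
#include <stdint.h>

/* Hypothetical bit values; real hardware defines its own layout. */
#define FLUSH_TRIGGER 0x1u     /* assumed "start flush" bit */
#define FLUSH_DONE    0x1000u  /* assumed "flush complete" bit */

/*
 * Style 1: read from uncached memory.
 * The CPU stalls until the read data comes back.
 */
static inline void barrier_uncached_read(const volatile uint32_t *uncached)
{
	(void)*uncached;
}

/*
 * Style 2: trigger a flush, then spin on a status register.
 * Every loop iteration is a full round trip to the device.
 */
static inline void barrier_poll_status(volatile uint32_t *flush,
				       const volatile uint32_t *status)
{
	*flush = 0;
	*flush = FLUSH_TRIGGER;
	while (!(*status & FLUSH_DONE))
		; /* wait for the flush to complete */
}

/*
 * Style 3: posted write to a sync register.
 * The CPU does not wait for a reply; the memory controller is
 * assumed to order the two ports once the write arrives.
 */
static inline void barrier_posted_write(volatile uint32_t *sync_reg)
{
	*sync_reg = 0;
}
```

The relative cost follows from how long the CPU has to wait: style 3
never waits for data to come back, style 1 waits for one read, and
style 2 waits for at least one MMIO read and possibly many.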