On Tue, Feb 2, 2021 at 5:04 PM Marek Vasut <marek.va...@gmail.com> wrote: > > On 2/2/21 9:54 AM, Bin Meng wrote: > [...] > >>>> cache aligned in memory, however the cache operations are called on > >>>> the structure sizes, which themselves might not be cache aligned. Add > >>>> the necessary rounding to fix this, which permits the nvme to work on > >>>> arm64. > >>> > >>> +ARM guys > >>> > >>> Which ARM64 SoC did you test this with? > >> > >> RCar3, although that's irrelevant, the problem will happen on any arm or > >> arm64, and possibly any other system which needs cache management. > > > > There was a recent change to nvme.c that fixed a cache issue on ARMv8 > > so I thought this might be platform related. > > I used master, so unlikely.
It's strange this issue was not exposed last time when this driver was tested on ARMv8. > > >>> The round down in this patch should be unnecessary. > >> > >> Can you explain why ? > > > > I just took a further look and most of the start address should be > > cache line aligned (4KiB aligned) except the > > nvme_read_completion_status(). It's only 16 bytes aligned which might > > not be cache line aligned. > > Right, there are various arm chips with 32B/64B alignment requirements. > > >>> But it's better to > >>> figure out which call to dcache_xxx() with an unaligned end address. > >> > >> If you look at the code, most of them can (and do) trigger this, > >> therefore they need such alignment, as explained in the commit message. > > > > Now I wonder what's the correct implementation of the > > invalidate_dcache_range() and flush_dcache_range() in U-Boot? > > Shouldn't the round down/up happen in these APIs instead of doing such > > in drivers? > > Definitely not, because then the rounding might flush/invalidate cache > over areas where this could cause a problem (e.g. neighboring DMA > descriptors). The driver must do the cache management correctly. Well we can implement in these APIs and document its expected usage. Either way a driver has to do the cache management correctly. Not doing it in the driver eliminates some duplications of rounding up/down. For this case, I believe we just need to take care of nvme_read_completion_status(). Regards, Bin