On 08/02/2021 14:31, Andre Przywara wrote: > At the moment nvme_read_completion_status() tries to invidate a single > member of the cqes[] array, which is shady as just a single entry is > not cache line aligned. > The structure is dictated by hardware, and with 16 bytes is smaller than > any cache line we usually deal with. Also multiple entries need to be > consecutive in memory, so we can't pad them to cover a whole cache line. > > As a consequence we can only always invalidate all of them - U-Boot just > uses two of them anyway. This is fine, as they are only ever read by the > CPU (apart from the initial zeroing), so they can't become dirty. > > Make this obvious by always invalidating the whole array, regardless of > the entry number we are about to read. > Also blow up the allocation size to cover whole cache lines, to avoid > other heap allocations to sneak in. > > Signed-off-by: Andre Przywara <andre.przyw...@arm.com> > --- > Hi, > > this is just compile tested, and should fix the only questionable > cache invalidate call in this driver. > Please verify if this fixes any issues! > > Cheers, > Andre > > drivers/nvme/nvme.c | 16 ++++++++++++---- > 1 file changed, 12 insertions(+), 4 deletions(-) > > diff --git a/drivers/nvme/nvme.c b/drivers/nvme/nvme.c > index 5d6331ad346..c9efeff4bc9 100644 > --- a/drivers/nvme/nvme.c > +++ b/drivers/nvme/nvme.c > @@ -22,6 +22,8 @@ > #define NVME_AQ_DEPTH 2 > #define NVME_SQ_SIZE(depth) (depth * sizeof(struct nvme_command)) > #define NVME_CQ_SIZE(depth) (depth * sizeof(struct nvme_completion)) > +#define NVME_CQ_ALLOCATION ALIGN(NVME_CQ_SIZE(NVME_Q_DEPTH), \ > + ARCH_DMA_MINALIGN) > #define ADMIN_TIMEOUT 60 > #define IO_TIMEOUT 30 > #define MAX_PRP_POOL 512 > @@ -144,8 +146,14 @@ static __le16 nvme_get_cmd_id(void) > > static u16 nvme_read_completion_status(struct nvme_queue *nvmeq, u16 index) > { > - u64 start = (ulong)&nvmeq->cqes[index]; > - u64 stop = start + sizeof(struct nvme_completion); > + /* > + * Single CQ entries are always smaller than a cache line, so we > + * can't invalidate them individually. However CQ entries are > + * read only by the CPU, so it's safe to always invalidate all of them, > + * as the cache line should never become dirty. > + */ > + ulong start = (ulong)&nvmeq->cqes[0]; > + ulong stop = start + NVME_CQ_ALLOCATION; > > invalidate_dcache_range(start, stop); > > @@ -241,7 +249,7 @@ static struct nvme_queue *nvme_alloc_queue(struct > nvme_dev *dev, > return NULL; > memset(nvmeq, 0, sizeof(*nvmeq)); > > - nvmeq->cqes = (void *)memalign(4096, NVME_CQ_SIZE(depth)); > + nvmeq->cqes = (void *)memalign(4096, NVME_CQ_ALLOCATION); > if (!nvmeq->cqes) > goto free_nvmeq; > memset((void *)nvmeq->cqes, 0, NVME_CQ_SIZE(depth)); > @@ -339,7 +347,7 @@ static void nvme_init_queue(struct nvme_queue *nvmeq, u16 > qid) > nvmeq->q_db = &dev->dbs[qid * 2 * dev->db_stride]; > memset((void *)nvmeq->cqes, 0, NVME_CQ_SIZE(nvmeq->q_depth)); > flush_dcache_range((ulong)nvmeq->cqes, > - (ulong)nvmeq->cqes + NVME_CQ_SIZE(nvmeq->q_depth)); > + (ulong)nvmeq->cqes + NVME_CQ_ALLOCATION); > dev->online_queues++; > } > >
On Amlogic A311D, fixes timeout on nvme scan: Tested-by: Neil Armstrong <narmstr...@baylibre.com>