Thanks for the review. Comments below.
On Tue, 2 Oct 2018, Alistair Popple wrote: > Thanks Mark, > > Looks like some worthwhile improvments to be had. I've added a couple of > comments inline below. > > > +#define PAGE_64K (64UL * 1024) +#define PAGE_2M (2UL * 1024 * 1024) > > +#define > > PAGE_1G (1UL * 1024 * 1024 * 1024) > > include/linux/sizes.h includes definitions for SZ_64K, SZ_2M, SZ_1G, etc. so > unless they're redefined here for some reason I personally think it's cleaner > to > use those. Agreed, will fix. Thanks for the pointer. > > > /* > > - * Invalidate either a single address or an entire PID depending on > > - * the value of va. > > + * Invalidate a virtual address range > > */ > > -static void mmio_invalidate(struct npu_context *npu_context, int va, > > - unsigned long address, bool flush) > > +static void mmio_invalidate(struct npu_context *npu_context, > > + unsigned long start, unsigned long size, bool flush) > > With this optimisation every caller of mmio_invalidate() sets flush == true so > it no longer appears to be used. We should drop it as a parameter unless you > think there might be some reason to use it in future? > > Therefore we could also drop it as a parameter to get_atsd_launch_val(), > mmio_invalidate_pid() and mmio_invalidate_range() as well as I couldn't find > any > callers of those that set it to anything other than true. Yeah, good catch. I'll simplify all of those. > > > struct mmio_atsd_reg mmio_atsd_reg[NV_MAX_NPUS]; > > unsigned long pid = npu_context->mm->context.id; > > + unsigned long atsd_start = 0; > > + unsigned long end = start + size - 1; > > + int atsd_psize = MMU_PAGE_COUNT; > > + > > + /* > > + * Convert the input range into one of the supported sizes. If the range > > + * doesn't fit, use the next larger supported size. Invalidation latency > > + * is high, so over-invalidation is preferred to issuing multiple > > + * invalidates. > > + */ > > + if (size == PAGE_64K) { > > We also support 4K page sizes on PPC. If I am not mistaken this means every > ATSD > would invalidate the entire GPU TLB for a the given PID on those systems. > Could > we change the above check to `if (size <= PAGE_64K)` to avoid this? PPC supports 4K pages but the GPU ATS implementation does not. For that reason I didn't bother handling invalidates smaller than 64K. I'll add a comment on that. I don't know that this requirement is enforced anywhere though. I could add a PAGE_SIZE == 64K check to pnv_npu2_init_context if you think it would be useful. > > > + atsd_start = start; > > Which would also require: > > atsd_start = ALIGN_DOWN(start, PAGE_64K); > > > + atsd_psize = MMU_PAGE_64K; > > + } else if (ALIGN_DOWN(start, PAGE_2M) == ALIGN_DOWN(end, PAGE_2M)) { > > Wouldn't this lead to under invalidation in ranges which happen to cross a 2M > boundary? For example invalidating a 128K (ie. 2x64K pages) range with start > == > 0x1f0000 and end == 0x210000 would result in an invalidation of the range 0x0 > - > 0x200000 incorrectly leaving 0x200000 - 0x210000 in the GPU TLB. In this example: start 0x1f0000 size 0x020000 end (start + size - 1) 0x20ffff ALIGN_DOWN(start, PAGE_2M) 0x000000 ALIGN_DOWN(end, PAGE_2M) 0x200000 Since ALIGN_DOWN(start, PAGE_2M) != ALIGN_DOWN(end, PAGE_2M), the condition fails and we move to the 1G clause. Then ALIGN_DOWN(start, PAGE_1G) == ALIGN_DOWN(end, PAGE_1G) == 0, so we invalidate the range [0, 1G).