On 22/07/2019 13:09, Robin Murphy wrote: > On 22/07/2019 11:07, Steven Price wrote: >> On 22/07/2019 10:50, Robin Murphy wrote: >>> On 19/07/2019 23:07, Rob Herring wrote: >>>> On Fri, Jul 19, 2019 at 4:39 AM Steven Price <steven.pr...@arm.com> >>>> wrote: >>>>> >>>>> On 18/07/2019 18:03, Rob Herring wrote: >>>>>> On Thu, Jul 18, 2019 at 9:03 AM Steven Price <steven.pr...@arm.com> >>>>>> wrote: >>>>>>> >>>>>>> On 17/07/2019 19:33, Rob Herring wrote: >>>>>>>> Executable buffers have an alignment restriction that they can't >>>>>>>> cross >>>>>>>> 16MB boundary as the GPU program counter is 24-bits. This >>>>>>>> restriction is >>>>>>>> currently not handled and we just get lucky. As current userspace >>>>>>> >>>>>>> Actually it depends which GPU you are using - some do have a bigger >>>>>>> program counter - so it might be the choice of GPU to test on rather >>>>>>> than just luck. But kbase always assumes the worst case (24 bit) >>>>>>> as in >>>>>>> practise that's enough. >>>>>>> >>>>>>>> assumes all BOs are executable, that has to remain the default. So >>>>>>>> add a >>>>>>>> new PANFROST_BO_NOEXEC flag to allow userspace to indicate which >>>>>>>> BOs are >>>>>>>> not executable. >>>>>>>> >>>>>>>> There is also a restriction that executable buffers cannot start >>>>>>>> or end >>>>>>>> on a 4GB boundary. This is mostly avoided as there is only 4GB of >>>>>>>> space >>>>>>>> currently and the beginning is already blocked out for NULL ptr >>>>>>>> detection. Add support to handle this restriction fully >>>>>>>> regardless of >>>>>>>> the current constraints. >>>>>>>> >>>>>>>> For existing userspace, all created BOs remain executable, but the >>>>>>>> GPU >>>>>>>> VA alignment will be increased to the size of the BO. This >>>>>>>> shouldn't >>>>>>>> matter as there is plenty of GPU VA space. >>>>>>>> >>>>>>>> Cc: Tomeu Vizoso <tomeu.viz...@collabora.com> >>>>>>>> Cc: Boris Brezillon <boris.brezil...@collabora.com> >>>>>>>> Cc: Robin Murphy <robin.mur...@arm.com> >>>>>>>> Cc: Steven Price <steven.pr...@arm.com> >>>>>>>> Cc: Alyssa Rosenzweig <aly...@rosenzweig.io> >>>>>>>> Signed-off-by: Rob Herring <r...@kernel.org> >>>>>>>> --- >>>>>>>> drivers/gpu/drm/panfrost/panfrost_drv.c | 20 >>>>>>>> +++++++++++++++++++- >>>>>>>> drivers/gpu/drm/panfrost/panfrost_gem.c | 18 ++++++++++++++++-- >>>>>>>> drivers/gpu/drm/panfrost/panfrost_gem.h | 3 ++- >>>>>>>> drivers/gpu/drm/panfrost/panfrost_mmu.c | 3 +++ >>>>>>>> include/uapi/drm/panfrost_drm.h | 2 ++ >>>>>>>> 5 files changed, 42 insertions(+), 4 deletions(-) >>>>>>>> >>>>>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c >>>>>>>> b/drivers/gpu/drm/panfrost/panfrost_drv.c >>>>>>>> index d354b92964d5..b91e991bc6a3 100644 >>>>>>>> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c >>>>>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c >>>>>>>> @@ -49,7 +49,8 @@ static int panfrost_ioctl_create_bo(struct >>>>>>>> drm_device *dev, void *data, >>>>>>>> struct panfrost_gem_object *bo; >>>>>>>> struct drm_panfrost_create_bo *args = data; >>>>>>>> >>>>>>>> - if (!args->size || args->flags || args->pad) >>>>>>>> + if (!args->size || args->pad || >>>>>>>> + (args->flags & ~PANFROST_BO_NOEXEC)) >>>>>>>> return -EINVAL; >>>>>>>> >>>>>>>> bo = panfrost_gem_create_with_handle(file, dev, args->size, >>>>>>>> args->flags, >>>>>>>> @@ -367,6 +368,22 @@ static struct drm_driver >>>>>>>> panfrost_drm_driver = { >>>>>>>> .gem_prime_mmap = drm_gem_prime_mmap, >>>>>>>> }; >>>>>>>> >>>>>>>> +#define PFN_4G_MASK ((SZ_4G - 1) >> PAGE_SHIFT) >>>>>>>> + >>>>>>>> +static void panfrost_drm_mm_color_adjust(const struct drm_mm_node >>>>>>>> *node, >>>>>>>> + unsigned long color, >>>>>>>> + u64 *start, u64 *end) >>>>>>>> +{ >>>>>>>> + /* Executable buffers can't start or end on a 4GB boundary */ >>>>>>>> + if (!color) { >>>>>>>> + if ((*start & PFN_4G_MASK) == 0) >>>>>>>> + (*start)++; >>>>>>>> + >>>>>>>> + if ((*end & PFN_4G_MASK) == 0) >>>>>>>> + (*end)--; >>>>>>>> + } >>>>>>>> +} >>>>>>> >>>>>>> Unless I'm mistaken this won't actually provide the guarantee if the >>>>>>> memory region is >4GB (which admittedly it isn't at the moment). For >>>>>>> example a 8GB region would have the beginning/end trimmed off, but >>>>>>> there's another 4GB in the middle which could be allocated next to. >>>>>> >>>>>> Humm, other than not allowing sizes greater than 4G-(2*PAGE_SIZE), >>>>>> I'm >>>>>> not sure how we solve that. I guess avoiding sizes of (n*4G - >>>>>> PAGE_SIZE) would avoid the problem. Seems like almost 4GB executable >>>>>> buffer should be enough for anyone(TM). >>>>> >>>>> I was thinking of something like: >>>>> >>>>> if ((*end & ~PFN_4G_MASK) != (*start & ~PFN_4G_MASK)) { >>>>> if ((*start & PFN_4G_MASK) + >>>>> (SZ_16M >> PAGE_SHIFT) < PFN_4G_MASK) >>>>> *end = (*start & ~PFN_4G_MASK) + >>>>> (SZ_4GB - SZ_16M) >> PAGE_SHIFT; >>>>> else >>>>> *start = (*end & ~PFN_4G_MASK) + (SZ_16M) >> >>>>> PAGE_SHIFT; >>>>> } >>>>> >>>>> So split the region depending on where we can find a free 16MB region >>>>> which doesn't cross 4GB. >>>> >>>> Here's what I ended up with. It's slightly different in that the start >>>> and end don't get 16MB aligned. The code already takes care of the >>>> alignment which also is not necessarily 16MB, but 'align = size' as >>>> that's sufficient to not cross a 16MB boundary. >>>> >>>> #define PFN_4G (SZ_4G >> PAGE_SHIFT) >>>> #define PFN_4G_MASK (PFN_4G - 1) >>>> #define PFN_16M (SZ_16M >> PAGE_SHIFT) >>>> >>>> static void panfrost_drm_mm_color_adjust(const struct drm_mm_node >>>> *node, >>>> unsigned long color, >>>> u64 *start, u64 *end) >>>> { >>>> /* Executable buffers can't start or end on a 4GB boundary */ >>>> if (!(color & PANFROST_BO_NOEXEC)) { >>>> if ((*start & PFN_4G_MASK) == 0) >>>> (*start)++; >>>> >>>> if ((*end & PFN_4G_MASK) == 0) >>>> (*end)--; >>>> >>>> /* Ensure start and end are in the same 4GB range */ >>>> if ((*end & ~PFN_4G_MASK) != (*start & >>>> ~PFN_4G_MASK)) { >>>> if ((*start & PFN_4G_MASK) + PFN_16M < >>>> PFN_4G_MASK) >>>> *end = (*start & ~PFN_4G_MASK) + >>>> PFN_4G - 1; >>>> else >>>> *start = (*end & ~PFN_4G_MASK) + 1; >>>> >>>> } >>>> } >>>> } >>> >>> If you're happy with the additional restriction that a buffer can't >>> cross a 4GB boundary (which does seem to be required for certain kinds >>> of buffers anyway), then I don't think it needs to be anywhere near that >>> complex: >>> >>> if (!(*start & PFN_4G_MASK)) >>> *start++ >>> *end = min(*end, ALIGN(*start, PFN_4G) - 1); >> >> What happens if *start is very near the end of a 4GB boundary? In that >> case *end - *start might not be as big as 16MB and we could end up >> failing the allocation IIUC. > > Ahem... well, the thing about complicated-and-hard-to-read code is that > it turns out to be complicated and hard to read correctly :) > > FWIW, having taken a closer look, my interpretation would be something > like (even more untested than before): > > u64 start_seg = ALIGN(*start, PFN_4G); > u64 end_seg = ALIGN_DOWN(*end, PFN_4G); > > if (start_seg - start < end - end_seg) { > *start = end_seg + 1; > *end = min(*end, end_seg + PFN_4G - 1); > } else { > *end = start_seg - 1; > *start = max(*start, start_seg - PFN_4G + 1); > } > > but at this point I'm sure we're well into personal preference and > "shorter does not necessarily imply clearer" territory. More > importantly, though, it now occurs to me that the "pick the biggest end" > approach seems inherently suboptimal for cases where the [start,end] > interval crosses *more* than one boundary. For, say, start = PFN_4G - 1, > end = 2 * PFN_4G + 1, either way we'd get caught up on the single page > at one end and ignore the full 4GB in the middle :/
Indeed, that case was just occurring to me too! How about: u64 next_seg = ALIGN(*start, PFN_4G); if (next_seg - *start <= PFN_16M) *start = next_seg + 1; *end = min(*end, ALIGN(*start, PFN_4G) - 1); So always allocate at the beginning, but skip past the next 4GB boundary if there's less than 16MB left (or equal to avoid the 4GB boundary). Steve >>>>> >>>>> But like you say: 4GB should be enough for anyone ;) >>>>> >>>>>>> Also a minor issue, but we might want to consider having some >>>>>>> constants >>>>>>> for 'color' - it's not obvious from this code that color==no_exec. >>>>>> >>>>>> Yeah, I was just going worry about that when we had a second bit to >>>>>> pass. >>>>> >>>>> One other 'minor' issue I noticed as I was writing the above. >>>>> PAGE_SIZE >>>>> is of course the CPU page size - we could have the situation of CPU >>>>> and >>>>> GPU page size being different (e.g. CONFIG_ARM64_64K_PAGES). I'm not >>>>> sure whether we want to support this or not (kbase doesn't). Also in >>>>> theory some GPUs do support 64K (AS_TRANSCFG_ADRMODE_AARCH64_64K) - >>>>> but >>>>> kbase has never used this so I don't know if it works... ;) >>>> >>>> Shhh! I have thought about that, but was ignoring for now. >>> >>> In general, I don't think that should be too great a concern - we >>> already handle it for IOMMUs, provided the minimum IOMMU page size is no >>> larger than PAGE_SIZE, which should always be the case here too. Looking >>> at how panfrost_mmu_map() is implemented, and given that we're already >>> trying to handle stuff at 2MB granularity where possible, I don't see >>> any obvious reasons for 64K pages not to work already (assuming there >>> aren't any 4K assumptions baked into the userspace side). I might have >>> to give it a try... >> >> I'd be interested to know if it works. I know that kbase incorrectly >> uses PAGE_SIZE in several places (in particular assuming it is the size >> of the page tables). And I was aware I was in danger of slipping into >> the same mindset here. >> >> User space shouldn't care too much - other than the size of buffers >> allocated being rounded up to the CPU's page size. At least the Panfrost >> user/kernel ABI has sizes in bytes not pages (unlike kbase). > > Sounds promising - my Juno branch is in a bit of a mess at the moment > but once I've got that cleaned up I'll have a quick play. > > Robin. > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel