On Mon, Aug 9, 2021 at 10:47 AM Sai Prakash Ranjan <saiprakash.ran...@codeaurora.org> wrote:
>
> On 2021-08-09 23:10, Will Deacon wrote:
> > On Mon, Aug 09, 2021 at 10:18:21AM -0700, Rob Clark wrote:
> >> On Mon, Aug 9, 2021 at 10:05 AM Will Deacon <w...@kernel.org> wrote:
> >> >
> >> > On Mon, Aug 09, 2021 at 09:57:08AM -0700, Rob Clark wrote:
> >> > > On Mon, Aug 9, 2021 at 7:56 AM Will Deacon <w...@kernel.org> wrote:
> >> > > > On Mon, Aug 02, 2021 at 06:36:04PM -0700, Rob Clark wrote:
> >> > > > > On Mon, Aug 2, 2021 at 8:14 AM Will Deacon <w...@kernel.org> wrote:
> >> > > > > > On Mon, Aug 02, 2021 at 08:08:07AM -0700, Rob Clark wrote:
> >> > > > > > > On Mon, Aug 2, 2021 at 3:55 AM Will Deacon <w...@kernel.org> wrote:
> >> > > > > > > > On Thu, Jul 29, 2021 at 10:08:22AM +0530, Sai Prakash Ranjan wrote:
> >> > > > > > > > > On 2021-07-28 19:30, Georgi Djakov wrote:
> >> > > > > > > > > > On Mon, Jan 11, 2021 at 07:45:02PM +0530, Sai Prakash Ranjan wrote:
> >> > > > > > > > > > > commit ecd7274fb4cd ("iommu: Remove unused IOMMU_SYS_CACHE_ONLY flag")
> >> > > > > > > > > > > removed unused IOMMU_SYS_CACHE_ONLY prot flag and along with it went
> >> > > > > > > > > > > the memory type setting required for the non-coherent masters to use
> >> > > > > > > > > > > system cache. Now that system cache support for GPU is added, we will
> >> > > > > > > > > > > need to set the right PTE attribute for GPU buffers to be sys cached.
> >> > > > > > > > > > > Without this, the system cache lines are not allocated for GPU.
> >> > > > > > > > > > >
> >> > > > > > > > > > > So the patches in this series introduces a new prot flag IOMMU_LLC,
> >> > > > > > > > > > > renames IO_PGTABLE_QUIRK_ARM_OUTER_WBWA to IO_PGTABLE_QUIRK_PTW_LLC
> >> > > > > > > > > > > and makes GPU the user of this protection flag.
> >> > > > > > > > > >
> >> > > > > > > > > > Thank you for the patchset! Are you planning to refresh it, as it does
> >> > > > > > > > > > not apply anymore?
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > I was waiting on Will's reply [1]. If there are no changes needed, then
> >> > > > > > > > > I can repost the patch.
> >> > > > > > > >
> >> > > > > > > > I still think you need to handle the mismatched alias, no? You're adding
> >> > > > > > > > a new memory type to the SMMU which doesn't exist on the CPU side. That
> >> > > > > > > > can't be right.
> >> > > > > > > >
> >> > > > > > >
> >> > > > > > > Just curious, and maybe this is a dumb question, but what is your
> >> > > > > > > concern about mismatched aliases? I mean the cache hierarchy on the
> >> > > > > > > GPU device side (anything beyond the LLC) is pretty different and
> >> > > > > > > doesn't really care about the smmu pgtable attributes..
> >> > > > > >
> >> > > > > > If the CPU accesses a shared buffer with different attributes to those which
> >> > > > > > the device is using then you fall into the "mismatched memory attributes"
> >> > > > > > part of the Arm architecture. It's reasonably unforgiving (you should go and
> >> > > > > > read it) and in some cases can apply to speculative accesses as well, but
> >> > > > > > the end result is typically loss of coherency.
> >> > > > >
> >> > > > > Ok, I might have a few other sections to read first to decipher the
> >> > > > > terminology..
> >> > > > >
> >> > > > > But my understanding of LLC is that it looks just like system memory
> >> > > > > to the CPU and GPU (I think that would make it "the point of
> >> > > > > coherence" between the GPU and CPU?) If that is true, shouldn't it be
> >> > > > > invisible from the point of view of different CPU mapping options?
> >> > > >
> >> > > > You could certainly build a system where mismatched attributes don't cause
> >> > > > loss of coherence, but as it's not guaranteed by the architecture and the
> >> > > > changes proposed here affect APIs which are exposed across SoCs, then I
> >> > > > don't think it helps much.
> >> > > >
> >> > >
> >> > > Hmm, the description of the new mapping flag is that it applies only
> >> > > to transparent outer level cache:
> >> > >
> >> > > +/*
> >> > > + * Non-coherent masters can use this page protection flag to set cacheable
> >> > > + * memory attributes for only a transparent outer level of cache, also known as
> >> > > + * the last-level or system cache.
> >> > > + */
> >> > > +#define IOMMU_LLC (1 << 6)
> >> > >
> >> > > But I suppose we could call it instead IOMMU_QCOM_LLC or something
> >> > > like that to make it more clear that it is not necessarily something
> >> > > that would work with a different outer level cache implementation?
> >> >
> >> > ... or we could just deal with the problem so that other people can reuse
> >> > the code. I haven't really understood the reluctance to solve this
> >> > properly.
> >> >
> >> > Am I missing some reason this isn't solvable?
> >>
> >> Oh, was there another way to solve it (other than foregoing setting
> >> INC_OCACHE in the pgtables)? Maybe I misunderstood, is there a
> >> corresponding setting on the MMU pgtables side of things?
> >
> > Right -- we just need to program the CPU's MMU with the matching memory
> > attributes! It's a bit more fiddly if you're just using ioremap_wc()
> > though, as it's usually the DMA API which handles the attributes under
> > the hood.
> >
> > Anyway, sorry, I should've said that explicitly earlier on. We've done
> > this sort of thing in the Android tree so I assumed Sai knew what needed
> > to be done and then I didn't think to explain to you :(
> >
>
> Right I was aware of that but even in the android tree there is no user :)
> I think we can't have a new memory type without any user right in upstream
> like android tree?
>
> @Rob, I think you already tried adding a new MT and used pgprot_syscached()
> in GPU driver but it was crashing?

Correct, but IIRC there were some differences in the code for memory
types compared to the android tree.. I couldn't figure out the
necessary patches to cherry-pick to get the android patch to apply
cleanly, so I tried re-implementing it without having much of a clue
about how that code works (which was probably the issue) ;-)

BR,
-R