On Thu, Oct 03, 2024 at 10:06:11AM GMT, Daniel Henrique Barboza wrote: > > > On 10/3/24 6:26 AM, Andrew Jones wrote: > > On Tue, Oct 01, 2024 at 10:02:58PM GMT, Daniel Henrique Barboza wrote: > > ... > > > +/* > > > + * RISCV IOMMU Address Translation Lookup - Page Table Walk > > > + * > > > + * Note: Code is based on get_physical_address() from > > > target/riscv/cpu_helper.c > > > + * Both implementation can be merged into single helper function in > > > future. > > > + * Keeping them separate for now, as error reporting and flow specifics > > > are > > > + * sufficiently different for separate implementation. > > > + * > > > + * @s : IOMMU Device State > > > + * @ctx : Translation context for device id and process address > > > space id. > > > + * @iotlb : translation data: physical address and access mode. > > > + * @return : success or fault cause code. > > > + */ > > > +static int riscv_iommu_spa_fetch(RISCVIOMMUState *s, RISCVIOMMUContext > > > *ctx, > > > + IOMMUTLBEntry *iotlb) > > > +{ > > > + dma_addr_t addr, base; > > > + uint64_t satp, gatp, pte; > > > + bool en_s, en_g; > > > + struct { > > > + unsigned char step; > > > + unsigned char levels; > > > + unsigned char ptidxbits; > > > + unsigned char ptesize; > > > + } sc[2]; > > > + /* Translation stage phase */ > > > + enum { > > > + S_STAGE = 0, > > > + G_STAGE = 1, > > > + } pass; > > > + MemTxResult ret; > > > + > > > + satp = get_field(ctx->satp, RISCV_IOMMU_ATP_MODE_FIELD); > > > + gatp = get_field(ctx->gatp, RISCV_IOMMU_ATP_MODE_FIELD); > > > + > > > + en_s = satp != RISCV_IOMMU_DC_FSC_MODE_BARE; > > > + en_g = gatp != RISCV_IOMMU_DC_IOHGATP_MODE_BARE; > > > + > > > + /* > > > + * Early check for MSI address match when IOVA == GPA. This check > > > + * is required to ensure MSI translation is applied in case > > > + * first-stage translation is set to BARE mode. In this case IOVA > > > + * provided is a valid GPA. Running translation through page walk > > > + * second stage translation will incorrectly try to translate GPA > > > + * to host physical page, likely hitting IOPF. > > > + */ > > > > Why was this comment expanded from the simple > > > > "Early check for MSI address match when IOVA == GPA." > > > > The comment is now incorrect since the check is required even when > > first-stage translation is not BARE. I just skimmed the spec again trying > > to figure out if the removal of '!en_s' is a hack or a fix, and I'm > > inclined to say "fix", but it's an incomplete fix. I found a sentence that > > says > > > > "If the virtual memory scheme selected for first-stage is Bare but the > > scheme for the second-stage is not Bare then the IOVA is a GPA." > > > > which, in a way, defines a GPA to only be a GPA when second-stage is used > > (and all MSI translation specifications refer to GPAs). However, maybe I > > missed it, but I couldn't find any actual reason that the MSI table can't > > be used when first-stage is not BARE and second-stage is (and, of course, > > it makes no difference for single-stage translations to call IOVAs GPAs > > or not). > > > > Now, I also see > > > > "If the virtual memory scheme selected for neither stage is Bare then the > > IOVA is a VA. Two-stage address translation is in effect. The first-stage > > translates the VA to a GPA and the second-stage translates the GPA to a > > SPA." > > > > in the spec, which means we should probably change the removal of '!en_s' > > to '!(en_s && en_g)'. VFIO+irqbypass would still work with that and, when > > Linux learns to support two-stage translation, we wouldn't incorrectly try > > to check for a GVA in the MSI table. > > Ok. It seems to me that we can't rely on the riscv-iommu spec alone to let > us know how to detect if IOVA == GPA, given that one of the main usages > we have ATM (VFIO irqbypass) will use GPAs with S-stage enabled. > > (Note: shouldn't we open a bug against the riscv-iommu spec?) >
I can try writing a sentence or two for the spec to clarify this and send a PR. > I like the idea of ruling out the case where IOVA == VA since that is clear > in the spec. We also know that MSI entries will always contains GPAs. > > This is the change I'm going to make in v9: > > > /* > * Early check for MSI address match when IOVA == GPA to > * support VFIO+irqbypass. The riscv-iommu spec doesn't > * consider the case where a GPA can be produced by S-stage > * only, but we have real life examples like Linux VFIO that > * work that way. The spec alone does not provide a reliable > * way of detecting if IOVA == GPA. > * > * The spec is clear about what is a VA: "If the virtual > * memory scheme selected for neither stage is Bare then > * the IOVA is a VA", in our case "(en_s && en_g)". We also > * know that MSI tables will always hold GPAs. > * > * Thus the check consists of ruling out VAs and checking > * the MSI table. > */ > if (!(en_s && en_g) && (iotlb->perm & IOMMU_WO) && > riscv_iommu_msi_check(s, ctx, iotlb->iova)) { > iotlb->target_as = &s->trap_as; > iotlb->translated_addr = iotlb->iova; > iotlb->addr_mask = ~TARGET_PAGE_MASK; > return 0; > } LGTM Thanks, drew > > Tomasz, let me know if you have any opinions against it. I intend to send > the v9 start of next week. > > > Thanks, > > > Daniel > > > > > > Thanks, > > drew