Hi Eric, On Mon, May 20, 2024 at 10:20:43AM +0200, Eric Auger wrote: > Hi Mostafa, > On 5/16/24 17:20, Mostafa Saleh wrote: > > Hi Eric, > > > > On Wed, May 15, 2024 at 03:48:05PM +0200, Eric Auger wrote: > >> Hi Mostafa, > >> > >> On 4/29/24 05:23, Mostafa Saleh wrote: > >>> This patch adds support for nested(combined) TLB entries. > >> space between nested and (. > > Will do. > >>> The main function combine_tlb() is not used here but in the next > >>> patches, but to simplify the patches it is introduced first. > >>> > >>> Main changes: > >>> 1) New entry added in the TLB, parent_perm, for nested TLB, holds the > >> s/entry/field, s/TLB/SMMUTLBEntry struct > > Will do. > >>> stage-2 permission, this can be used to know the origin of a > >>> permission fault from a cached entry as caching the “and” of the > >>> permissions loses this information. > >>> > >>> SMMUPTWEventInfo is used to hold information about PTW faults so > >>> the event can be populated, the value of stage (which maps to S2 > >>> in the event) used to be set based on the current stage for TLB > >> I don't understand "(which maps to S2 in the event)". What do you mean? > >> This could be S1 or S2 depending on the active stage, no? > > Not really, if the IPA size is larger than S2 input size, this is > > considered stage-1 fault. > > > > For TLB permission fault, yes, that is how it is decided. > > However, with nesting, a permission fault from a cached entry can be > > from a stage-1 or stage-2, that’s why we now cache both and not just > > the combined permission, and the logic to set fault stage is modified > > accordingly. > I meant in smmu_translate() we initially had for permission fault > info->stage = cfg->stage whcih can be S1 or S2. Hence the fact I do not > understand the sentence > > the value of stage (which maps to S2 in the event) > > I understand that with nested this computation needs to change because the > permission can be linked to either the S1 or S2 stage. > Maybe that's just a matter or rephrasing? >
I see, that’s already how it is used now, I will rephrase it in case it is confusing. > > >>> permission faults, however with the parent_perm, it is now set > >>> based on which perm has the missing permission > >>> > >>> When nesting is not enabled it has the same value as perm which > >>> doesn't change the logic. > >>> > >>> 2) As combined TLB implementation is used, the combination logic > >>> chooses: > >>> - tg and level from the entry which has the smallest addr_mask. > >> tbh I am scared bout swapping s1/s2 tg and level. In smmu_iotlb_lookup() > >> I see tt->granule_sz being used which is s1 data. I mean it is not > >> obvious to me this is correct. Could you maybe give more explanations > >> detailing why/how this is guaranted to work. > > As you mentioned the next patch reworks the lookup logic, I can reorder > > the 2 patches if that is better, please let me know what you think? > Yes if you manage to reorder that may be more logical because otherwise > it looks incorrect. Will do. > > > >> Can you give additional details about what s1+s2 combinations were tested? > > I tested with S1 and S2 4K pages > > S1 level = 3 and S2 level = 3 > > S1 level = 2 and S2 level = 3 > > S1 level = 3 and S2 level = 2 > > S1 level = 1 and S2 level = 2 > > > > And also tested with with S1 64K granule and S2 4K. > OK, I would suggest you mention that in the coverletter because it is > reassuring and the combination is not totally obvious - at least to me ;-) - Will do. Thanks, Mostafa > > Eric > > > >>> - Based on that the iova that would be cached is recalculated. > >>> - Translated_addr is chosen from stage-2. > >>> > >>> Signed-off-by: Mostafa Saleh <smost...@google.com> > >>> --- > >>> hw/arm/smmu-common.c | 32 ++++++++++++++++++++++++++++---- > >>> include/hw/arm/smmu-common.h | 1 + > >>> 2 files changed, 29 insertions(+), 4 deletions(-) > >>> > >>> diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c > >>> index 21982621c0..0d6945fa54 100644 > >>> --- a/hw/arm/smmu-common.c > >>> +++ b/hw/arm/smmu-common.c > >>> @@ -394,7 +394,7 @@ static int smmu_ptw_64_s1(SMMUTransCfg *cfg, > >>> tlbe->entry.translated_addr = gpa; > >>> tlbe->entry.iova = iova & ~mask; > >>> tlbe->entry.addr_mask = mask; > >>> - tlbe->entry.perm = PTE_AP_TO_PERM(ap); > >>> + tlbe->parent_perm = tlbe->entry.perm = PTE_AP_TO_PERM(ap); > >> nit: I would prefer on separate lines. > > Will do. > > > >>> tlbe->level = level; > >>> tlbe->granule = granule_sz; > >>> return 0; > >>> @@ -515,7 +515,7 @@ static int smmu_ptw_64_s2(SMMUTransCfg *cfg, > >>> tlbe->entry.translated_addr = gpa; > >>> tlbe->entry.iova = ipa & ~mask; > >>> tlbe->entry.addr_mask = mask; > >>> - tlbe->entry.perm = s2ap; > >>> + tlbe->parent_perm = tlbe->entry.perm = s2ap; > >>> tlbe->level = level; > >>> tlbe->granule = granule_sz; > >>> return 0; > >>> @@ -530,6 +530,27 @@ error: > >>> return -EINVAL; > >>> } > >>> > >>> +/* combine 2 TLB entries and return in tlbe in nested config. */ > >> suggestion: combine S1 and S2 TLB entries into a single entry. As a > >> result the S1 entry is overriden with combined data. > > Will do. > > > >>> +static void __attribute__((unused)) combine_tlb(SMMUTLBEntry *tlbe, > >>> + SMMUTLBEntry *tlbe_s2, > >>> + dma_addr_t iova, > >>> + SMMUTransCfg *cfg) > >>> +{ > >>> + if (tlbe_s2->entry.addr_mask < tlbe->entry.addr_mask) { > >>> + tlbe->entry.addr_mask = tlbe_s2->entry.addr_mask; > >>> + tlbe->granule = tlbe_s2->granule; > >>> + tlbe->level = tlbe_s2->level; > >>> + } > >>> + > >>> + tlbe->entry.translated_addr = CACHED_ENTRY_TO_ADDR(tlbe_s2, > >>> + tlbe->entry.translated_addr); > >>> + > >>> + tlbe->entry.iova = iova & ~tlbe->entry.addr_mask; > >>> + /* parent_perm has s2 perm while perm has s1 perm. */ > >> suggestion: while perm keeps s1 perm. > >> > > Will do. > > > > Thanks, > > Mostafa > >>> + tlbe->parent_perm = tlbe_s2->entry.perm; > >>> + return; > >>> +} > >>> + > >>> /** > >>> * smmu_ptw - Walk the page tables for an IOVA, according to @cfg > >>> * > >>> @@ -607,9 +628,12 @@ SMMUTLBEntry *smmu_translate(SMMUState *bs, > >>> SMMUTransCfg *cfg, dma_addr_t addr, > >>> > >>> cached_entry = smmu_iotlb_lookup(bs, cfg, &tt_combined, > >>> aligned_addr); > >>> if (cached_entry) { > >>> - if ((flag & IOMMU_WO) && !(cached_entry->entry.perm & IOMMU_WO)) > >>> { > >>> + if ((flag & IOMMU_WO) && !(cached_entry->entry.perm & > >>> + cached_entry->parent_perm & IOMMU_WO)) { > >>> info->type = SMMU_PTW_ERR_PERMISSION; > >>> - info->stage = cfg->stage; > >>> + info->stage = !(cached_entry->entry.perm & IOMMU_WO) ? > >>> + SMMU_STAGE_1 : > >>> + SMMU_STAGE_2; > >>> return NULL; > >>> } > >>> return cached_entry; > >>> diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h > >>> index 09d3b9e734..1db566d451 100644 > >>> --- a/include/hw/arm/smmu-common.h > >>> +++ b/include/hw/arm/smmu-common.h > >>> @@ -77,6 +77,7 @@ typedef struct SMMUTLBEntry { > >>> IOMMUTLBEntry entry; > >>> uint8_t level; > >>> uint8_t granule; > >>> + IOMMUAccessFlags parent_perm; > >>> } SMMUTLBEntry; > >>> > >>> /* Stage-2 configuration. */ > >> Thanks > >> > >> Eric > >> >