On Mon, Nov 04, 2019 at 06:36:51PM +0000, Robin Murphy wrote: > On 04/11/2019 18:17, Will Deacon wrote: > > On Fri, Oct 25, 2019 at 07:08:35PM +0100, Robin Murphy wrote: > > > The nature of the LPAE format means that data->pg_shift is always > > > redundant with data->bits_per_level, since they represent the size of a > > > page and the number of PTEs per page respectively, and the size of a PTE > > > is constant. Thus it works out more efficient to only store the latter, > > > and derive the former via a trivial addition where necessary. > > > > > > Signed-off-by: Robin Murphy <robin.mur...@arm.com> > > > --- > > > drivers/iommu/io-pgtable-arm.c | 29 +++++++++++++---------------- > > > 1 file changed, 13 insertions(+), 16 deletions(-) > > > > > > diff --git a/drivers/iommu/io-pgtable-arm.c > > > b/drivers/iommu/io-pgtable-arm.c > > > index 4b1483eb0ccf..15b4927ce36b 100644 > > > --- a/drivers/iommu/io-pgtable-arm.c > > > +++ b/drivers/iommu/io-pgtable-arm.c > > > @@ -36,10 +36,11 @@ > > > * in a virtual address mapped by the pagetable in d. > > > */ > > > #define ARM_LPAE_LVL_SHIFT(l,d) > > > \ > > > - (((ARM_LPAE_MAX_LEVELS - 1 - (l)) * (d)->bits_per_level) + \ > > > - (d)->pg_shift) > > > + (((ARM_LPAE_MAX_LEVELS - (l)) * (d)->bits_per_level) + \ > > > + ilog2(sizeof(arm_lpae_iopte))) > > > -#define ARM_LPAE_GRANULE(d) (1UL << (d)->pg_shift) > > > +#define ARM_LPAE_GRANULE(d) > > > \ > > > + (sizeof(arm_lpae_iopte) << (d)->bits_per_level) > > > #define ARM_LPAE_PGD_SIZE(d) > > > \ > > > (sizeof(arm_lpae_iopte) << (d)->pgd_bits) > > > @@ -55,9 +56,7 @@ > > > ((1 << ((d)->bits_per_level + ARM_LPAE_PGD_IDX(l,d))) - 1)) > > > /* Calculate the block/page mapping size at level l for pagetable in d. > > > */ > > > -#define ARM_LPAE_BLOCK_SIZE(l,d) \ > > > - (1ULL << (ilog2(sizeof(arm_lpae_iopte)) + \ > > > - ((ARM_LPAE_MAX_LEVELS - (l)) * (d)->bits_per_level))) > > > +#define ARM_LPAE_BLOCK_SIZE(l,d) (1ULL << ARM_LPAE_LVL_SHIFT(l,d)) > > > /* Page table bits */ > > > #define ARM_LPAE_PTE_TYPE_SHIFT 0 > > > @@ -175,8 +174,7 @@ struct arm_lpae_io_pgtable { > > > int pgd_bits; > > > int start_level; > > > - unsigned long pg_shift; > > > - unsigned long bits_per_level; > > > + int bits_per_level; > > > void *pgd; > > > }; > > > @@ -206,7 +204,7 @@ static phys_addr_t iopte_to_paddr(arm_lpae_iopte pte, > > > { > > > u64 paddr = pte & ARM_LPAE_PTE_ADDR_MASK; > > > - if (data->pg_shift < 16) > > > + if (data->bits_per_level < 13) /* i.e. 64K granule */ > > > > nit, but: > > > > if (ARM_LPAE_GRANULE(data) < SZ_64K) > > > > might be clearer and avoid the need for a comment? > > Unfortunately GCC doesn't treat the two as directly equivalent (presumably > due to boundary conditions) so will emit the additional faff to actually > compute and compare the intermediate value every time, rather than just > trivially testing the shift. I figured the minor I$/register pressure win > was worth the small price of a comment.
Bet ya can't measure the difference ;) I'd prefer the readable version in the absence of numbers. Will _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu