On Mon, Nov 04, 2019 at 06:36:51PM +0000, Robin Murphy wrote:
> On 04/11/2019 18:17, Will Deacon wrote:
> > On Fri, Oct 25, 2019 at 07:08:35PM +0100, Robin Murphy wrote:
> > > The nature of the LPAE format means that data->pg_shift is always
> > > redundant with data->bits_per_level, since they represent the size of a
> > > page and the number of PTEs per page respectively, and the size of a PTE
> > > is constant. Thus it works out more efficient to only store the latter,
> > > and derive the former via a trivial addition where necessary.
> > > 
> > > Signed-off-by: Robin Murphy <robin.mur...@arm.com>
> > > ---
> > >   drivers/iommu/io-pgtable-arm.c | 29 +++++++++++++----------------
> > >   1 file changed, 13 insertions(+), 16 deletions(-)
> > > 
> > > diff --git a/drivers/iommu/io-pgtable-arm.c 
> > > b/drivers/iommu/io-pgtable-arm.c
> > > index 4b1483eb0ccf..15b4927ce36b 100644
> > > --- a/drivers/iommu/io-pgtable-arm.c
> > > +++ b/drivers/iommu/io-pgtable-arm.c
> > > @@ -36,10 +36,11 @@
> > >    * in a virtual address mapped by the pagetable in d.
> > >    */
> > >   #define ARM_LPAE_LVL_SHIFT(l,d)                                         
> > > \
> > > - (((ARM_LPAE_MAX_LEVELS - 1 - (l)) * (d)->bits_per_level) +      \
> > > - (d)->pg_shift)
> > > + (((ARM_LPAE_MAX_LEVELS - (l)) * (d)->bits_per_level) +          \
> > > + ilog2(sizeof(arm_lpae_iopte)))
> > > -#define ARM_LPAE_GRANULE(d)              (1UL << (d)->pg_shift)
> > > +#define ARM_LPAE_GRANULE(d)                                              
> > > \
> > > + (sizeof(arm_lpae_iopte) << (d)->bits_per_level)
> > >   #define ARM_LPAE_PGD_SIZE(d)                                            
> > > \
> > >           (sizeof(arm_lpae_iopte) << (d)->pgd_bits)
> > > @@ -55,9 +56,7 @@
> > >            ((1 << ((d)->bits_per_level + ARM_LPAE_PGD_IDX(l,d))) - 1))
> > >   /* Calculate the block/page mapping size at level l for pagetable in d. 
> > > */
> > > -#define ARM_LPAE_BLOCK_SIZE(l,d)                                 \
> > > - (1ULL << (ilog2(sizeof(arm_lpae_iopte)) +                       \
> > > -         ((ARM_LPAE_MAX_LEVELS - (l)) * (d)->bits_per_level)))
> > > +#define ARM_LPAE_BLOCK_SIZE(l,d) (1ULL << ARM_LPAE_LVL_SHIFT(l,d))
> > >   /* Page table bits */
> > >   #define ARM_LPAE_PTE_TYPE_SHIFT         0
> > > @@ -175,8 +174,7 @@ struct arm_lpae_io_pgtable {
> > >           int                     pgd_bits;
> > >           int                     start_level;
> > > - unsigned long           pg_shift;
> > > - unsigned long           bits_per_level;
> > > + int                     bits_per_level;
> > >           void                    *pgd;
> > >   };
> > > @@ -206,7 +204,7 @@ static phys_addr_t iopte_to_paddr(arm_lpae_iopte pte,
> > >   {
> > >           u64 paddr = pte & ARM_LPAE_PTE_ADDR_MASK;
> > > - if (data->pg_shift < 16)
> > > + if (data->bits_per_level < 13) /* i.e. 64K granule */
> > 
> > nit, but:
> > 
> >     if (ARM_LPAE_GRANULE(data) < SZ_64K)
> > 
> > might be clearer and avoid the need for a comment?
> 
> Unfortunately GCC doesn't treat the two as directly equivalent (presumably
> due to boundary conditions) so will emit the additional faff to actually
> compute and compare the intermediate value every time, rather than just
> trivially testing the shift. I figured the minor I$/register pressure win
> was worth the small price of a comment.

Bet ya can't measure the difference ;)

I'd prefer the readable version in the absence of numbers.

Will
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Reply via email to