On 04/11/2019 18:17, Will Deacon wrote:
On Fri, Oct 25, 2019 at 07:08:35PM +0100, Robin Murphy wrote:
The nature of the LPAE format means that data->pg_shift is always
redundant with data->bits_per_level, since they represent the size of a
page and the number of PTEs per page respectively, and the size of a PTE
is constant. Thus it works out more efficient to only store the latter,
and derive the former via a trivial addition where necessary.

Signed-off-by: Robin Murphy <robin.mur...@arm.com>
---
  drivers/iommu/io-pgtable-arm.c | 29 +++++++++++++----------------
  1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 4b1483eb0ccf..15b4927ce36b 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -36,10 +36,11 @@
   * in a virtual address mapped by the pagetable in d.
   */
  #define ARM_LPAE_LVL_SHIFT(l,d)                                               
\
-       (((ARM_LPAE_MAX_LEVELS - 1 - (l)) * (d)->bits_per_level) +   \
-       (d)->pg_shift)
+       (((ARM_LPAE_MAX_LEVELS - (l)) * (d)->bits_per_level) +               \
+       ilog2(sizeof(arm_lpae_iopte)))
-#define ARM_LPAE_GRANULE(d) (1UL << (d)->pg_shift)
+#define ARM_LPAE_GRANULE(d)                                            \
+       (sizeof(arm_lpae_iopte) << (d)->bits_per_level)
  #define ARM_LPAE_PGD_SIZE(d)                                          \
        (sizeof(arm_lpae_iopte) << (d)->pgd_bits)
@@ -55,9 +56,7 @@
         ((1 << ((d)->bits_per_level + ARM_LPAE_PGD_IDX(l,d))) - 1))
/* Calculate the block/page mapping size at level l for pagetable in d. */
-#define ARM_LPAE_BLOCK_SIZE(l,d)                                       \
-       (1ULL << (ilog2(sizeof(arm_lpae_iopte)) +                 \
-               ((ARM_LPAE_MAX_LEVELS - (l)) * (d)->bits_per_level)))
+#define ARM_LPAE_BLOCK_SIZE(l,d)       (1ULL << ARM_LPAE_LVL_SHIFT(l,d))
/* Page table bits */
  #define ARM_LPAE_PTE_TYPE_SHIFT               0
@@ -175,8 +174,7 @@ struct arm_lpae_io_pgtable {
int pgd_bits;
        int                     start_level;
-       unsigned long           pg_shift;
-       unsigned long           bits_per_level;
+       int                     bits_per_level;
void *pgd;
  };
@@ -206,7 +204,7 @@ static phys_addr_t iopte_to_paddr(arm_lpae_iopte pte,
  {
        u64 paddr = pte & ARM_LPAE_PTE_ADDR_MASK;
- if (data->pg_shift < 16)
+       if (data->bits_per_level < 13) /* i.e. 64K granule */

nit, but:

        if (ARM_LPAE_GRANULE(data) < SZ_64K)

might be clearer and avoid the need for a comment?

Unfortunately GCC doesn't treat the two as directly equivalent (presumably due to boundary conditions) so will emit the additional faff to actually compute and compare the intermediate value every time, rather than just trivially testing the shift. I figured the minor I$/register pressure win was worth the small price of a comment.

Robin.

(I can make the change locally if you agree)

Will

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Reply via email to