On Sep 1, 2008, at 12:19 AM, Benjamin Herrenschmidt wrote:

Thanks for taking the time to dig through this :)


+#ifdef CONFIG_PTE_64BIT
+#define PTE_FLAGS_OFFSET       4       /* offset of PTE flags, in bytes */
+#define LNX_PTE_SIZE           8       /* size of a linux PTE, in bytes */
+#else
+#define PTE_FLAGS_OFFSET       0
+#define LNX_PTE_SIZE           4
+#endif

s/LNX_PTE_SIZE/PTE_BYTES/ (or PTE_SIZE); no need for that horrible LNX
prefix. In fact, if it's only used by the asm, then ditch it and
have asm-offsets.c create something based on sizeof(pte_t).

The reason I did that was to distinguish between the size of a software PTE entry and the actual size of the hardware entry. In the 36-bit physical case, the hardware PTE size is the same as it always is, but we've grown the Linux PTE. We can call it something else, or I can change it as you suggest; I was worried that the next person to come along might get confused. I suppose there aren't too many of us crazy enough to muck around in this code, so maybe it's not that big of a deal.
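
To make that concrete, I believe the asm-offsets.c route would be something like this (a sketch only; DEFINE() is the existing asm-offsets helper, and the PTE_SIZE name is just illustrative):

	/* arch/powerpc/kernel/asm-offsets.c -- sketch, not tested */
	DEFINE(PTE_SIZE, sizeof(pte_t));	/* 8 with CONFIG_PTE_64BIT, else 4 */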



#define pte_none(pte)           ((pte_val(pte) & ~_PTE_NONE_MASK) == 0)
#define pte_present(pte)        (pte_val(pte) & _PAGE_PRESENT)
-#define pte_clear(mm,addr,ptep) do { pte_update(ptep, ~0, 0); } while (0)
+#define pte_clear(mm, addr, ptep) \
+       do { pte_update(ptep, ~_PAGE_HASHPTE, 0); } while (0)

Where does this previous definition of pte_clear come from? It's bogus (and it's not like that upstream). Your "updated" one looks OK, though.

But whatever tree has the "previous" one would break hash-based ppc32
if merged... or are there other related changes in Kumar's tree that
make the above safe?

That's from Kumar's tree of changes for BookE... he'll need to answer that. I think he put out a patchset with that work; I'm not sure whether it was correct on this particular point or not.
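
For anyone following along: on hash-based ppc32, _PAGE_HASHPTE records that a copy of the PTE lives in the hash table, so pte_clear() has to leave that bit alone or the stale hash entry can never be found and flushed. Since pte_update(p, clr, set) atomically does *p = (*p & ~clr) | set, the two variants differ like this (illustrative, not from the patch):

	pte_update(ptep, ~0, 0);		/* wipes _PAGE_HASHPTE too, so
						 * we lose track of the hash-
						 * table copy -- bogus on hash
						 * MMUs */
	pte_update(ptep, ~_PAGE_HASHPTE, 0);	/* clears everything else but
						 * keeps _PAGE_HASHPTE, so the
						 * stale entry can still be
						 * evicted later */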



#define pmd_none(pmd)           (!pmd_val(pmd))
#define pmd_bad(pmd)            (pmd_val(pmd) & _PMD_BAD)
@@ -664,8 +670,30 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
                              pte_t *ptep, pte_t pte)
{
#if _PAGE_HASHPTE != 0
+#ifndef CONFIG_PTE_64BIT
        pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte) & ~_PAGE_HASHPTE);
#else
+       /*
+        * We have to do the write of the 64b pte as 2 stores.  This
+        * code assumes that the entry we're storing to is currently
+        * not valid and that all callers have the page table lock.
+        * Having the entry be not valid protects readers who might read
+        * between the first and second stores.
+        */
+       unsigned int tmp;

Do you know for sure the entry isn't valid? On ppc64, we explicitly
test for that, and if it is valid, we clear and flush it. The generic
code has gone back and forth on whether set_pte_at() may be called on an
already-valid PTE; I wouldn't rely too much on it guaranteeing that will
not happen. The 32b PTE code is safe because it preserves _PAGE_HASHPTE.

IIRC, we discussed this at some point, months ago, and decided this was the case. If the entry *can* be valid, and it's possible to have a reader on another CPU, I think we end up having to go through a very gross sequence where we mark it invalid before we start mucking around with it; otherwise a reader can see half of the old PTE and half of the new one. Perhaps I'm missing a key restriction here?
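
Just to spell out what that gross sequence would look like (a sketch only, assuming the flags word sits at offset 4 as in this patch, that readers test _PAGE_PRESENT in the low word before trusting the high word, and glossing over flushing any stale hash/TLB entry after the invalidate):

	/* hypothetical safe rewrite of a possibly-valid 64b PTE with
	 * concurrent readers -- not what the patch does
	 */
	unsigned int *p = (unsigned int *)ptep;

	p[1] &= ~_PAGE_PRESENT;			/* 1: make it invalid */
	smp_wmb();				/* readers now see !present */
	p[0] = pte_val(pte) >> 32;		/* 2: new high word (phys bits) */
	smp_wmb();				/* order it before the valid bit */
	p[1] = (unsigned int)pte_val(pte);	/* 3: flags, incl. present */

	/* (a concurrent hash-miss updater would still want lwarx/stwcx.
	 * on the flags word, per the discussion below) */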



Note also that once you have the guarantee (which you don't now)
that your previous PTE is invalid, you can safely do two normal
stores instead of a lwarx/stwcx. loop. In any case, having the stw in
the middle of the loop doesn't sound very useful.

If the pte is invalid and we're SMP, we need the store to the high word to occur before the stwcx. that sets the valid bit, unless you're certain we can't have a reader on another CPU. If we're not SMP, then I agree; let's move that store.

I thought we discussed at some point the possibility that there might be an updater on another CPU? Is that not possible? If not, I'll change it. The existing code also does lwarx/stwcx. via pte_update(); is there no reason for that? If so, we should probably change that as well.
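
If the invalid-entry guarantee does hold, I take it the loop collapses to something like this (again a sketch only; smp_wmb() standing in for the eieio, and still preserving _PAGE_HASHPTE from the old flags word):

	unsigned int *p = (unsigned int *)ptep;

	p[0] = pte_val(pte) >> 32;		/* high word first */
	smp_wmb();				/* order before the valid bit */
	p[1] = (unsigned int)pte_val(pte)
	       | (p[1] & _PAGE_HASHPTE);	/* low word carries the flags */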



+       __asm__ __volatile__("\
+1:     lwarx   %0,0,%4\n\
+       rlwimi  %L2,%0,0,30,30\n\
+       stw     %2,0(%3)\n\
+       eieio\n\
+       stwcx.  %L2,0,%4\n\
+       bne-    1b"
+       : "=&r" (tmp), "=m" (*ptep)
+ : "r" (pte), "r" (ptep), "r" ((unsigned long)(ptep) + 4), "m" (*ptep)
+       : "cc");
+#endif /* CONFIG_PTE_64BIT */
+#else /* _PAGE_HASHPTE == 0 */
#if defined(CONFIG_PTE_64BIT) && defined(CONFIG_SMP)
        __asm__ __volatile__("\
                stw%U0%X0 %2,%0\n\
diff --git a/arch/powerpc/mm/hash_low_32.S b/arch/powerpc/mm/hash_low_32.S
index b9ba7d9..d63e20a 100644
--- a/arch/powerpc/mm/hash_low_32.S
+++ b/arch/powerpc/mm/hash_low_32.S
@@ -75,7 +75,7 @@ _GLOBAL(hash_page_sync)
 * Returns to the caller if the access is illegal or there is no
 * mapping for the address.  Otherwise it places an appropriate PTE
 * in the hash table and returns from the exception.
- * Uses r0, r3 - r8, ctr, lr.
+ * Uses r0, r3 - r8, r10, ctr, lr.
 */
        .text
_GLOBAL(hash_page)
@@ -106,9 +106,15 @@ _GLOBAL(hash_page)
        addi    r5,r5,swapper_pg_dir@l  /* kernel page table */
        rlwimi  r3,r9,32-12,29,29       /* MSR_PR -> _PAGE_USER */
112:    add     r5,r5,r7                /* convert to phys addr */
+#ifndef CONFIG_PTE_64BIT
        rlwimi  r5,r4,12,20,29          /* insert top 10 bits of address */
        lwz     r8,0(r5)                /* get pmd entry */
        rlwinm. r8,r8,0,0,19            /* extract address of pte page */
+#else
+       rlwinm  r8,r4,13,19,29          /* Compute pgdir/pmd offset */
+       lwzx    r8,r8,r5                /* Get L1 entry */
+       rlwinm. r8,r8,0,0,20            /* extract pt base address */
+#endif

Any reason you wrote the above using a different technique? If you
believe that rlwinm/lwzx is going to be more efficient than rlwimi/lwz,
maybe we should fix the old one ... or am I missing something totally
obvious? :-)

Nothing so exciting there - I borrowed this code from the BookE version. I'm happy to change it.
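
(For the archives: both sequences just fetch the pgd entry for the faulting address. In the 32-bit-PTE case the rlwimi/lwz form works out to roughly the following, if I've read the bit twiddling right:)

	/* r5 = pgdir base, r4 = faulting EA; the top 10 EA bits index a
	 * 1024-entry pgdir of 4-byte entries
	 */
	pmd = *(unsigned long *)(pgdir + ((ea >> 20) & 0xffc));	/* (ea>>22)*4 */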

Thanks, Ben!

Cheers,
Becky