Excerpts from Christophe Leroy's message of January 14, 2021 10:25 pm:
> 
> 
> On 14/01/2021 at 13:09, Nicholas Piggin wrote:
>> Excerpts from Nicholas Piggin's message of January 14, 2021 1:24 pm:
>>> Excerpts from Christophe Leroy's message of January 14, 2021 12:12 am:
>>>> 
>>>> 
>>>> On 13/01/2021 at 08:31, Nicholas Piggin wrote:
>>>>> The page fault handling still has some complex logic, particularly
>>>>> around hash table handling, in asm. Implement this in C instead.
>>>>>
>>>>> Signed-off-by: Nicholas Piggin <npig...@gmail.com>
>>>>> ---
>>>>>  arch/powerpc/include/asm/book3s/64/mmu-hash.h |   1 +
>>>>>  arch/powerpc/kernel/exceptions-64s.S          | 131 +++---------------
>>>>>  arch/powerpc/mm/book3s64/hash_utils.c         |  77 ++++++----
>>>>>  arch/powerpc/mm/fault.c                       |  46 ++++--
>>>>>  4 files changed, 107 insertions(+), 148 deletions(-)
>>>>>
>>>>> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>>>>> index 066b1d34c7bc..60a669379aa0 100644
>>>>> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>>>>> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>>>>> @@ -454,6 +454,7 @@ static inline unsigned long hpt_hash(unsigned long vpn,
>>>>>  #define HPTE_NOHPTE_UPDATE	0x2
>>>>>  #define HPTE_USE_KERNEL_KEY	0x4
>>>>>  
>>>>> +int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr);
>>>>>  extern int __hash_page_4K(unsigned long ea, unsigned long access,
>>>>>  			  unsigned long vsid, pte_t *ptep, unsigned long trap,
>>>>>  			  unsigned long flags, int ssize, int subpage_prot);
>>>>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>>>>> index 6e53f7638737..bcb5e81d2088 100644
>>>>> --- a/arch/powerpc/kernel/exceptions-64s.S
>>>>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>>>>> @@ -1401,14 +1401,15 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
>>>>>   *
>>>>>   * Handling:
>>>>>   * - Hash MMU
>>>>> - *   Go to do_hash_page first to see if the HPT can be filled from an entry in
>>>>> - *   the Linux page table. Hash faults can hit in kernel mode in a fairly
>>>>> + *   Go to do_hash_fault, which attempts to fill the HPT from an entry in the
>>>>> + *   Linux page table. Hash faults can hit in kernel mode in a fairly
>>>>>   *   arbitrary state (e.g., interrupts disabled, locks held) when accessing
>>>>>   *   "non-bolted" regions, e.g., vmalloc space. However these should always be
>>>>> - *   backed by Linux page tables.
>>>>> + *   backed by Linux page table entries.
>>>>>   *
>>>>> - *   If none is found, do a Linux page fault. Linux page faults can happen in
>>>>> - *   kernel mode due to user copy operations of course.
>>>>> + *   If no entry is found the Linux page fault handler is invoked (by
>>>>> + *   do_hash_fault). Linux page faults can happen in kernel mode due to user
>>>>> + *   copy operations of course.
>>>>>   *
>>>>>   * KVM: The KVM HDSI handler may perform a load with MSR[DR]=1 in guest
>>>>>   *   MMU context, which may cause a DSI in the host, which must go to the
>>>>> @@ -1439,13 +1440,17 @@ EXC_COMMON_BEGIN(data_access_common)
>>>>>  	GEN_COMMON data_access
>>>>>  	ld	r4,_DAR(r1)
>>>>>  	ld	r5,_DSISR(r1)
>>>>
>>>> We have DSISR here. I think the dispatch between page fault or do_break()
>>>> should be done here:
>>>> - It would be more similar to other arches
>>>
>>> Other sub-archs?
>>>
>>>> - Would avoid doing it also in instruction fault
>>>
>>> True, but it's hidden under an unlikely branch so won't really help
>>> instruction fault.
>>>
>>>> - Would avoid that -1 return which looks more like a hack.
>>>
>>> I don't really see it as a hack; we return a code to the asm caller to
>>> direct whether to restore registers or not, and we already have this
>>> pattern.
>>>
>>> (I'm hoping all that might go away one day by controlling NV
>>> regs from C if we can get good code generation, but even if not we
>>> still have it in the interrupt returns).
>>>
>>> That said, I will give it a try here. At the very least it might be a
>>> better intermediate step.
>>
>> Ah yes, this way doesn't work well for later patches, because you end up,
>> e.g., with the do_break call having to call the interrupt handler
>> wrappers again when they actually expect to be in the asm entry state
>> (e.g., irq soft-mask state) when called, and to return via interrupt_return
>> after the exit wrapper runs (which 64s uses to implement better context
>> tracking, for example).
>>
>> That could possibly be hacked up to deal with multiple interrupt
>> wrappers per interrupt, but I'd rather not go backwards.
>>
>> That does leave the other sub-archs as having this issue, but they don't
>> do so much in their handlers. 32 doesn't have soft-mask or context
>> tracking to deal with, for example. We will need to fix this up though,
>> and unify things more.
>>
> 
> Not sure I understand what you mean exactly.
> 
> On the 8xx, do_break() is called by totally different exceptions:
> - Exception 0x1c00 Data breakpoint ==> do_break()
> - Exception 0x1300 Instruction TLB error ==> handle_page_fault()
> - Exception 0x1400 Data TLB error ==> handle_page_fault()
> 
> On book3s/32, we now (after my patch, i.e. patch 1 in your series) have
> either do_break() or handle_page_fault() being called from very early in
> ASM.
> 
> If you do the same in book3s/64, then there is no issue with interrupt
> wrappers being called twice, is there?
bad_page_fault is the problem; it has to go afterwards. Once we have the
changed 64s behaviour of do_page_fault, I don't know if there is any point
leaving do_break in asm, is there? I guess it is neat to treat it quite
separately. I might need to count fast path branches...

I have done the split anyway already, so I will post your way first.

Thanks,
Nick