Re: Found the commit for: 5.3.7 64-bits kernel doesn't boot on G5 Quad [regression]
John Paul Adrian Glaubitz writes:

> Hi!
>
> On 12/10/19 9:35 AM, Romain Dolbeau wrote:
>> On Sat, 16 Nov 2019 at 17:34, Romain Dolbeau wrote:
>>> So it seems to me that 0034d395f89d9c092bb15adbabdca5283e258b41
>>> introduced the bug that crashes the PowerMac G5
>>
>> There have been some commits in that subsystem, so I tried again; as of
>> 6794862a16ef41f753abd75c03a152836e4c8028, the kernel still crashes
>> when trying to boot my PowerMac G5.
>
> If Aneesh is currently unable to look at the problem, I would suggest
> reverting the commit in question since I don't think it's acceptable
> that users are unable to boot their machines anymore after a kernel
> upgrade.

The PowerMac system we have internally was not able to recreate this,
so we have not been able to make progress on it. At this point, I am
not sure what would cause the machine check with that patch series,
because we have not changed the VA bits in that patch.

-aneesh
Re: PPC64: G5 & 4k/64k page size (was: Re: Call for report - G5/PPC970 status)
Romain Dolbeau writes:

> On Thu, 12 Dec 2019 at 22:40, Andreas Schwab wrote:
>> I'm using 4K pages, in case that matters
>
> Yes it does matter, as it seems to be the difference between "working"
> and "not working" :-)
> Thank you for the config & pointing out the culprit!
>
> With your config, my machine boots (though it's missing some features
> as the config seems quite tuned).
>
> Moving from 64k pages to 4k pages on 'my' config (essentially,
> Debian's 5.3 with default values for changes since), my machine boots
> as well & everything seems to work fine.
>
> So question to Aneesh - did you try 64k pages on your G5, or only 4k?
> In the second case, could you try with 64k to see if you can reproduce
> the crash?

I don't have direct access to this system; I have asked if we can get a
run with 64K. Meanwhile, is there a way to find out what caused the
machine check, or to get more details on it? I was checking the manual
and I don't see any restrictions w.r.t. effective address. We now have
very high EA with 64K page size.

-aneesh
Re: PPC64: G5 & 4k/64k page size (was: Re: Call for report - G5/PPC970 status)
Romain Dolbeau writes:

> On Sat, 21 Dec 2019 at 05:31, Aneesh Kumar K.V wrote:
>> I don't have direct access to this system, I have asked if we can get a run
>> with 64K.
>
> OK, thanks! Do you know which model it is? It seems to be working on
> some systems, but we don't have enough samples to figure out why at
> this time, I think.
>
>> Meanwhile is there a way to find out what caused MachineCheck? more
>> details on this? I was checking the manual and I don't see any
>> restrictions w.r.t effective address. We now have very high EA with 64K
>> page size.
>
> Sorry, no idea, completely out of my depth here. I can try some kernel
> (build, runtime) options and/or patch, but someone will have to tell
> me what to try, as I have no ideas.

Can you try this change?

modified   arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -580,7 +580,7 @@ extern void slb_set_size(u16 size);
 #if (MAX_PHYSMEM_BITS > MAX_EA_BITS_PER_CONTEXT)
 #define MAX_KERNEL_CTX_CNT   (1UL << (MAX_PHYSMEM_BITS - MAX_EA_BITS_PER_CONTEXT))
 #else
-#define MAX_KERNEL_CTX_CNT   1
+#define MAX_KERNEL_CTX_CNT   4
 #endif
 #define MAX_VMALLOC_CTX_CNT  1

-aneesh
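For readers following the hunk, here is a minimal userspace sketch of the arithmetic behind MAX_KERNEL_CTX_CNT. It only restates the `#if`/`#else` logic quoted in the patch. The function name `kernel_ctx_cnt` and the bit widths passed to it are hypothetical, chosen for illustration; they are not the values a G5 kernel is built with.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the MAX_KERNEL_CTX_CNT selection in the
 * quoted hunk. The parameters mirror the names of the kernel macros,
 * but any values passed in are illustrative assumptions.
 */
static unsigned long kernel_ctx_cnt(unsigned int physmem_bits,
                                    unsigned int ea_bits_per_context)
{
    /* Keep the shift below well defined for a 64-bit unsigned long. */
    assert(physmem_bits < 64 && ea_bits_per_context < 64);

    /* Same shape as the #if branch: one context per 2^ea_bits chunk. */
    if (physmem_bits > ea_bits_per_context)
        return 1UL << (physmem_bits - ea_bits_per_context);

    /* The #else branch: 1 in the unpatched header, 4 with the test patch. */
    return 1;
}
```

With assumed widths of 51 and 49 bits the function returns 4; whenever the first width does not exceed the second it takes the fallback branch, which is the only line the test patch changes.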
Re: [Regression 5.7-rc1] Random hangs on 32-bit PowerPC (PowerBook6,7)
Christophe Leroy writes:

> On 18/05/2020 at 17:19, Rui Salvaterra wrote:
>> Hi again, Christophe,
>>
>> On Mon, 18 May 2020 at 15:03, Christophe Leroy wrote:
>>>
>>> Can you try reverting 697ece78f8f749aeea40f2711389901f0974017a ? It may
>>> have broken swap.
>>
>> Yeah, that was a good call. :) Linux 5.7-rc1 with the revert on top
>> survives the beating. I'll be happy to test a definitive patch!
>>
>
> Yeah I discovered recently that the way swap is implemented on powerpc
> expects RW and other important bits not be one of the 3 least
> significant bits (see __pte_to_swp_entry() )

The last 3 bits are there to track _PAGE_PRESENT, right? What is the RW
dependency there? Are you thinking of read/write migration entries? A
swap entry should not retain the pte RW bits, right?

A swap entry is built using swap type + offset, and it should not have a
dependency on pte RW bits. Along with type and offset, we should also
have the ability to mark it as a pte entry and to set the not-present
bits. With that understanding, what am I missing here?

>
> I guess the easiest for the time being is to revert the commit with a
> proper explanation of the issue, then one day we'll modify the way
> powerpc manages swap.
>

-aneesh
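To make "swap type + offset" concrete, here is a minimal sketch of packing the two fields into one word and reading them back. The 5-bit type width, the 32-bit word, and the helper names are assumptions for illustration; this is a simplified model, not a copy of the powerpc headers.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, hypothetical swap entry: type in the low bits, offset above. */
typedef struct { uint32_t val; } swp_entry_t;

#define SWP_TYPE_BITS 5u                          /* assumed field width */
#define SWP_TYPE_MASK ((1u << SWP_TYPE_BITS) - 1u)

static swp_entry_t swp_entry(uint32_t type, uint32_t offset)
{
    /* Both fields must fit in the 32-bit word without overlapping. */
    assert(type <= SWP_TYPE_MASK);
    assert(offset < (1u << (32u - SWP_TYPE_BITS)));

    swp_entry_t e = { type | (offset << SWP_TYPE_BITS) };
    return e;
}

static uint32_t swp_type(swp_entry_t e)
{
    return e.val & SWP_TYPE_MASK;
}

static uint32_t swp_offset(swp_entry_t e)
{
    return e.val >> SWP_TYPE_BITS;
}
```

In this model nothing from the original PTE's permission bits is carried into the entry; only the type and the offset are encoded, which is the property the reply above is asking about.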
Re: [Regression 5.7-rc1] Random hangs on 32-bit PowerPC (PowerBook6,7)
On 5/20/20 7:23 PM, Christophe Leroy wrote:
> On 20/05/2020 at 15:43, Aneesh Kumar K.V wrote:
>> Christophe Leroy writes:
>>
>>> On 18/05/2020 at 17:19, Rui Salvaterra wrote:
>>>> Hi again, Christophe,
>>>>
>>>> On Mon, 18 May 2020 at 15:03, Christophe Leroy wrote:
>>>>> Can you try reverting 697ece78f8f749aeea40f2711389901f0974017a ? It may
>>>>> have broken swap.
>>>>
>>>> Yeah, that was a good call. :) Linux 5.7-rc1 with the revert on top
>>>> survives the beating. I'll be happy to test a definitive patch!
>>>
>>> Yeah I discovered recently that the way swap is implemented on powerpc
>>> expects RW and other important bits not be one of the 3 least
>>> significant bits (see __pte_to_swp_entry() )
>>
>> The last 3 bits are there to track the _PAGE_PRESENT right? What is the RW
>> dependency there? Are you suggesting of read/write migration entry? A
>> swap entry should not retain the pte rw bits right?
>>
>> A swap entry is built using swap type + offset. And it should not have a
>> dependency on pte RW bits. Along with type and offset we also should
>> have the ability to mark it as a pte entry and also set not present
>> bits. With that understanding what am I missing here?
>
> That's probably me who is missing something, I have not dug into the
> swap functioning yet indeed, so that was only my first feeling.
>
> By the way, the problem is definitely due to the order changes in the
> PTE bits, whether that's because _PAGE_RW was moved to the last 3 bits
> or whether that's because _PAGE_PRESENT was moved out of the last 3
> bits, I don't know yet.
>
> My (bad) understanding is from the fact that __pte_to_swp_entry() is a
> right shift by 3 bits, so it loses the last 3 bits, and therefore
> __swp_entry_to_pte(__pte_to_swp_entry(pte)) loses the last 3 bits of a
> PTE.
>
> Is there somewhere a description of how swap works exactly?

Looking at __set_pte_at(), I am wondering whether this was due to
_PAGE_HASHPTE. That would mean we end up wrongly updating some swap
entry details. We call set_pte_at() on swap pte entries.

-aneesh
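The round-trip point quoted above can be checked with a small userspace model. This is a sketch under assumptions: the 32-bit PTE type and the helper names are hypothetical, and only the shift by 3 is taken from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified 32-bit PTE and swap entry types. */
typedef struct { uint32_t pte; } pte_t;
typedef struct { uint32_t val; } swp_entry_t;

#define SWP_SHIFT 3u    /* the right shift by 3 described in the thread */

static swp_entry_t pte_to_swp_entry(pte_t pte)
{
    swp_entry_t e = { pte.pte >> SWP_SHIFT };   /* low 3 bits fall off here */
    return e;
}

static pte_t swp_entry_to_pte(swp_entry_t e)
{
    pte_t pte = { e.val << SWP_SHIFT };         /* low 3 bits come back as 0 */
    return pte;
}

static pte_t round_trip(pte_t pte)
{
    return swp_entry_to_pte(pte_to_swp_entry(pte));
}

/* Checks that a round trip clears bits 0-2 and preserves everything else. */
static void check_round_trip(pte_t pte)
{
    pte_t out = round_trip(pte);

    assert((out.pte & 0x7u) == 0u);
    assert((out.pte & ~0x7u) == (pte.pte & ~0x7u));
}
```

For example, a value of 0xABCDE7 comes back as 0xABCDE0. In this model, whatever flags sit in the three least significant bits do not survive the conversion, and a flag set in those bits on a swap PTE ends up inside the swap entry's own fields once the value is shifted.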