On Fri, Sep 8, 2017 at 12:59 AM, Jiri Kosina <ji...@kernel.org> wrote:
> On Thu, 7 Sep 2017, Andy Lutomirski wrote:
>
>> Jiri reported a resume-from-hibernation failure triggered by PCID.
>> The root cause appears to be rather odd. The hibernation asm
>> restores a CR3 value that comes from the image header. If the image
>> kernel has PCID on, it's entirely reasonable for this CR3 value to
>> have one of the low 12 bits set. The restore code restores it with
>> CR4.PCIDE=0, which means that those low 12 bits are accepted by the
>> CPU but are either ignored or interpreted as a caching mode. This
>> is odd, but still works. We blow up later when the image kernel
>> restores CR4, though, since changing CR4.PCIDE with CR3[11:0] != 0
>> is illegal. Boom!
>>
>> FWIW, it's entirely unclear to me what's supposed to happen if a PAE
>> kernel restores a non-PAE image or vice versa. Ditto for LA57.
>
> I've just performed 15 hibernation cycles with current Linus' tree
> (5969d1bb3082) with these two patches applied on top of it, and I haven't
> encountered any issue (and the warning in switch_mm_irqs_off() didn't
> trigger either).
>
>> Reported-by: Jiri Kosina <ji...@kernel.org>
>> Fixes: 660da7c9228f ("x86/mm: Enable CR4.PCIDE on supported systems")
>> Signed-off-by: Andy Lutomirski <l...@kernel.org>
>
> Tested-by: Jiri Kosina <jkos...@suse.cz>
>
Ingo, please do *not* apply this patch yet. The code is fine, but the comment is about to become wrong. I just found a nasty initialization order issue, and I need to rework a bunch of the way we deal with PCIDE.
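
For anyone following the CR3/CR4 interaction described in the quoted changelog, here is a rough user-space model of the failure condition. It is only an illustrative sketch, not the code from the patch: the helper names cr3_strip_pcid() and cr4_write_faults() are made up for this example, and the model covers only the one rule discussed above (turning CR4.PCIDE on while CR3[11:0] is nonzero faults). Masking the low 12 bits is shown as one plausible way to avoid the fault and is an assumption, not a description of the actual fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CR3[11:0]: these bits hold the PCID when CR4.PCIDE=1. */
#define CR3_PCID_MASK  UINT64_C(0xFFF)
/* CR4.PCIDE is bit 17 of CR4. */
#define CR4_PCIDE      (UINT64_C(1) << 17)

/*
 * Hypothetical helper: drop the low 12 bits from a saved CR3 value so
 * that it is safe to load while CR4.PCIDE is still clear.
 */
static inline uint64_t cr3_strip_pcid(uint64_t cr3)
{
    uint64_t clean = cr3 & ~CR3_PCID_MASK;

    assert((clean & CR3_PCID_MASK) == 0);
    return clean;
}

/*
 * Hypothetical model of the one rule discussed in the changelog:
 * a CR4 write that flips PCIDE from 0 to 1 faults if CR3[11:0] != 0.
 * Every other reason a CR4 write can fault is deliberately ignored.
 */
static inline bool cr4_write_faults(uint64_t old_cr4, uint64_t new_cr4,
                                    uint64_t cr3)
{
    bool enabling_pcide = !(old_cr4 & CR4_PCIDE) && (new_cr4 & CR4_PCIDE);

    return enabling_pcide && (cr3 & CR3_PCID_MASK) != 0;
}
```

In this model, a saved CR3 of 0x1234005 trips the check when PCIDE goes from 0 to 1, while the same value passed through cr3_strip_pcid() does not.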