* Andi Kleen <[EMAIL PROTECTED]> wrote:

> Fix early_ioremap on x86-64
> 
> [Venki, sorry for blaming PAT for this earlier. It was innocent.] 
> [Note the patch is on top of git-x86 + gbpages applied. I think there 
> will be an trivial reject without gbpages-direct applied first]
> 
> I had ACPI failures on several machines since a few days. Symptom was 
> NUMA nodes not getting detected or worse cores not getting detected. 
> They all came from ACPI not being able to read various of its tables. 
> I finally bisected it down to Jeremy's "put _PAGE_GLOBAL into 
> PAGE_KERNEL" change. With that the fix was fairly obvious. The problem 
> was that early_ioremap() didn't use a "_all" flush that would affect 
> the global PTEs too. So with global bits getting used everywhere now 
> an early_ioremap would not actually flush a mapping if something else 
> was mapped previously on that slot (which can happen with 
> early_iounmap inbetween)
> 
> This patch changes all flushes in init_64.c to be __flush_tlb_all() 
> and fixes the problem here.

ah, very nice! This is the bad commit:

  ------------------------->
  Subject:  x86: fold _PAGE_GLOBAL into __PAGE_KERNEL
  From: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

  With the iounmap problem resolved, it should be OK to always set
  _PAGE_GLOBAL in __PAGE_KERNEL*.

  [ Did this patch cause problems before? ]
  <------------------------

Jeremy did suspect something about this change, as indicated in the 
changelog. But because the change was so finegrained, the bisection 
almost directly led to the fix. _That_ i think clearly demonstrates the 
power of bisection and finegrained changes.

but note what the fundamental problem is - we've turned a previously 
safe flushing API into an unsafe one - __flush_tlb() will only be safe 
in the rarest of circumstances. There are some other matches:

 ./mm/init_64.c: __flush_tlb();
 ./kernel/head64.c:      __flush_tlb();

The boot identity mappings zapped do not have PGE set at the moment, but 
they could in the future (once we do native pagetable setup straight 
from flat mode) - and this is not a performance critical path anyway.

 ./kernel/cpu/mtrr/generic.c:    __flush_tlb();
 ./kernel/cpu/mtrr/generic.c:    __flush_tlb();

these include an open-coded version of __flush_tlb_all() so they are 
safe.

and we might as well make the non-PGE flush the 'special API'. I.e. 
rename __flush_tlb() to __flush_tlb_partial() and rename 
__flush_tlb_all() to __flush_tlb(). This makes it very apparent which 
should be used by default and which does what.

        Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to