On 2016-03-01 8:41 PM, Paul Gortmaker wrote:
[Re: [poky] [PATCH 1/1] poky: update qemu* to prefer 4.4 kernel] On 13/02/2016 (Sat 17:17) Richard Purdie wrote:

I'm moving the discussion to OE-Core and pulling in some kernel people.
I think I understand what is wrong and how to fix it but I could use
someone who actually knows this code.

To summarise the story so far, on qemux86, X doesn't start and there is
a backtrace in the logs:

x86/PAT: Xorg:705 map pfn expected mapping type uncached-minus for [mem 0xfd000000-0xfdffffff], got write-combining

So Bruce helped me set up a reproducer locally today since he'd already
invested the time on that, and then I boiled that down to divorce it
from the slower steps of build-deploy-boot to make the bisect something
that mortal humans could tolerate.

Amusingly enough that led to:

commit 9cd25aac1f44f269de5ecea11f7d927f37f1d01c
Author: Borislav Petkov <b...@suse.de>
Date:   Thu Jun 4 18:55:10 2015 +0200

     x86/mm/pat: Emulate PAT when it is disabled

So while some of us were joking on IRC about the validity of forcibly
disabling PAT (via cmdline or Kconfig) as a workaround, the one-line
shortlog above tells us that it wasn't so far off the mark after all.

Bruce and I will decide what to do with this tomorrow, but since Richard
spent so much time on it, I thought he'd like to know this in the
interim.  Good times.   :-/

As another follow-up: the thread can be summarized as "It doesn't
look like it should have worked before, and qemu's PAT emulation
may be the issue".

The suggestion is to run with 'nopat', which is what Richard originally
did.

So I'm going to prep a patch that drops the kernel patch, and leaves
nopat enabled on the qemu command line. That should get us put back
together in a semi-permanent way.

Bruce


Paul.
--


------------[ cut here ]------------
WARNING: CPU: 0 PID: 705 at /media/build1/poky/build/tmp/work-shared/qemux86/kernel-source/arch/x86/mm/pat.c:985 untrack_pfn+0xaf/0xc0()
Modules linked in: uvesafb
CPU: 0 PID: 705 Comm: Xorg Not tainted 4.4.1-yocto-standard #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
  00000000 00000000 cf14dd78 c1397ab2 00000000 cf14dda8 c1051477 c1aa4f6c
  00000000 000002c1 c1a9fa4c 000003d9 c104b98f c104b98f cf244000 b6355000
  00000000 cf14ddb8 c1051552 00000009 00000000 cf14dde0 c104b98f cf14ddd0
Call Trace:
  [<c1397ab2>] dump_stack+0x4b/0x79
  [<c1051477>] warn_slowpath_common+0x87/0xc0
  [<c104b98f>] ? untrack_pfn+0xaf/0xc0
  [<c104b98f>] ? untrack_pfn+0xaf/0xc0
  [<c1051552>] warn_slowpath_null+0x22/0x30
  [<c104b98f>] untrack_pfn+0xaf/0xc0
  [<c104d54c>] ? kmap_atomic_prot+0x3c/0xf0
  [<c114e17f>] unmap_single_vma+0x4ef/0x500
  [<c114f007>] unmap_vmas+0x37/0x50
  [<c1154f8f>] exit_mmap+0x5f/0xf0
  [<c104eedd>] mmput+0x2d/0xb0
  [<c105009c>] copy_process+0xd2c/0x13c0
  [<c1050892>] _do_fork+0x82/0x340
  [<c105f2d1>] ? SyS_rt_sigaction+0x51/0xa0
  [<c1050c3c>] SyS_clone+0x2c/0x30
  [<c1001a03>] do_syscall_32_irqs_on+0x53/0xb0
  [<c189a94a>] entry_INT80_32+0x2a/0x2a
---[ end trace be3e0a61097feddc ]---
x86/PAT: Xorg:705 map pfn expected mapping type uncached-minus for [mem 0xfd000000-0xfdffffff], got write-combining

The entry in question is set up by uvesafb, which calls ioremap_wc()
in its uvesafb_ioremap() function.

It appears that Xorg mmaps this from userspace, then later does a
fork() to execute a utility. At this point, when creating the vmas for
the new process, the pat code says "eeek!" because the protection mode
for the new vmas doesn't match the old one. It returns -EINVAL, the
process dies and X goes with it.

There are a few hammers we can hit this with: we can boot with the
"nopat" option, which makes the problem go away, or turn off
CONFIG_X86_PAT. No surprises there. Changing uvesafb to use mtrr=0
doesn't help since the ioremap_wc call still happens.

The real issue is the "expected mapping type uncached-minus for [...],
got write-combining" message; it all goes wrong from there.

After looking at the code and scratching my head for a long while, I
noticed that there are two ways of representing the protection mode
data, "enum page_cache_mode" and "pgprot_t & _PAGE_CACHE_MASK".

The exact meaning of pgprot_t depends on which CPU you're running on;
older CPUs have errata which mean only a small number of bits can be
used. The exact mapping table is determined by __cachemode2pte_tbl and
is updated at boot by calls from update_cache_mode_entry().

The result of this is that if you map enum -> pgprot_t, then try to do
pgprot_t -> enum, you can get different values since it's not a 1:1
mapping.
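To make the round-trip problem concrete, here is a minimal sketch in plain
userspace C. It is a toy model, not the kernel's code: the toy_* names, the
PCM_* enum and the table values are all assumptions chosen for illustration,
and the real tables built by update_cache_mode_entry() may differ. It only
shows how two modes sharing one bit encoding makes enum -> bits -> enum lossy.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the kernel's enum page_cache_mode (hypothetical). */
enum page_cache_mode { PCM_WB, PCM_WC, PCM_UC_MINUS, PCM_UC, PCM_NUM };

/* Toy stand-ins for the two cache-control bits in a page table entry. */
#define TOY_PAGE_PWT 0x1u
#define TOY_PAGE_PCD 0x2u

/* Forward table (assumed values): WC and UC- share a single encoding. */
static const uint8_t toy_cachemode2pte[PCM_NUM] = {
    [PCM_WB]       = 0,
    [PCM_WC]       = TOY_PAGE_PCD,
    [PCM_UC_MINUS] = TOY_PAGE_PCD,
    [PCM_UC]       = TOY_PAGE_PCD | TOY_PAGE_PWT,
};

/* Reverse table: each bit pattern can name only one mode. */
static const enum page_cache_mode toy_pte2cachemode[4] = {
    [0]                           = PCM_WB,
    [TOY_PAGE_PWT]                = PCM_UC_MINUS,
    [TOY_PAGE_PCD]                = PCM_UC_MINUS,
    [TOY_PAGE_PCD | TOY_PAGE_PWT] = PCM_UC,
};

/* enum -> protection bits */
static uint8_t toy_cachemode2protval(enum page_cache_mode pcm)
{
    assert(pcm < PCM_NUM);
    return toy_cachemode2pte[pcm];
}

/* protection bits -> enum */
static enum page_cache_mode toy_protval2cachemode(uint8_t bits)
{
    assert(bits < 4);
    return toy_pte2cachemode[bits];
}

/* enum -> bits -> enum; not guaranteed to return its argument. */
static enum page_cache_mode toy_round_trip(enum page_cache_mode pcm)
{
    return toy_protval2cachemode(toy_cachemode2protval(pcm));
}
```

In this toy table, toy_round_trip(PCM_WC) comes back as PCM_UC_MINUS, which
has the same shape as the "expected uncached-minus, got write-combining"
mismatch in the log.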

This means the comparison in reserve_pfn_range() where it does "pcm !=
want_pcm" isn't correct and can trigger even in cases where there isn't
a problem.

This can be "fixed" by doing cachemode2protval(pcm) !=
cachemode2protval(want_pcm) and checking whether the protection bits
match, rather than the enum values, since in reality this is what we
really care about.
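The difference between the two checks can be sketched in the same toy style.
Again, this is only an illustration under assumed table values: the toy_*
helpers are hypothetical, and this is not the actual reserve_pfn_range() code
or a tested kernel patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the kernel's enum page_cache_mode (hypothetical). */
enum page_cache_mode { PCM_WB, PCM_WC, PCM_UC_MINUS, PCM_UC, PCM_NUM };

/* Assumed encoding in which WC and UC- collapse to the same bits. */
static const uint8_t toy_cachemode2pte[PCM_NUM] = {
    [PCM_WB]       = 0x0,
    [PCM_WC]       = 0x2,
    [PCM_UC_MINUS] = 0x2,
    [PCM_UC]       = 0x3,
};

/* enum -> protection bits */
static uint8_t toy_cachemode2protval(enum page_cache_mode pcm)
{
    assert(pcm < PCM_NUM);
    return toy_cachemode2pte[pcm];
}

/* Enum comparison: reports a conflict whenever the enum values differ. */
static bool toy_conflict_by_enum(enum page_cache_mode pcm,
                                 enum page_cache_mode want_pcm)
{
    return pcm != want_pcm;
}

/* Bit comparison: reports a conflict only if the effective bits differ. */
static bool toy_conflict_by_bits(enum page_cache_mode pcm,
                                 enum page_cache_mode want_pcm)
{
    return toy_cachemode2protval(pcm) != toy_cachemode2protval(want_pcm);
}
```

With these assumed values, the pair (PCM_UC_MINUS, PCM_WC) is a conflict for
the enum comparison but not for the bit comparison, while (PCM_WB, PCM_WC) is
a conflict for both.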

I can confirm that if I make that change, X boots up just fine.

The problem is I really have no idea what I'm doing :).

Could someone who understands this code have a look and see whether the
above makes sense and if it does, perhaps open a discussion with
upstream about how to fix this properly (assuming my change isn't
actually the correct fix)?

We don't see this on qemux86-64 since that has more PAT bits working
and hence the values map correctly.

Bruce: Would you accept a patch doing the above for now?

Cheers,

Richard



--
_______________________________________________
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.openembedded.org/mailman/listinfo/openembedded-core