On 19.11.15 at 11:38, Andrew Cooper wrote:
On 19/11/15 10:24, Jan Beulich wrote:
On 19.11.15 at 00:17, <andrew.coop...@citrix.com> wrote:
The disassembly of do_IRQ now looks like a plausible function, but the
consistently faulting address has no plausible way of generating a
double fault. I therefore suspect that something has caused memory
corruption in the Xen .text section.
Dump of assembler code for function do_IRQ:
0xffff82d080176577 <+0>: push %rbp
0xffff82d080176578 <+1>: mov %rsp,%rbp
0xffff82d08017657b <+4>: push %r15
0xffff82d08017657d <+6>: push %r14
0xffff82d08017657f <+8>: push %r13
0xffff82d080176581 <+10>: push %r12
0xffff82d080176583 <+12>: push %rbx
0xffff82d080176584 <+13>: lea -0x1058(%rsp),%rsp
0xffff82d08017658c <+21>: orq $0x0,(%rsp)
0xffff82d080176591 <+26>: lea 0x1020(%rsp),%rsp
The orq surely has potential for causing a double fault, if %rsp is
near the stack limit. The two LEAs look suspect, presumably a
result of some non-standard option passed to gcc. Removing that
option might already be a step forward.
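To put numbers on the prologue quoted above, here is a small Python sketch that adds up the offsets. The constant and function names are made up for illustration; only the numeric offsets and the count of pushes are taken from the disassembly.

```python
# Offsets copied from the quoted do_IRQ prologue; names are illustrative only.
PUSHES_AFTER_RBP = 5      # %r15, %r14, %r13, %r12, %rbx (pushed after mov %rsp,%rbp)
WORD = 8                  # bytes per push on x86-64
PROBE_LEA = 0x1058        # lea -0x1058(%rsp),%rsp
RESTORE_LEA = 0x1020      # lea 0x1020(%rsp),%rsp


def probe_depth_below_rbp():
    """Distance between %rbp and the address touched by orq $0x0,(%rsp)."""
    return PUSHES_AFTER_RBP * WORD + PROBE_LEA


def net_frame_adjustment():
    """How far %rsp ends up below its post-push value after both LEAs."""
    return PROBE_LEA - RESTORE_LEA


print(hex(probe_depth_below_rbp()))  # 0x1080
print(hex(net_frame_adjustment()))   # 0x38
```

So the orq touches memory 0x1080 bytes (a little over 4K) below %rbp, while the net adjustment after the two LEAs shown is only 0x38 bytes.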
Andrew, Jan - thanks again.
In terms of non-standard options passed to gcc I have tried to make sense of
what flags are actually being used during the build process. I am not
absolutely sure, but I think the options passed to gcc are as follows:
I do have system wide flags which are used for non-debug builds:
CFLAGS="-march=native -O2 -pipe -fomit-frame-pointer"
CXXFLAGS="${CFLAGS}"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
for builds with debug symbols (using splitdebug) there are system wide
overrides as follows:
CFLAGS="-march=native -O2 -pipe -ggdb"
CXXFLAGS="${CFLAGS}"
LDFLAGS: I'd assume that this inherits its value from the system wide setting
of LDFLAGS
for xen (the hypervisor) the build system seems to do the following:
CFLAGS="" (i.e. unset CFLAGS)
to me this indicates that the rest stays untouched (i.e. either standard or
debug flags)
for xen-tools (includes e.g. hvmloader) the build system appears to do the
following:
CFLAGS="" (i.e. unset CFLAGS)
CXXFLAGS="${CXXFLAGS} -fno-strict-overflow"
LDFLAGS="" (i.e. unset LDFLAGS)
So I think there's probably nothing really fancy in my options to gcc.
Actually yes - that is a huge quantity of stack usage.
(The actual behaviour looks very suspect - it appears to be completely
pointless).
The #DF handler reports that %rsp in the exception frame is within
range. Having said that,
(XEN) [ 2.788209] rbp: ffff83080ca8ed78 rsp: ffff83080ca8dcf8 r8: ffff83080ca9d558
...
(XEN) [ 2.837474] Valid stack range: ffff83080ca8e000-ffff83080ca90000, sp=ffff83080ca8dcf8, tss.esp0=ffff83080ca8ffc0
(XEN) [ 2.848969] No stack overflow detected. Skipping stack trace.
In this case, the stack pointer *is* out of range, and has hit the guard
page.
This means:
1) There is some bug in the stack overflow detection in the #DF handler.
2) Whatever options Gentoo compiles Xen with are sufficient to overflow
the 8K hypervisor stack.
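The values from the log lines above can be checked by hand; a minimal Python sketch follows. The addresses are copied from the (XEN) output, while the function name and the half-open interpretation of the range (low inclusive, high exclusive) are assumptions made for illustration and are not taken from the Xen source.

```python
# Addresses copied from the (XEN) log lines above.
STACK_LOW = 0xffff83080ca8e000
STACK_HIGH = 0xffff83080ca90000
RSP = 0xffff83080ca8dcf8
RBP = 0xffff83080ca8ed78


def in_stack_range(sp, low=STACK_LOW, high=STACK_HIGH):
    """Assumed check: sp is valid if low <= sp < high."""
    return low <= sp < high


print(hex(STACK_HIGH - STACK_LOW))  # 0x2000, i.e. the 8K stack
print(in_stack_range(RSP))          # False
print(hex(STACK_LOW - RSP))         # 0x308 bytes below the valid range
print(hex(RBP - RSP))               # 0x1080
```

The last figure equals the five 8-byte pushes plus the 0x1058 from the first lea in the do_IRQ prologue, which fits with the fault being raised by the orq probe.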
Thanks, Atom2
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel