>>> On 27.02.15 at 18:50, wrote:
> If this issue were to happen on Linux/bare-metal, this is how I'd debug it.
> Hopefully some of this will translate to Xen in one way or another.
Sadly not really - the kernel plays only a minor role (forwarding ACPI
data to the hypervisor) in C-state handling u
(Please forgive my lack of Xen-fu knowledge in advance)
If this issue were to happen on Linux/bare-metal, this is how I'd debug it.
Hopefully some of this will translate to Xen in one way or another.
dmesg | grep idle
will tell us what idle driver is running (on Dom0 kernel)
and if it is intel_id
Thursday, November 27, 2014 2:28 AM
To: Steve Freitas; Dugger, Donald D; Nakajima, Jun
Cc: xen-devel@lists.xen.org; Don Slutz
Subject: Re: [Xen-devel] Regression, host crash with 4.5rc1
>>> On 27.11.14 at 06:29, wrote:
> On 11/25/2014 03:00 AM, Jan Beulich wrote:
>> Okay, so it'
To: Steve Freitas; Dugger, Donald D; Nakajima, Jun
Cc: xen-devel@lists.xen.org; Don Slutz
Subject: Re: [Xen-devel] Regression, host crash with 4.5rc1
>>> On 27.11.14 at 06:29, wrote:
> On 11/25/2014 03:00 AM, Jan Beulich wrote:
>> Okay, so it's not really the mwait-idle driver caus
On Nov 28, 2014, at 00:50, Jan Beulich wrote:
On 28.11.14 at 09:24, wrote:
>>> And with 6 errata
>>> documented it's not all that unlikely that there's a 7th one with
>>> MONITOR/MWAIT behavior. The commit you bisected to (and
>>> which you had verified to be the culprit by just forcing
>>
>>> On 28.11.14 at 09:24, wrote:
>> And with 6 errata
>> documented it's not all that unlikely that there's a 7th one with
>> MONITOR/MWAIT behavior. The commit you bisected to (and
>> which you had verified to be the culprit by just forcing
>> arch_skip_send_event_check() to always return false)
On 11/27/2014 01:27 AM, Jan Beulich wrote:
This was precisely the reason why I told you that the numbering
differs (and is confusing and has nothing to do with actual C state
numbers): What max_cstate refers to in the mwait-idle driver is
what above is listed as type[Cx], i.e. the state at index
>>> On 27.11.14 at 06:29, wrote:
> On 11/25/2014 03:00 AM, Jan Beulich wrote:
>> Okay, so it's not really the mwait-idle driver causing the regression,
>> but it is C-state related. Hence we're now down to seeing whether all
>> or just the deeper C states are affected, i.e. I now need to ask you
>
On 11/25/2014 03:00 AM, Jan Beulich wrote:
Okay, so it's not really the mwait-idle driver causing the regression,
but it is C-state related. Hence we're now down to seeing whether all
or just the deeper C states are affected, i.e. I now need to ask you
to play with "max_cstate=". For that you'll
>>> On 25.11.14 at 10:38, wrote:
> On 11/25/2014 12:16 AM, Jan Beulich wrote:
>> Interesting, so other than for me (perhaps due to other patches
>> I have in my tree) the change resulted in C states now being used
>> again despite mwait-idle=0, which is good. Question now is - with
>> this being t
On 11/25/2014 12:16 AM, Jan Beulich wrote:
(XEN) 'c' pressed -> printing ACPI Cx structures
(XEN) ==cpu0==
(XEN) active state:C0
(XEN) max_cstate:C7
(XEN) states:
(XEN) C1:type[C1] latency[001] usage[5664] method[ FFH]
duration[4042540627]
(XEN) C2:type[C3] l
>>> On 24.11.14 at 23:17, wrote:
> I'm combining this action with your patch, see below. Please let me know
> if this was incorrect.
Thanks, that's perfectly fine.
> (XEN) 'c' pressed -> printing ACPI Cx structures
> (XEN) ==cpu0==
> (XEN) active state:C0
> (XEN) max_cstate:C7
>
On 24.11.14 at 10:08, wrote:
On Nov 24, 2014, at 00:45, Jan Beulich wrote:
On 23.11.14 at 02:28, wrote:
As promised, below is the apic-verbosity=debug log, with 'i'. Thanks!
I'm sorry, I misspelled the option, it's really "apic_verbosity=debug".
The 'i' output at least already confirms that
>>> On 24.11.14 at 10:08, wrote:
> On Nov 24, 2014, at 00:45, Jan Beulich wrote:
>
> On 23.11.14 at 02:28, wrote:
>>> With mwait-idle=0:
>>>
>>> (XEN) 'c' pressed -> printing ACPI Cx structures
>>> (XEN) ==cpu0==
>>> (XEN) active state: C0
>>> (XEN) max_cstate: C7
>>> On 24.11.14 at 10:08, wrote:
> On Nov 24, 2014, at 00:45, Jan Beulich wrote:
> On 23.11.14 at 02:28, wrote:
>>> As promised, below is the apic-verbosity=debug log, with 'i'. Thanks!
>>
>> I'm sorry, I misspelled the option, it's really "apic_verbosity=debug".
>> The 'i' output at least
On Nov 24, 2014, at 00:45, Jan Beulich wrote:
On 23.11.14 at 02:28, wrote:
>> With mwait-idle=0:
>>
>> (XEN) 'c' pressed -> printing ACPI Cx structures
>> (XEN) ==cpu0==
>> (XEN) active state: C0
>> (XEN) max_cstate: C7
>> (XEN) states:
>> (XEN) C1: type[C1]
>>> On 23.11.14 at 02:28, wrote:
> With mwait-idle=0:
>
> (XEN) 'c' pressed -> printing ACPI Cx structures
> (XEN) ==cpu0==
> (XEN) active state: C0
> (XEN) max_cstate: C7
> (XEN) states:
> (XEN) C1: type[C1] latency[001] usage[] method[ FFH]
> duration[0
On 11/21/2014 0:42, Jan Beulich wrote:
On 20.11.14 at 21:07, wrote:
Running with mwait-idle=0 solves (hides?) the problem. Next step is to
fiddle with the C states?
For that I'd first of all like to know how much use of C states the
system still makes with that option in place. For that I'd ne
>>> On 20.11.14 at 21:07, wrote:
> Running with mwait-idle=0 solves (hides?) the problem. Next step is to
> fiddle with the C states?
So this also prompted me to go over the list of errata. Just to confirm
- your CPU is family 6 model 44? What stepping? And what nominal
frequency?
There are a c
>>> On 20.11.14 at 21:07, wrote:
> Running with mwait-idle=0 solves (hides?) the problem. Next step is to
> fiddle with the C states?
For that I'd first of all like to know how much use of C states the
system still makes with that option in place. For that I'd need the
output of "xenpm get-cpuid
Hi Jan,
Thanks for all your help so far! Here's my latest update.
On 11/17/2014 23:54, Jan Beulich wrote:
Plus, without said adjustment, first just disable the
MWAIT CPU idle driver ("mwait-idle=0") and then, if that didn't make
a difference, use of C states altogether ("cpuidle=0"). If any of
>>> On 20.11.14 at 02:23, wrote:
> On 11/17/2014 23:54, Jan Beulich wrote:
>> Another thing - now that serial logging appears to be working for
>> you, did you try whether the host, once hung, still reacts to serial
>> input (perhaps force input to go to Xen right at boot via the
>> "conswitch=" o
On 11/17/2014 23:54, Jan Beulich wrote:
On 17.11.14 at 20:21, wrote:
Okay, I did a bisection and was not able to correlate the above error
message with the problem I'm seeing. Not saying it's not related, but I
had plenty of successful test runs in the presence of that error.
Took me about a w
>>> On 17.11.14 at 20:21, wrote:
> Okay, I did a bisection and was not able to correlate the above error
> message with the problem I'm seeing. Not saying it's not related, but I
> had plenty of successful test runs in the presence of that error.
>
> Took me about a week (sometimes it takes as
Hi Jan,
On 11/11/2014 0:05, Jan Beulich wrote:
And these
[ 199.775209] pcieport :00:03.0: AER: Multiple Corrected error
received: id=0018
[ 199.775238] pcieport :00:03.0: PCIe Bus Error:
severity=Corrected, type=Data Link Layer, id=0018(Transmitter ID)
[ 199.7
>>> On 10.11.14 at 21:05, wrote:
> On 11/10/2014 0:51, Jan Beulich wrote:
>> Raising the kernel log level to maximum too would have helped.
>
> Okay, I've done that and the output is here, let me know if you have any
> preferred logging flags instead:
>
> http://pastebin.com/M3yvWNTT
Hmm, I c
On 11/10/2014 0:51, Jan Beulich wrote:
On 10.11.14 at 09:03, wrote:
Sorry for the delay, took some debugging on another computer to get
serial logging working. Due to its size, I've posted the entire log of a
crashed session here: http://pastebin.com/AiPHUZRH In this case I used a
3.0 gig memor
>>> On 10.11.14 at 09:03, wrote:
> Sorry for the delay, took some debugging on another computer to get
> serial logging working. Due to its size, I've posted the entire log of a
> crashed session here: http://pastebin.com/AiPHUZRH In this case I used a
> 3.0 gig memory size for the Windows domU
On 11/04/2014 02:15 AM, Jan Beulich wrote:
In fact this would not just be nice, but is strictly needed, and (as
with any pass-through related problems) additionally requires
running with "iommu=debug" alongside the usual need of setting
the log levels to the maximum.
Hi all,
Sorry for the del