Re: Major KVM issues with kernel 4.5 on the host

2016-04-23 Thread Borislav Petkov
On Sat, Apr 23, 2016 at 08:43:41PM +0200, Marc Haber wrote: > Uncorrectable errors would still be identified by the ECC hardware, Not if the hardware decides to syncflood so that we don't even get to run the #MC handler... > and the box wouldn't be perfectly fine with an "old" kernel. Maybe the

Re: Major KVM issues with kernel 4.5 on the host

2016-04-23 Thread Dr. David Alan Gilbert
* Marc Haber (mh+linux-ker...@zugschlus.de) wrote: > On Sat, Apr 23, 2016 at 06:04:29PM +0200, Borislav Petkov wrote: > > On Thu, Apr 21, 2016 at 10:04:33PM +0200, Marc Haber wrote: > > > Yes, but there are two symptoms. The VM either suffers file system > > > issues (garbage read from files, or an

Re: Major KVM issues with kernel 4.5 on the host

2016-04-23 Thread Marc Haber
On Sat, Apr 23, 2016 at 06:04:29PM +0200, Borislav Petkov wrote: > On Thu, Apr 21, 2016 at 10:04:33PM +0200, Marc Haber wrote: > > Yes, but there are two symptoms. The VM either suffers file system > > issues (garbage read from files, or an aborted ext4 journal and > > following ro remount) or it s

Re: Major KVM issues with kernel 4.5 on the host

2016-04-23 Thread Borislav Petkov
On Thu, Apr 21, 2016 at 10:04:33PM +0200, Marc Haber wrote: > Yes, but there are two symptoms. The VM either suffers file system > issues (garbage read from files, or an aborted ext4 journal and > following ro remount) or it stops dead in its tracks. Stops dead? What does that mean exactly? Box is

Re: Major KVM issues with kernel 4.5 on the host

2016-04-21 Thread Marc Haber
On Thu, Apr 21, 2016 at 06:51:06PM +0200, Borislav Petkov wrote: > On Thu, Apr 21, 2016 at 04:50:05PM +0200, Marc Haber wrote: > > What bothers me is that since I ended up with a "suspect" commit that > > actually results in a "good" kernel (running for 22 hours now), I must > > have said "bad" to

Re: Major KVM issues with kernel 4.5 on the host

2016-04-21 Thread Borislav Petkov
On Thu, Apr 21, 2016 at 04:50:05PM +0200, Marc Haber wrote: > What bothers me is that since I ended up with a "suspect" commit that > actually results in a "good" kernel (running for 22 hours now), I must > have said "bad" to an actually "good" kernel, which means that I had > an unrelated crash or

Re: Major KVM issues with kernel 4.5 on the host

2016-04-21 Thread Marc Haber
On Thu, Apr 21, 2016 at 02:37:11PM +0200, Borislav Petkov wrote: > On Thu, Apr 21, 2016 at 10:39:48AM +0200, Marc Haber wrote: > > Currently, I cannot explain how this has happened, I must have flagged > > an actually good kernel as bad from my understanding of git bisect. > > > > Can you give adv

Re: Major KVM issues with kernel 4.5 on the host

2016-04-21 Thread Borislav Petkov
On Thu, Apr 21, 2016 at 10:39:48AM +0200, Marc Haber wrote: > Currently, I cannot explain how this has happened, I must have flagged > an actually good kernel as bad from my understanding of git bisect. > > Can you give advice how to continue here? Yap, sounds like you marked a bisection step inc
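[Editor's sketch, not part of the original message.] The effect discussed here, where one wrong "bad" verdict sends a bisection to a commit that is actually good, can be illustrated with a minimal simulation. It assumes a purely linear history of 16 hypothetical commits; real git bisect works on a commit graph, and the commit numbers, FIRST_BAD, and the function names below are made up for illustration.

```python
def bisect(commits, is_bad):
    """Binary-search for the first commit reported bad.

    Assumes commits[0] is known good and commits[-1] is known bad,
    and that the history is linear (a simplification of git bisect).
    """
    lo, hi = 0, len(commits) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


commits = list(range(16))  # hypothetical history: 0 is good, 15 is bad
FIRST_BAD = 11             # hypothetical commit that introduced the bug


def honest(commit):
    # Every verdict reflects the real state of the commit.
    return commit >= FIRST_BAD


def one_wrong_verdict(commit):
    # Commit 7 is really good, but an unrelated crash gets it marked bad.
    return True if commit == 7 else commit >= FIRST_BAD


culprit_ok = bisect(commits, honest)
culprit_wrong = bisect(commits, one_wrong_verdict)
print(culprit_ok, culprit_wrong)
```

With honest verdicts the search ends on the hypothetical culprit. With the single wrong verdict it ends on the mislabeled commit itself, which is good, because every later step only narrows the range below the false "bad" mark. Re-testing the commits marked "bad" closest to the reported result is therefore one plausible way to find the wrong step.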

Re: Major KVM issues with kernel 4.5 on the host

2016-04-21 Thread Marc Haber
On Thu, Apr 14, 2016 at 07:22:20AM +0200, Marc Haber wrote: > On Thu, Apr 14, 2016 at 03:16:29AM +0200, Paolo Bonzini wrote: > > On 14/04/2016 00:29, Marc Haber wrote: > > > On Wed, Apr 13, 2016 at 10:36:34PM +0200, Paolo Bonzini wrote: > > >> Didn't help, but a fresh look at the list of 4.5 patche

Re: Major KVM issues with kernel 4.5 on the host

2016-04-14 Thread Marc Haber
On Thu, Apr 14, 2016 at 07:30:43PM +0200, Paolo Bonzini wrote: > On 14/04/2016 18:47, Marc Haber wrote: > >> > Ok, then I guess bisection is needed. Please first try commit > >> > 45bdbcfdf241. > > I did git checkout 45bdbcfdf241 and built the resulting kernel > > 4.4.0-rc5. This one has now been

Re: Major KVM issues with kernel 4.5 on the host

2016-04-14 Thread Paolo Bonzini
On 14/04/2016 18:47, Marc Haber wrote: >> > Ok, then I guess bisection is needed. Please first try commit >> > 45bdbcfdf241. > I did git checkout 45bdbcfdf241 and built the resulting kernel > 4.4.0-rc5. This one has now been running for ten hours, which is > threefold the longest time that a fau

Re: Major KVM issues with kernel 4.5 on the host

2016-04-14 Thread Marc Haber
On Thu, Apr 14, 2016 at 03:16:29AM +0200, Paolo Bonzini wrote: > Ok, then I guess bisection is needed. Please first try commit > 45bdbcfdf241. I did git checkout 45bdbcfdf241 and built the resulting kernel 4.4.0-rc5. This one has now been running for ten hours, which is threefold the longest time

Re: Major KVM issues with kernel 4.5 on the host

2016-04-13 Thread Marc Haber
On Thu, Apr 14, 2016 at 03:16:29AM +0200, Paolo Bonzini wrote: > Ok, then I guess bisection is needed. Please first try commit > 45bdbcfdf241. That kernel labels itself as "4.4.0-rc5+", is that correct? Greetings Marc -- -

Re: Major KVM issues with kernel 4.5 on the host

2016-04-13 Thread Marc Haber
On Thu, Apr 14, 2016 at 03:16:29AM +0200, Paolo Bonzini wrote: > On 14/04/2016 00:29, Marc Haber wrote: > > On Wed, Apr 13, 2016 at 10:36:34PM +0200, Paolo Bonzini wrote: > >> Didn't help, but a fresh look at the list of 4.5 patches helped. > >> What the hell was I thinking, I missed write_rdtscp_a

Re: Major KVM issues with kernel 4.5 on the host

2016-04-13 Thread Paolo Bonzini
On 14/04/2016 00:29, Marc Haber wrote: > On Wed, Apr 13, 2016 at 10:36:34PM +0200, Paolo Bonzini wrote: >> Didn't help, but a fresh look at the list of 4.5 patches helped. >> What the hell was I thinking, I missed write_rdtscp_aux who >> obviously uses MSR_TSC_AUX. > > I applied this patch to 4.

Re: Major KVM issues with kernel 4.5 on the host

2016-04-13 Thread Marc Haber
On Wed, Apr 13, 2016 at 10:36:34PM +0200, Paolo Bonzini wrote: > Didn't help, but a fresh look at the list of 4.5 patches helped. > What the hell was I thinking, I missed write_rdtscp_aux who > obviously uses MSR_TSC_AUX. I applied this patch to 4.5, which didn't go cleanly, I had to do it manuall

Re: Major KVM issues with kernel 4.5 on the host

2016-04-13 Thread Marc Haber
On Wed, Apr 13, 2016 at 10:36:34PM +0200, Paolo Bonzini wrote: > Didn't help, but a fresh look at the list of 4.5 patches helped. > What the hell was I thinking, I missed write_rdtscp_aux who > obviously uses MSR_TSC_AUX. So you want me to apply that to 4.5 or 4.5.1 and try that? Greetings Marc

Re: Major KVM issues with kernel 4.5 on the host

2016-04-13 Thread Paolo Bonzini
On 13/04/2016 20:22, Marc Haber wrote: >> So I'm not sure what even happens here yet. I haven't seen anything out >> > of the ordinary in Marc's dmesg and I wasn't able to reproduce either. >> > So would it be good to try with "npt=0"? Sure, why not. > npt=0 goes on the kernel command line of the

Re: Major KVM issues with kernel 4.5 on the host

2016-04-13 Thread Paolo Bonzini
On 13/04/2016 20:37, Marc Haber wrote: > On Fri, Mar 18, 2016 at 11:01:46AM +0100, Paolo Bonzini wrote: >> On 17/03/2016 19:11, Borislav Petkov wrote: >>> I'm going to try reproducing the issue on a less "important" machine >>> so that bisecting is less painful, but maybe you guys have an idea >>

Re: Major KVM issues with kernel 4.5 on the host

2016-04-13 Thread Marc Haber
On Fri, Mar 18, 2016 at 11:01:46AM +0100, Paolo Bonzini wrote: > On 17/03/2016 19:11, Borislav Petkov wrote: > > I'm going to try reproducing the issue on a less "important" machine > > so that bisecting is less painful, but maybe you guys have an idea > > what's going wrong here. > > No idea, sor

Re: Major KVM issues with kernel 4.5 on the host

2016-04-13 Thread Marc Haber
On Sun, Mar 20, 2016 at 07:58:13PM +0100, Borislav Petkov wrote: > So I'm not sure what even happens here yet. I haven't seen anything out > of the ordinary in Marc's dmesg and I wasn't able to reproduce either. > So would it be good to try with "npt=0"? Sure, why not. npt=0 goes on the kernel com

Re: Major KVM issues with kernel 4.5 on the host

2016-04-13 Thread Marc Haber
On Sun, Mar 20, 2016 at 02:31:58PM +0100, Borislav Petkov wrote: > On Sat, Mar 19, 2016 at 01:08:37AM +0100, Marc Haber wrote: > > Booting Debian Linux, apt-get update, apt-get upgrade, and run aide > > (which builds checksums for the entire filesystem, a rather disk-bound > > activity). > > So I

Re: Major KVM issues with kernel 4.5 on the host

2016-03-21 Thread Paolo Bonzini
On 19/03/2016 01:08, Marc Haber wrote: >> > >>> > > http://q.bofh.de/~mh/stuff/20160317-fan-syslog-kvm-4.5 >> > >> > This one doesn't want: >> > >> > HTTP request sent, awaiting response... 403 Forbidden >> > 2016-03-18 22:57:46 ERROR 403: Forbidden. > Idiot me. File permissions fixed. > >> >

Re: Major KVM issues with kernel 4.5 on the host

2016-03-20 Thread Borislav Petkov
On Sun, Mar 20, 2016 at 09:42:15PM +0300, Andrey Korolyov wrote: > Yes, I suggested that the issue could fall over a different family as > well to expose explicit corruption of guest pages (as opposed to a > generic corruption in a known case). Probably, but I don't think it is microcode patch r

Re: Major KVM issues with kernel 4.5 on the host

2016-03-20 Thread Andrey Korolyov
On Sun, Mar 20, 2016 at 9:25 PM, Borislav Petkov wrote: > On Sun, Mar 20, 2016 at 08:14:58PM +0300, Andrey Korolyov wrote: >> Kinda naive question - do you run same ucode version as Marc on his device? > > Yeah, we both have 0x01dc. > > In case you're referring to the recent faulty AMD microco

Re: Major KVM issues with kernel 4.5 on the host

2016-03-20 Thread Borislav Petkov
On Sun, Mar 20, 2016 at 08:14:58PM +0300, Andrey Korolyov wrote: > Kinda naive question - do you run same ucode version as Marc on his device? Yeah, we both have 0x01dc. In case you're referring to the recent faulty AMD microcode patch - it doesn't apply here. The boxes in question are family

Re: Major KVM issues with kernel 4.5 on the host

2016-03-20 Thread Andrey Korolyov
On Sun, Mar 20, 2016 at 4:31 PM, Borislav Petkov wrote: > On Sat, Mar 19, 2016 at 01:08:37AM +0100, Marc Haber wrote: >> Booting Debian Linux, apt-get update, apt-get upgrade, and run aide >> (which builds checksums for the entire filesystem, a rather disk-bound >> activity). > > So I did that and

Re: Major KVM issues with kernel 4.5 on the host

2016-03-20 Thread Borislav Petkov
On Sat, Mar 19, 2016 at 01:08:37AM +0100, Marc Haber wrote: > Booting Debian Linux, apt-get update, apt-get upgrade, and run aide > (which builds checksums for the entire filesystem, a rather disk-bound > activity). So I did that and aide ran a whole init and check all the way through and all fine

Re: Major KVM issues with kernel 4.5 on the host

2016-03-19 Thread Paolo Bonzini
On 17/03/2016 19:11, Borislav Petkov wrote: > I'm going to try reproducing the issue on a less "important" machine > so that bisecting is less painful, but maybe you guys have an idea > what's going wrong here. No idea, sorry. :( Bisecting would be great. I'll also try reproducing and bisectin

Major KVM issues with kernel 4.5 on the host

2016-03-19 Thread Marc Haber
Hi, I have a (semi-productive[1]) system ("host") running Debian unstable. On this system, a few VMs (Debian unstable, Debian testing) ("vm1", "vm2", "vm3") are running. I roll my own kernels and take vanilla upstream sources. No distribution patches. Since host was updated to Kernel 4.5, the VMs

Re: Major KVM issues with kernel 4.5 on the host

2016-03-19 Thread Borislav Petkov
+ kvm ML. Do you have any funky messages in host's dmesg? Can you upload a full dmesg from both a good and a bad host kernel? On Thu, Mar 17, 2016 at 05:54:35PM +0100, Marc Haber wrote: > Hi, > > I have a (semi-productive[1]) system ("host") running Debian unstable. > On this system, a few VMs

Re: Major KVM issues with kernel 4.5 on the host

2016-03-18 Thread Marc Haber
Hi Borislav, On Thu, Mar 17, 2016 at 07:11:28PM +0100, Borislav Petkov wrote: > Do you have any funky messages in host's dmesg? Not that I see. > Can you upload a full dmesg from both a good and a bad host kernel? http://q.bofh.de/~mh/stuff/20160317-fan-syslog-kvm-4.4.5 http://q.bofh.de/~mh/st

Re: Major KVM issues with kernel 4.5 on the host

2016-03-18 Thread Marc Haber
Hi Borislav, On Fri, Mar 18, 2016 at 11:04:29PM +0100, Borislav Petkov wrote: > On Fri, Mar 18, 2016 at 07:49:29PM +0100, Marc Haber wrote: > > http://q.bofh.de/~mh/stuff/20160317-fan-syslog-kvm-4.4.5 > > This one I got. > > > http://q.bofh.de/~mh/stuff/20160317-fan-syslog-kvm-4.5 > > This one

Re: Major KVM issues with kernel 4.5 on the host

2016-03-18 Thread Borislav Petkov
On Fri, Mar 18, 2016 at 07:49:29PM +0100, Marc Haber wrote: > http://q.bofh.de/~mh/stuff/20160317-fan-syslog-kvm-4.4.5 This one I got. > http://q.bofh.de/~mh/stuff/20160317-fan-syslog-kvm-4.5 This one doesn't want: HTTP request sent, awaiting response... 403 Forbidden 2016-03-18 22:57:46 ERROR