On Sat, Apr 23, 2016 at 08:43:41PM +0200, Marc Haber wrote:
> Uncorrectable errors would still be identified by the ECC hardware,
Not if the hardware decides to syncflood so that we don't even get to
run the #MC handler...
> and the box wouldn't be perfectly fine with an "old" kernel.
Maybe the
* Marc Haber (mh+linux-ker...@zugschlus.de) wrote:
> On Sat, Apr 23, 2016 at 06:04:29PM +0200, Borislav Petkov wrote:
> > On Thu, Apr 21, 2016 at 10:04:33PM +0200, Marc Haber wrote:
> > > Yes, but there are two symptoms. The VM either suffers file system
> > > issues (garbage read from files, or an
On Sat, Apr 23, 2016 at 06:04:29PM +0200, Borislav Petkov wrote:
> On Thu, Apr 21, 2016 at 10:04:33PM +0200, Marc Haber wrote:
> > Yes, but there are two symptoms. The VM either suffers file system
> > issues (garbage read from files, or an aborted ext4 journal and
> > following ro remount) or it s
On Thu, Apr 21, 2016 at 10:04:33PM +0200, Marc Haber wrote:
> Yes, but there are two symptoms. The VM either suffers file system
> issues (garbage read from files, or an aborted ext4 journal and
> following ro remount) or it stops dead in its tracks.
Stops dead? What does that mean exactly? Box is
On Thu, Apr 21, 2016 at 06:51:06PM +0200, Borislav Petkov wrote:
> On Thu, Apr 21, 2016 at 04:50:05PM +0200, Marc Haber wrote:
> > What bothers me is that since I ended up with a "suspect" commit that
> > actually results in a "good" kernel (running for 22 hours now), I must
> > have said "bad" to
On Thu, Apr 21, 2016 at 04:50:05PM +0200, Marc Haber wrote:
> What bothers me is that since I ended up with a "suspect" commit that
> actually results in a "good" kernel (running for 22 hours now), I must
> have said "bad" to an actually "good" kernel, which means that I had
> an unrelated crash or
On Thu, Apr 21, 2016 at 02:37:11PM +0200, Borislav Petkov wrote:
> On Thu, Apr 21, 2016 at 10:39:48AM +0200, Marc Haber wrote:
> > Currently, I cannot explain how this has happened, I must have flagged
> > an actually good kernel as bad from my understanding of git bisect.
> >
> > Can you give adv
On Thu, Apr 21, 2016 at 10:39:48AM +0200, Marc Haber wrote:
> Currently, I cannot explain how this has happened, I must have flagged
> an actually good kernel as bad from my understanding of git bisect.
>
> Can you give advice how to continue here?
Yap, sounds like you marked a bisection step inc
On Thu, Apr 14, 2016 at 07:22:20AM +0200, Marc Haber wrote:
> On Thu, Apr 14, 2016 at 03:16:29AM +0200, Paolo Bonzini wrote:
> > On 14/04/2016 00:29, Marc Haber wrote:
> > > On Wed, Apr 13, 2016 at 10:36:34PM +0200, Paolo Bonzini wrote:
> > >> Didn't help, but a fresh look at the list of 4.5 patche
On Thu, Apr 14, 2016 at 07:30:43PM +0200, Paolo Bonzini wrote:
> On 14/04/2016 18:47, Marc Haber wrote:
> >> > Ok, then I guess bisection is needed. Please first try commit
> >> > 45bdbcfdf241.
> > I did git checkout 45bdbcfdf241 and built the resulting kernel
> > 4.4.0-rc5. This one has now been
On 14/04/2016 18:47, Marc Haber wrote:
>> > Ok, then I guess bisection is needed. Please first try commit
>> > 45bdbcfdf241.
> I did git checkout 45bdbcfdf241 and built the resulting kernel
> 4.4.0-rc5. This one has now been running for ten hours, which is
> threefold the longest time that a fau
On Thu, Apr 14, 2016 at 03:16:29AM +0200, Paolo Bonzini wrote:
> Ok, then I guess bisection is needed. Please first try commit
> 45bdbcfdf241.
I did git checkout 45bdbcfdf241 and built the resulting kernel
4.4.0-rc5. This one has now been running for ten hours, which is
threefold the longest time
On Thu, Apr 14, 2016 at 03:16:29AM +0200, Paolo Bonzini wrote:
> Ok, then I guess bisection is needed. Please first try commit
> 45bdbcfdf241.
That kernel labels itself as "4.4.0-rc5+", is that correct?
Greetings
Marc
--
-
On Thu, Apr 14, 2016 at 03:16:29AM +0200, Paolo Bonzini wrote:
> On 14/04/2016 00:29, Marc Haber wrote:
> > On Wed, Apr 13, 2016 at 10:36:34PM +0200, Paolo Bonzini wrote:
> >> Didn't help, but a fresh look at the list of 4.5 patches helped.
> >> What the hell was I thinking, I missed write_rdtscp_a
On 14/04/2016 00:29, Marc Haber wrote:
> On Wed, Apr 13, 2016 at 10:36:34PM +0200, Paolo Bonzini wrote:
>> Didn't help, but a fresh look at the list of 4.5 patches helped.
>> What the hell was I thinking, I missed write_rdtscp_aux which
>> obviously uses MSR_TSC_AUX.
>
> I applied this patch to 4.
On Wed, Apr 13, 2016 at 10:36:34PM +0200, Paolo Bonzini wrote:
> Didn't help, but a fresh look at the list of 4.5 patches helped.
> What the hell was I thinking, I missed write_rdtscp_aux which
> obviously uses MSR_TSC_AUX.
I applied this patch to 4.5, which didn't go cleanly, I had to do it
manuall
On Wed, Apr 13, 2016 at 10:36:34PM +0200, Paolo Bonzini wrote:
> Didn't help, but a fresh look at the list of 4.5 patches helped.
> What the hell was I thinking, I missed write_rdtscp_aux which
> obviously uses MSR_TSC_AUX.
So you want me to apply that to 4.5 or 4.5.1 and try that?
Greetings
Marc
On 13/04/2016 20:22, Marc Haber wrote:
>> So I'm not sure what even happens here yet. I haven't seen anything out
>> > of the ordinary in Marc's dmesg and I wasn't able to reproduce either.
>> > So would it be good to try with "npt=0"? Sure, why not.
> npt=0 goes on the kernel command line of the
On 13/04/2016 20:37, Marc Haber wrote:
> On Fri, Mar 18, 2016 at 11:01:46AM +0100, Paolo Bonzini wrote:
>> On 17/03/2016 19:11, Borislav Petkov wrote:
>>> I'm going to try reproducing the issue on a less "important" machine
>>> so that bisecting is less painful, but maybe you guys have an idea
>>
On Fri, Mar 18, 2016 at 11:01:46AM +0100, Paolo Bonzini wrote:
> On 17/03/2016 19:11, Borislav Petkov wrote:
> > I'm going to try reproducing the issue on a less "important" machine
> > so that bisecting is less painful, but maybe you guys have an idea
> > what's going wrong here.
>
> No idea, sor
On Sun, Mar 20, 2016 at 07:58:13PM +0100, Borislav Petkov wrote:
> So I'm not sure what even happens here yet. I haven't seen anything out
> of the ordinary in Marc's dmesg and I wasn't able to reproduce either.
> So would it be good to try with "npt=0"? Sure, why not.
npt=0 goes on the kernel com
On Sun, Mar 20, 2016 at 02:31:58PM +0100, Borislav Petkov wrote:
> On Sat, Mar 19, 2016 at 01:08:37AM +0100, Marc Haber wrote:
> > Booting Debian Linux, apt-get update, apt-get upgrade, and run aide
> > (which builds checksums for the entire filesystem, a rather disk-bound
> > activity).
>
> So I
On 19/03/2016 01:08, Marc Haber wrote:
>> >
>>> > > http://q.bofh.de/~mh/stuff/20160317-fan-syslog-kvm-4.5
>> >
>> > This one doesn't want:
>> >
>> > HTTP request sent, awaiting response... 403 Forbidden
>> > 2016-03-18 22:57:46 ERROR 403: Forbidden.
> Idiot me. File permissions fixed.
>
>> >
On Sun, Mar 20, 2016 at 09:42:15PM +0300, Andrey Korolyov wrote:
> Yes, I suggested that the issue could fall over a different family as
> well to expose explicit corruption of guest pages (as opposed to a
> generic corruption in a known case).
Probably, but I don't think it is microcode patch r
On Sun, Mar 20, 2016 at 9:25 PM, Borislav Petkov wrote:
> On Sun, Mar 20, 2016 at 08:14:58PM +0300, Andrey Korolyov wrote:
>> Kinda naive question - do you run the same ucode version as Marc on his device?
>
> Yeah, we both have 0x01dc.
>
> In case you're referring to the recent faulty AMD microco
On Sun, Mar 20, 2016 at 08:14:58PM +0300, Andrey Korolyov wrote:
> Kinda naive question - do you run the same ucode version as Marc on his device?
Yeah, we both have 0x01dc.
In case you're referring to the recent faulty AMD microcode patch -
it doesn't apply here. The boxes in question are family
On Sun, Mar 20, 2016 at 4:31 PM, Borislav Petkov wrote:
> On Sat, Mar 19, 2016 at 01:08:37AM +0100, Marc Haber wrote:
>> Booting Debian Linux, apt-get update, apt-get upgrade, and run aide
>> (which builds checksums for the entire filesystem, a rather disk-bound
>> activity).
>
> So I did that and
On Sat, Mar 19, 2016 at 01:08:37AM +0100, Marc Haber wrote:
> Booting Debian Linux, apt-get update, apt-get upgrade, and run aide
> (which builds checksums for the entire filesystem, a rather disk-bound
> activity).
So I did that and aide ran a whole init and check all the way through
and all fine
On 17/03/2016 19:11, Borislav Petkov wrote:
> I'm going to try reproducing the issue on a less "important" machine
> so that bisecting is less painful, but maybe you guys have an idea
> what's going wrong here.
No idea, sorry. :( Bisecting would be great. I'll also try reproducing
and bisectin
Hi,
I have a (semi-productive[1]) system ("host") running Debian unstable.
On this system, a few VMs (Debian unstable, Debian testing) ("vm1",
"vm2", "vm3") are running. I roll my own kernels and take vanilla
upstream sources. No distribution patches.
Since host was updated to Kernel 4.5, the VMs
+ kvm ML.
Do you have any funky messages in host's dmesg? Can you upload a full
dmesg from both a good and a bad host kernel?
On Thu, Mar 17, 2016 at 05:54:35PM +0100, Marc Haber wrote:
> Hi,
>
> I have a (semi-productive[1]) system ("host") running Debian unstable.
> On this system, a few VMs
Hi Borislav,
On Thu, Mar 17, 2016 at 07:11:28PM +0100, Borislav Petkov wrote:
> Do you have any funky messages in host's dmesg?
Not that I see.
> Can you upload a full dmesg from both a good and a bad host kernel?
http://q.bofh.de/~mh/stuff/20160317-fan-syslog-kvm-4.4.5
http://q.bofh.de/~mh/st
Hi Borislav,
On Fri, Mar 18, 2016 at 11:04:29PM +0100, Borislav Petkov wrote:
> On Fri, Mar 18, 2016 at 07:49:29PM +0100, Marc Haber wrote:
> > http://q.bofh.de/~mh/stuff/20160317-fan-syslog-kvm-4.4.5
>
> This one I got.
>
> > http://q.bofh.de/~mh/stuff/20160317-fan-syslog-kvm-4.5
>
> This one
On Fri, Mar 18, 2016 at 07:49:29PM +0100, Marc Haber wrote:
> http://q.bofh.de/~mh/stuff/20160317-fan-syslog-kvm-4.4.5
This one I got.
> http://q.bofh.de/~mh/stuff/20160317-fan-syslog-kvm-4.5
This one doesn't want:
HTTP request sent, awaiting response... 403 Forbidden
2016-03-18 22:57:46 ERROR