[Bug 1831225] Re: guest migration 100% cpu freeze bug

2021-05-09 Thread Thomas Huth
This is an automated cleanup. This bug report has been moved to QEMU's new bug tracker on gitlab.com and thus gets marked as 'expired' now. Please continue with the discussion here: https://gitlab.com/qemu-project/qemu/-/issues/223 ** Changed in: qemu Status: Confirmed => Expired ** Bug

[Bug 1831225] Re: guest migration 100% cpu freeze bug

2021-05-04 Thread Thomas Huth
** Changed in: qemu Status: Incomplete => Confirmed -- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1831225 Title: guest migration 100% cpu freeze bug Status in QEMU: Confirmed Bug descrip

[Bug 1831225] Re: guest migration 100% cpu freeze bug

2021-05-04 Thread Dion Bosschieter
Hi Thomas, We are still seeing this every once in a while. I can definitely tell you that it is connected to older Linux guest kernels, but we have not been able to identify a specific version, which would make searching for a fix commit a bit easier. We are going to upgrade all our host kernels to

[Bug 1831225] Re: guest migration 100% cpu freeze bug

2021-04-21 Thread Thomas Huth
The QEMU project is currently considering moving its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. If you still think this bug report here is valid, then please switch the

[Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-09-25 Thread Dion Bosschieter
We have found a VM that recovered from a freeze, and it seems to have jumped in time. Below I have pasted a dump of tty1; it was OCR'd, so some characters could have been misinterpreted. hild [13198552.767867] le-rss:010 Killed process 10374 (crop) total,r,4376400, anon-rss,018, tl [13

[Qemu-devel] [Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-09-11 Thread Dion Bosschieter
Hi, this is an update after some extended tests and a fallback migration to 4.14. After doing another >10k migrations we can say with certainty that we also encounter this issue on kernel 4.14. We migrate VPSes from servers serially (one after the other). And we notice that on some servers we enc

[Qemu-devel] [Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-06-28 Thread Frank Schreuder
This round we built 4.19.55, disabled the kvm_intel.preemption_timer parameter and ensured kvm.lapic_timer_advance_ns is 0, as advised by Paolo Bonzini. Sadly, yet again we encountered a freeze. Any other suggestions?
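A minimal sketch for confirming the two host settings mentioned above before starting a test run. It assumes the parameters are exposed under `/sys/module/<module>/parameters/`, which is where Linux publishes readable module parameters; the helper name `read_module_param` and the `sysfs_root` argument are hypothetical and exist only to make the check testable.

```python
from pathlib import Path


def read_module_param(module, param, sysfs_root="/sys/module"):
    """Return a kernel module parameter as a stripped string.

    Returns None when the parameter file is missing or unreadable,
    for example when the module is not loaded on this host.
    """
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    # The two settings named in the message above.
    for module, param in (
        ("kvm_intel", "preemption_timer"),
        ("kvm", "lapic_timer_advance_ns"),
    ):
        print(f"{module}.{param} = {read_module_param(module, param)}")
```

Run on the hypervisor, this prints the live values, so a test round can be discarded early if a setting did not take effect after a reboot or module reload.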

[Qemu-devel] [Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-06-21 Thread Frank Schreuder
An update on our further research. We tried bumping the hypervisor kernel from 4.19.43 to 4.19.50, which included the following patch that we hoped was related to our issue: https://lore.kernel.org/lkml/20190520115253.743557...@linuxfoundation.org/#t Sadly after a few thousand migrations we en

[Qemu-devel] [Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-06-21 Thread Paolo Bonzini
Could you try running the guests without the TSC_DEADLINE CPUID flag set?
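One way to verify the result of such a change from inside a guest is to look at the CPU flags it reports. The sketch below assumes a Linux guest, where the feature appears in `/proc/cpuinfo` as `tsc_deadline_timer`; the function names are hypothetical, and how the flag gets masked on the host side is not covered here.

```python
def cpu_flags(cpuinfo_text):
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def has_tsc_deadline(cpuinfo_text):
    """True if the guest still sees the TSC deadline timer feature."""
    return "tsc_deadline_timer" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        print("tsc_deadline_timer visible:", has_tsc_deadline(f.read()))
```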

[Qemu-devel] [Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-06-07 Thread Dion Bosschieter
Hi David, I have dug further into our issue and we have seen issues only when migrating from servers that have a different TSC frequency. Example: Source (kernel 4.14.63) [2.068227] tsc: Refined TSC clocksource calibration: 2593.906 MHz [2.068373] clocksource: tsc: mask: 0x
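A rough sketch of how the comparison described above could be automated across a pool of hosts. It extracts the refined TSC frequency from dmesg text using the line format quoted in the message; the function names and the 0.5 MHz tolerance are assumptions chosen for illustration, not values taken from the thread.

```python
import re

# Matches the calibration line format quoted in the message above.
_TSC_RE = re.compile(
    r"tsc: Refined TSC clocksource calibration: ([0-9]+(?:\.[0-9]+)?) MHz"
)


def refined_tsc_mhz(dmesg_text):
    """Return the refined TSC frequency in MHz, or None if no line matches."""
    match = _TSC_RE.search(dmesg_text)
    return float(match.group(1)) if match else None


def tsc_mismatch(source_dmesg, dest_dmesg, tolerance_mhz=0.5):
    """Return True if the two hosts report different TSC frequencies.

    Returns None when either frequency cannot be determined.
    tolerance_mhz is an arbitrary illustrative threshold.
    """
    src = refined_tsc_mhz(source_dmesg)
    dst = refined_tsc_mhz(dest_dmesg)
    if src is None or dst is None:
        return None
    return abs(src - dst) > tolerance_mhz
```

Feeding it the dmesg output of a migration source and its destination gives a quick yes/no on whether that pair falls into the category described above.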

[Qemu-devel] [Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-06-05 Thread Dion Bosschieter
Hi Jean, Could you elaborate: is it the QEMU patch that you applied earlier, and did you not apply it to the QEMU version you are currently running? Could you try to get a crash dump from a frozen VM working, to see whether you get the same kind of backtrace there? Which specific QEMU version are you running, whi

[Qemu-devel] [Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-06-05 Thread Jean-Philippe Menil
Hi, I suffer from this bug too (or a very similar one) on 4.15.0-50-generic, without the patch mentioned earlier (I used this patch last year to migrate from a previous QEMU version). Jean-Philippe

[Qemu-devel] [Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-06-05 Thread Dr. David Alan Gilbert
You say it's only happened since 4.19 - that's possible - but since this bug is so tricky to trigger it's also possible that any slight change in 4.19 merely makes it more likely to trigger. You could try disabling kvm_clock? Dave

[Qemu-devel] [Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-06-04 Thread Dion Bosschieter
It seems like the patch is applied to the guest's source kernel. crash> clock_event_device struct clock_event_device { void (*event_handler)(struct clock_event_device *); int (*set_next_event)(unsigned long, struct clock_event_device *); int (*set_next_ktime)(ktime_t, struct clock_event

[Qemu-devel] [Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-06-04 Thread Dion Bosschieter
Is there a way to check whether the clock did indeed jump a bit? We can try to read some kernel variables. We just got another hung guest's crash dump working; this VM also shows a weird uptime: DATE: Fri Dec 23 09:06:16 2603 UPTIME: 106752 days, 00:10:35 T
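As a side calculation on the uptime quoted above: the largest value a signed 64-bit nanosecond counter can hold corresponds to just under 106752 days, so the reported figure sits within about 23 minutes of that limit. This is arithmetic only, not a diagnosis made in the thread, and whether the guest's uptime really derives from such a counter is an assumption. The helper name `split_uptime` is hypothetical.

```python
NS_PER_SEC = 1_000_000_000
SEC_PER_DAY = 86_400


def split_uptime(seconds):
    """Split a number of seconds into (days, hours, minutes, seconds)."""
    days, rem = divmod(seconds, SEC_PER_DAY)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return days, hours, minutes, secs


# Largest signed 64-bit nanosecond value, expressed in whole seconds.
int64_max_seconds = (2**63 - 1) // NS_PER_SEC

# Uptime reported in the crash dump: 106752 days, 00:10:35.
reported_seconds = 106752 * SEC_PER_DAY + 10 * 60 + 35

print(split_uptime(int64_max_seconds))       # (106751, 23, 47, 16)
print(reported_seconds - int64_max_seconds)  # 1399 seconds, about 23 minutes
```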

[Qemu-devel] [Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-05-31 Thread Dr. David Alan Gilbert
Interesting; I'd seen something similar in Red Hat bug https://bugzilla.redhat.com/show_bug.cgi?id=1538078, and as well as the bogus date we'd had lots of log messages of the form: CE: lapic increasing min_delta_ns to nsec. We were reckoning the clock jumped a bit during the migrate, and then trigger

[Qemu-devel] [Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-05-31 Thread Dion Bosschieter
Hi Alan, Dmesg shows nothing special: [29891577.708544] IPv6 addrconf: prefix with wrong length 48 [29891580.650637] IPv6 addrconf: prefix with wrong length 48 [29891582.013656] IPv6 addrconf: prefix with wrong length 48 [29891583.753246] IPv6 addrconf: prefix with wrong length 48 [29891585.39794

[Qemu-devel] [Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-05-31 Thread Dr. David Alan Gilbert
Hi Dion, Since you've got a crash dump, can you check the dmesg in the guest to see if there are any messages?

[Qemu-devel] [Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-05-31 Thread Dion Bosschieter
A virsh dumpxml of one of the guests: vps12 -953c-d629-1276-0616 4194304 4194304 2 /machine hvm Westmere destroy restart restart /usr/bin/kvm