Taylor R Campbell <riastr...@netbsd.org> writes:

>> Date: Wed, 04 Jan 2023 14:43:25 -0500
>> From: Brad Spencer <b...@anduin.eldar.org>
>>
>> So... I have a PV+PVSHIM DOMU running a pretty recent 9.x on a DOM0
>> running a 9.99.xx kernel.  The DOM0 is not large, a 4 processor E-2224
>> with 32GB of memory.  The DOMU has 2 VCPUs and 8GB of memory.  About
>> every day a very particular DOMU tosses the:
>>
>> WARNING: negative runtime; monotonic clock has gone backwards
>
> Does this still happen?

As far as I know it still happens.  I ended up running just a single
vcpu on the DOMU much of the time, rebooting it with two vcpus only when
the system needed more.  The negative runtime does not happen when there
is only one vcpu on the DOMU.  Whenever I have forgotten to reboot the
DOMU back to one vcpu, the negative runtime message has always appeared
within a couple of days.

> Can you either:

Yes, I can perform as much of this as needed once I get some other stuff
in life dealt with, more towards the end of the month.  I really won't
have any time before then.

> 1. share the output of `vmstat -e | grep -e tsc -e systime -e
>    hardclock' after you get the console warning;

The DOMU currently only has 1 vcpu, but here is the output now:

vcpu0 raw systime went backwards                   46579    0 intr

When I have real time later I will force the negative runtime to happen
and run the above again.

> 2. run
>
>       dtrace -n 'sdt:xen:clock: { printf("%d %d %d %d %d %d %d",
>           arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }'
>
>    on the system, and leave it running with output directed to a file,
>    and share the output when you see the console warning; or

The DOMU is a 9.3_STABLE from around November 8th, and when I attempted
to run the above dtrace it didn't work.  I got this in the messages:

[ 1792486.921759] kobj_checksyms, 988: [dtrace]: linker error: symbol `dtrace_invop_calltrap_addr' not found
[ 1792486.921759] kobj_checksyms, 988: [dtrace]: linker error: symbol `dtrace_invop_jump_addr' not found
[ 1792486.921759] kobj_checksyms, 988: [dtrace]: linker error: symbol `dtrace_trap_func' not found
[ 1792486.921759] WARNING: module error: unable to affix module `dtrace', error 8

When I have time to get to this I can build a newer 9.x world; I just
need to know whether I need to do that.

> 3. put `#define XEN_CLOCK_DEBUG 1' in sys/arch/xen/xen/xen_clock.c and
>    build a new kernel, and share the dmesg output when you get the
>    console warning?
>
> This should tell us whether it's the Xen host's fault or something
> wrong in NetBSD.

Some more data points from the vmstat command mentioned above (because
it is simple and quick to run):

1) Another system with the same generation 9.x BUT on a different DOM0
produces:

vcpu0 raw systime went backwards                   32909    0 intr

This second system has also been known to have negative runtime and is
also currently running with one vcpu.

2) On the same DOM0 as the one mentioned in #1 there is a 10.x_BETA from
January 22nd.  This guest is a PVH DOMU and the above vmstat produces no
output.  This full PVH DOMU has two vcpus and I have never known it to
produce negative runtime.

3) A third 9.x DOMU that is just a normal PV (no PVSHIM involved)
produces the following with that vmstat command:

vcpu0 raw systime went backwards                  141532    0 intr
vcpu0 missed hardclock                                 7    0 intr

I have never known this system to have negative runtime.

4) Another 10.x_BETA DOMU PV guest with 1 vcpu from April 22nd on the
same DOM0 as #3 does not produce any output from the vmstat command.

Thanks for asking about this... it is more than a little annoying.  I
apologize that I won't be able to be very receptive to doing much more
with this until later.

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org