batuzovk writes:
>> I'm debugging an operating system with QEMU and I have a race condition in
>> the OS. The problem is that each time I run QEMU I get this error in a
>> different place, which makes it impossible to debug with gdb. My plan is to
>> remove this indeterminism and be able to reproduce the same error in the
>> same place every time. To do that:
>>
>> * The test is automated (there is no user IO)
>> * I've passed the options "-rtc base=2006-06-17,clock=vm,driftfix=none
>>   -icount 2" to QEMU
>> * There is no use of KVM (the modules have been removed from the kernel)
>>
>> Even with that, I get a different error on every execution. Do you have
>> any suggestions to make the execution identical each time it is run?
[...]
> Actually any (not only user) I/O can cause non-determinism: it is not
> known when data will be ready. Things become even more complicated if
> you take into account the multi-threaded nature of QEMU. Threads
> communicate with each other and you cannot predict context switches.
> AFAIK there is no easy guaranteed-to-work solution for your problem, but
> there are some hard ones (e.g. vmware retrace, though it is not based on
> QEMU). If your test case is really simple you can try disabling any
> multi-threading you can in QEMU and just hope for it to work.

AFAIK, you have four possible sources of guest indeterminism in QEMU:

* Interactions between multiple guest CPUs

  Using the "-rtc base=2006-06-17,clock=vm" and "-icount 2" arguments, this
  should be deterministic when ignoring interactions from external devices
  and other QEMU threads.

* I/O devices raising interrupts

  Using the "-rtc clock=vm" argument and *assuming* devices use the guest
  clock to program interrupts, this should also be deterministic on the
  guest.

* I/O devices writing into memory the guest is concurrently reading

  This is hard to solve when devices work in a thread separate from that of
  guest CPUs, but I think it can be ignored.

* A host thread interrupting the guest CPU thread

  This will force the guest CPU thread to recompute the number of guest
  instructions until a guest CPU switch.

My understanding is that you're suggesting the problem Zeus is encountering
is the last one. But then my question is, when is the guest CPU thread
interrupted? That is, ignoring incoming interrupts from external devices
(covered in the 2nd point).

Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth
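
P.S. To make the last source in the list above (a host thread interrupting the guest CPU thread) a bit more concrete, here is a toy model in Python. It is only a sketch under my own assumptions and not QEMU's actual scheduler: the function `run`, its parameters, and the idea of a "kick" arriving at a given global instruction count are all hypothetical names invented for illustration. Guest CPUs take turns executing a fixed instruction budget, and a kick ends the current turn early.

```python
def run(num_cpus, insns_per_cpu, slice_len, kicks=()):
    """Return the order in which guest instructions retire (toy model).

    num_cpus      -- number of simulated guest CPUs (hypothetical)
    insns_per_cpu -- instructions each guest CPU has to execute
    slice_len     -- instruction budget per turn before switching CPUs
    kicks         -- global instruction counts at which an imaginary host
                     thread interrupts the CPU thread, ending the turn early
    """
    remaining = [insns_per_cpu] * num_cpus
    kick_points = set(kicks)
    trace = []
    cpu = 0
    while any(remaining):
        executed = 0
        while remaining[cpu] and executed < slice_len:
            trace.append(cpu)          # one instruction retires on this CPU
            remaining[cpu] -= 1
            executed += 1
            if len(trace) in kick_points:
                break                  # turn cut short by the host-side kick
        cpu = (cpu + 1) % num_cpus     # round-robin switch to the next CPU
    return trace


if __name__ == "__main__":
    print(run(2, 4, 2))             # no kicks: same interleaving every run
    print(run(2, 4, 2, kicks={1}))  # one early kick shifts every later switch
```

Without kicks the interleaving depends only on the instruction budget, so it repeats exactly. A single kick at a host-dependent point moves every subsequent switch, which is why the timing of such interruptions matters for reproducibility in this simplified picture.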