Re: [Qemu-devel] [RFC] reverse execution.

Mark Burton Fri, 17 May 2013 15:56:44 -0700

I wish I could say I understood it better, but at this point any insight would 
be gratefully received. However, what does seem clear is that the intent and 
purpose of Icount is subtly different, and possibly orthogonal to what we're 
trying to achieve.

And - actually, determinism (or the lack of it) is definitely an issue, but -
for now - we have spent much of this week finding a bit of code that avoids any
non-deterministic behavior - simply so we can make sure the mechanisms work -
THEN we will tackle the thorny subject of what is causing non-deterministic
behavior (by which, I _suspect_ I mean, which devices or timers are not adhering
to the icount mechanism).

To recap, as I understand things, setting the icount value on the command line
is intended to give a rough "instructions per second" mechanism. One of the
effects of that is to make things more deterministic.  Our overall intent is to
allow a user who has hit a bug to step backwards.

After much discussion (!) I'm convinced by the argument that I might in the end 
want both of these things. I might want to set some sort of instructions per 
second value (and change it between runs), and if/when I hit a bug, go 
backwards.

Thus far, so good. 

Underneath the hood, icount keeps a counter in the TCG environment which is
decremented (as Fred says), and the icount mechanism plays with it as it sees
fit.
The bottom line is that, orthogonal to this, we need a separate 'counter',
almost identical to the icount counter, in order to count instructions for
the reverse execution mechanism.

We have looked at re-using the icount counter as Fred said, but that soon lands
you in a whole heap of pain. Our conclusion: it would be much cleaner to
have a separate dedicated counter, so that you can simply use either mechanism
independently of the other.
On this subject - I would like to hear any and all views.
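
To make the "separate dedicated counter" idea concrete, here is a toy sketch
in Python (not QEMU code; the class and every name in it are hypothetical). It
only assumes what is described above: one counter is decremented and may be
rewritten by the timing logic, while a second counter does nothing but count
executed instructions.

```python
class ToyCpu:
    """Toy model, not QEMU code: two independent instruction counters."""

    def __init__(self, budget):
        self.icount_budget = budget  # decremented; timing logic may rewrite it
        self.executed = 0            # dedicated counter: only ever incremented

    def step(self):
        """Execute one pretend instruction; return False if the budget is spent."""
        if self.icount_budget <= 0:
            return False
        self.icount_budget -= 1
        self.executed += 1
        return True

    def refill(self, budget):
        """Stand-in for the timing side adjusting its counter as it sees fit."""
        self.icount_budget = budget


cpu = ToyCpu(budget=3)
while cpu.step():
    pass
cpu.refill(5)        # the timing logic rewrites its own counter...
while cpu.step():
    pass
print(cpu.executed)  # ...but the dedicated counter still reads 8
```

Because nothing except step() touches `executed`, the reverse execution side
could rely on it whatever the timing side does to its own counter.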

Having said all of that, in BOTH cases, we need determinism.

In our case, determinism is very tightly defined (which, I suspect, may not be
the case for icount). In our case, having returned to a snapshot, the
subsequent execution must follow the EXACT SAME path that it did last time. No
ifs, no buts. No IO, no income tax, no VAT, no money back, no guarantee….
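
As an illustration of how strict that definition is, the following toy Python
check (a sketch only; the "machine" is a made-up dictionary, not a QEMU
snapshot) saves a snapshot, runs, returns to the snapshot, runs again, and
insists that both runs visit exactly the same sequence of program counters.

```python
import copy


def step(state):
    """One pretend instruction: a pure function of the machine state."""
    state["acc"] = (state["acc"] * 5 + state["pc"]) % 251
    state["pc"] += 1 if state["acc"] % 2 else 2


def run(state, n):
    """Execute n instructions and return the list of pc values visited."""
    trace = []
    for _ in range(n):
        trace.append(state["pc"])
        step(state)
    return trace


machine = {"pc": 0, "acc": 1}
snapshot = copy.deepcopy(machine)   # save a snapshot
first = run(machine, 1000)
end_state = dict(machine)

machine = copy.deepcopy(snapshot)   # return to the snapshot
second = run(machine, 1000)

# The EXACT SAME path, and the same final state, or the replay has drifted.
assert first == second and machine == end_state
```

Anything that feeds the state from outside (a host timer, IO) would break the
final assertion, which is the kind of 'drift' a real implementation has to
hunt down.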

Right now, what Fred has found is that sometimes things 'drift'… we will (of
course) be looking into that. But, for now, our principal concern is to take a
simple bit of code, with no IO, and nothing that causes non-determinism - save
a snapshot at the beginning of the sequence, run, hit a breakpoint, return to 
the breakpoint, and be able to _exactly_ return to the place we came from.
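
The sequence above can be sketched as a toy Python model (hypothetical names
throughout, and assuming the machine is fully deterministic and carries a
dedicated instruction counter): stepping backwards is done by restoring the
snapshot and replaying one instruction fewer than was executed.

```python
import copy


class ToyMachine:
    """Toy deterministic machine with a dedicated instruction counter."""

    def __init__(self):
        self.pc = 0
        self.acc = 1
        self.executed = 0

    def step(self):
        self.acc = (self.acc * 3 + self.pc) % 97
        self.pc += 1
        self.executed += 1


def run_to_breakpoint(machine, bp_pc):
    """Run forward until the program counter reaches the breakpoint."""
    while machine.pc != bp_pc:
        machine.step()


def reverse_step(snapshot, current):
    """Restore the snapshot and replay up to one instruction before 'current'."""
    target = current.executed - 1
    if target < snapshot.executed:
        raise ValueError("cannot step back past the snapshot")
    replayed = copy.deepcopy(snapshot)
    while replayed.executed < target:
        replayed.step()
    return replayed


m = ToyMachine()
snap = copy.deepcopy(m)            # snapshot at the beginning of the sequence
run_to_breakpoint(m, 10)           # run, hit a breakpoint
before = reverse_step(snap, m)     # go back exactly one instruction
print(before.pc, before.executed)  # 9 9
```

Stepping `before` forward once lands on the same state as `m`, which is only
true because nothing in the toy machine is non-deterministic.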

As Fred said, we imagined that we could do this based on TBs, at least at a
'block' level (which actually may be good enough for us). However, our
mechanism for counting TBs was badly broken. Nonetheless, we learnt a lot
about TBs - and about some of the non-deterministic behavior that will come to
haunt us later. We also concluded that counting TBs is always going to be
second rate, and if we're going to do this properly, we need to count
instructions. Finally, we have concluded that re-using the icount counters is
going to be very painful; we need to re-use the same mechanism, but we need
dedicated counters…
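
To illustrate why counting TBs only gives 'block' level precision, here is a
small Python sketch with made-up TB sizes (the numbers and names are
assumptions for illustration, not measurements from QEMU): a TB counter can
only stop on a block boundary, whereas an instruction counter can stop
anywhere.

```python
from bisect import bisect_right
from itertools import accumulate

tb_sizes = [4, 7, 3, 9]                        # made-up instructions per TB
boundaries = [0] + list(accumulate(tb_sizes))  # [0, 4, 11, 14, 23]


def nearest_tb_stop(insn_index):
    """Last TB boundary at or before insn_index: the best a TB counter can do."""
    return boundaries[bisect_right(boundaries, insn_index) - 1]


target = 12                      # the instruction we actually want to stop at
print(nearest_tb_stop(target))   # 11: one instruction short of the target
```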

Again, please, all - pitch in and say what you think. Fred and I have been
scratching our heads all week on this, and I'm not convinced we have come up
with the right answers, so any input would be most welcome.

Cheers

Mark.

On 17 May 2013, at 19:54, Peter Maydell wrote:

> On 17 May 2013 18:23, KONRAD Frédéric <fred.kon...@greensocs.com> wrote:
>> It appeared that the replay is not deterministic even with icount:
>>    - the whole icount mechanism is not saved with save_vm (which can be
>> achieved by moving qemu_icount to TimerState according to Paolo)
>>    - replaying two times the same thing and stopping at a specific
>> breakpoint show two differents vmclock, so replaying the
>>        same amount of time don't work, and we abandoned this idea.
> 
> Personally I think icount is supposed to be deterministic,
> and if it isn't then it should be fixed to be so. Does anybody
> who understands it better than me disagree?
> 
> thanks
> -- PMM

