> From: Kevin Wolf [mailto:kw...@redhat.com]
> Am 11.02.2016 um 12:00 hat Pavel Dovgalyuk geschrieben:
> > > From: Kevin Wolf [mailto:kw...@redhat.com]
> > > Am 11.02.2016 um 07:05 hat Pavel Dovgalyuk geschrieben:
> > > > > From: Kevin Wolf [mailto:kw...@redhat.com]
> > > > > Am 10.02.2016 um 13:51 hat Pavel Dovgalyuk geschrieben:
> > > > > > However, I don't yet understand which layer you propose as the
> > > > > > candidate for record/replay. What functions should be changed?
> > > > > > I would like to investigate this approach, but I haven't got it yet.
> > > > >
> > > > > At the core, I wouldn't change any existing function, but introduce a
> > > > > new block driver. You could copy raw_bsd.c for a start and then tweak
> > > > > it. Leave out functions that you don't want to support, and add the
> > > > > necessary magic to .bdrv_co_readv/writev.
> > > > >
> > > > > Something like this (can probably be generalised for more than just
> > > > > reads, as the part after the bdrv_co_readv() call should be the same
> > > > > for reads, writes and any other request types):
> > > > >
> > > > > int blkreplay_co_readv()
> > > > > {
> > > > >     BlockReplayState *s = bs->opaque;
> > > > >     int reqid = s->reqid++;
> > > > >
> > > > >     bdrv_co_readv(bs->file, ...);
> > > > >
> > > > >     if (mode == record) {
> > > > >         log(reqid, time);
> > > > >     } else {
> > > > >         assert(mode == replay);
> > > > >         bool *done = req_replayed_list_get(reqid);
> > > > >         if (done) {
> > > > >             *done = true;
> > > > >         } else {
> > > > >             req_completed_list_insert(reqid, qemu_coroutine_self());
> > > > >             qemu_coroutine_yield();
> > > > >         }
> > > > >     }
> > > > > }
> > > > >
> > > > > /* called by replay.c */
> > > > > int blkreplay_run_event()
> > > > > {
> > > > >     if (mode == replay) {
> > > > >         co = req_completed_list_get(e.reqid);
> > > > >         if (co) {
> > > > >             qemu_coroutine_enter(co);
> > > > >         } else {
> > > > >             bool done = false;
> > > > >             req_replayed_list_insert(reqid, &done);
> > > > >             /* wait synchronously for completion */
> > > > >             while (!done) {
> > > > >                 aio_poll();
> > > > >             }
> > > > >         }
> > > > >     }
> > > > > }
> > > > >
> > > > > Where we could consider changing existing code is that it might be
> > > > > desirable to automatically put an instance of this block driver on top
> > > > > of every block device when record/replay is used. If we don't do that,
> > > > > you need to explicitly specify -drive driver=blkreplay,...
> > > >
> > > > As far as I understand, all synchronous read/write requests are also
> > > > passed through this coroutine layer.
> > >
> > > Yes, all read/write requests go through the same function internally, no
> > > matter which external interface was used.
> > >
> > > > It means that every disk access in the replay phase should match the
> > > > recording phase.
> > >
> > > Right. If I'm not mistaken, this was the fundamental requirement you
> > > have, so I wouldn't have suggested this otherwise.
> > >
> > > > Record/replay is intended to be used for debugging and analysis.
> > > > When execution is replayed, the guest machine cannot notice the
> > > > analysis overhead. Some analysis methods may include disk image
> > > > reading. E.g., the qemu-based analysis framework DECAF uses sleuthkit
> > > > for disk forensics ( https://github.com/sycurelab/DECAF ).
> > > > If a similar framework is used with replay, its forensic disk accesses
> > > > won't work if we record/replay the coroutines.
> > >
> > > Sorry, I'm not sure if I can follow.
> > >
> > > If such analysis software runs in the guest, it's not a replay any more
> > > and I completely fail to see what you're doing.
> > >
> > > If it's a qemu component independent from the guest, then my method
> > > gives you a clean way to bypass the replay driver that wouldn't be
> > > possible with yours.
> >
> > The second one: qemu may be extended with some components that perform
> > guest introspection.
> >
> > > If your plan was to record/replay only async requests and then use sync
> > > requests to bypass the record/replay, let me clearly state that this is
> > > the wrong approach: There are still guest devices which do synchronous
> > > I/O and need to be considered in the replay log, and you shouldn't
> > > prevent the analysis code from using AIO (in fact, using sync I/O in new
> > > code is very much frowned upon).
> >
> > Why do guest synchronous requests have to be recorded?
> > Aren't they completely deterministic?
>
> Good point. I think you're right in practice. In theory, with dataplane
> (i.e. when running the request in a separate thread) it could happen,
> but I guess that isn't very compatible with replay anyway - and at first
> sight I couldn't see it performing synchronous requests either.
>
> > > I can explain in more detail what the block device structure looks like
> > > and how to access an image with and without record/replay, but first
> > > please let me know whether I guessed right what your problem is. Or if I
> > > missed your point, can you please describe in detail a case that
> > > wouldn't work?
> >
> > You have understood it correctly.
> > And what is the solution for bypassing one of the layers from a component
> > that should not affect the replay?
>
> For this, you need to understand how block drivers are stacked in qemu.
> Each driver in the stack has a separate struct BlockDriverState, which
> can be used to access its data. You could hook up things like this:
>
>    virtio-blk                 NBD server
>  --------------              ------------
>        |                           |
>        v                           |
>  +------------+                    |
>  |  blkreplay |                    |
>  +------------+                    |
>        |                           |
>        v                           |
>  +------------+                    |
>  |   qcow2    | <------------------+
>  +------------+
>        |
>        v
>  +------------+
>  |  raw-posix |
>  +------------+
>        |
>        v
>  --------------
>    filesystem
>
> As you see, what I've chosen for the external analysis interface is just
> an NBD server, as this is the component that we already have today. You
> could hook up any other (new) code there; the important part is that it
> doesn't work on the BDS of the blkreplay driver, but directly on the BDS
> of the qcow2 driver.
>
> On the command line, it could look like this (this assumes that we don't
> add syntactic sugar that creates the blkreplay part automatically - we
> can always do that):
>
> -drive file=image.qcow2,if=none,id=img-direct
> -drive driver=blkreplay,if=none,image=img-direct,id=img-blkreplay
> -device virtio-blk-pci,drive=img-blkreplay
>
> The NBD server can't be started on the command line, so you'd go to the
> monitor and start it there with the direct access:
>
> (qemu) nbd_server_start unix:/tmp/my_socket
> (qemu) nbd_server_add img-direct
>
> (Exact syntax is untested, but this should roughly be how it works.)
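
Just to check that I've read this correctly: the new driver is a pass-through
copy of raw_bsd.c that keeps only the callbacks it needs, and the record/replay
magic goes into .bdrv_co_readv/writev. Below is a rough, untested skeleton of
how I would start. The struct fields and the bs->file indirection are only
approximate for the current tree (depending on the version, bs->file is a
BdrvChild whose ->bs has to be dereferenced, or a plain BlockDriverState), and
the open/close wiring is omitted:

#include "qemu/osdep.h"
#include "block/block_int.h"

typedef struct BlockReplayState {
    uint64_t reqid;    /* per-device request counter, as in the sketch above */
} BlockReplayState;

static int64_t blkreplay_getlength(BlockDriverState *bs)
{
    /* Pure pass-through: report whatever the layer below reports. */
    return bdrv_getlength(bs->file->bs);
}

static int coroutine_fn blkreplay_co_readv(BlockDriverState *bs,
    int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
{
    BlockReplayState *s = bs->opaque;
    uint64_t reqid = s->reqid++;

    /* Forward the read to the image below first... */
    int ret = bdrv_co_readv(bs->file->bs, sector_num, nb_sectors, qiov);

    /* ...then the record/replay part from the sketch above: in record mode
     * log reqid, in replay mode park this coroutine until the logged
     * completion event for reqid is replayed. Omitted here. */
    (void)reqid;
    return ret;
}

/* .bdrv_co_writev would mirror the readv path, calling bdrv_co_writev()
 * on bs->file instead. */

static BlockDriver bdrv_blkreplay = {
    .format_name    = "blkreplay",
    .instance_size  = sizeof(BlockReplayState),
    .bdrv_getlength = blkreplay_getlength,
    .bdrv_co_readv  = blkreplay_co_readv,
};

static void bdrv_blkreplay_init(void)
{
    bdrv_register(&bdrv_blkreplay);
}

block_init(bdrv_blkreplay_init);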
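
The two lookup lists that the sketch assumes (req_replayed_list_* and
req_completed_list_*) don't exist in QEMU, so I would back them with two
GHashTables keyed by request id, roughly like this (the parked coroutine is
kept as an opaque pointer just to keep the example self-contained):

#include <glib.h>
#include <stdbool.h>

/* reqid -> bool* that blkreplay_run_event() spins on */
static GHashTable *req_replayed_list;
/* reqid -> parked coroutine (kept opaque here) from the I/O path */
static GHashTable *req_completed_list;

static void blkreplay_lists_init(void)
{
    req_replayed_list  = g_hash_table_new(g_direct_hash, g_direct_equal);
    req_completed_list = g_hash_table_new(g_direct_hash, g_direct_equal);
}

static void req_replayed_list_insert(int reqid, bool *done)
{
    g_hash_table_insert(req_replayed_list, GINT_TO_POINTER(reqid), done);
}

/* Returns and removes the waiter, or NULL if no event waits for reqid yet. */
static bool *req_replayed_list_get(int reqid)
{
    bool *done = g_hash_table_lookup(req_replayed_list, GINT_TO_POINTER(reqid));
    if (done) {
        g_hash_table_remove(req_replayed_list, GINT_TO_POINTER(reqid));
    }
    return done;
}

static void req_completed_list_insert(int reqid, void *co)
{
    g_hash_table_insert(req_completed_list, GINT_TO_POINTER(reqid), co);
}

/* Returns and removes the parked coroutine, or NULL if the request has not
 * completed (and parked itself) yet. */
static void *req_completed_list_get(int reqid)
{
    void *co = g_hash_table_lookup(req_completed_list, GINT_TO_POINTER(reqid));
    if (co) {
        g_hash_table_remove(req_completed_list, GINT_TO_POINTER(reqid));
    }
    return co;
}

With these helpers, blkreplay_co_readv() and blkreplay_run_event() can be
written exactly as in the sketch above.
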
Thank you! I'll try this approach and come back either with patches or new
questions.

Pavel Dovgalyuk