On 12/22/2017 11:13 AM, Marc-André Lureau wrote:
Hi
On Fri, Dec 22, 2017 at 4:59 PM, Stefan Berger
<stef...@linux.vnet.ibm.com> wrote:
On 12/22/2017 07:49 AM, Marc-André Lureau wrote:
Hi
On Fri, Nov 10, 2017 at 3:11 PM, Stefan Berger
<stef...@linux.vnet.ibm.com> wrote:
This set of patches implements support for migrating the state of the
external 'swtpm' TPM emulator as well as that of the emulated device
interfaces. I have primarily tested this with the TIS and TPM 1.2 so
far, but it also seems to work with TPM 2.
The TIS is simplified first by reducing the number of buffers and read
and write offsets into these buffers. Following the state machine of the
TIS, a single buffer and r/w offset is enough for all localities since
only one locality can ever be active.
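A rough sketch of the data-structure change the patch titles describe (the
'before'/'after' type names here are hypothetical and the real layouts carry
more fields):

    /* Before: every locality carried its own buffers and offsets. */
    typedef struct TPMLocalityBefore {
        uint32_t w_offset, r_offset;
        unsigned char w_buffer[4096];
        unsigned char r_buffer[4096];
    } TPMLocalityBefore;

    /* After: a single buffer and a single r/w offset live in TPMState
     * and are shared by all localities, which suffices because the TIS
     * state machine allows only one locality to be active at a time. */
    typedef struct TPMStateAfter {
        unsigned char buffer[4096];
        uint16_t rw_offset;
    } TPMStateAfter;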
This series applies on top of my tpm-next branch.
One of the challenges addressed by this set of patches is the fact
that the TPM emulator may be processing a command while the state
serialization of the devices is supposed to happen. A necessary first
step has been implemented here that ensures that a response has been
received from the external emulator, and that the bottom half function,
which delivers the response and adjusts device registers (TIS or CRB),
has been executed, before the device's state is serialized.
A subsequent extension may need to address the live migration loop and
delay the serialization of devices until the response from the external
TPM has been received. That said, the likelihood that someone executes
a long-lasting TPM command while this is occurring is certainly low.
Stefan
Stefan Berger (13):
tpm_tis: convert uint32_t to size_t
tpm_tis: limit size of buffer from backend
tpm_tis: remove TPMSizeBuffer usage
tpm_tis: move buffers from localities into common location
tpm_tis: merge read and write buffer into single buffer
tpm_tis: move r/w_offsets to TPMState
tpm_tis: merge r/w_offset into rw_offset
tpm: Implement tpm_sized_buffer_reset
ok for the above (you could queue/pull-req this now)
tpm: Introduce condition to notify waiters of completed command
tpm: Introduce condition in TPM backend for notification
tpm: implement tpm_backend_wait_cmd_completed
tpm: extend TPM emulator with state migration support
tpm_tis: extend TPM TIS with state migration support
Much of the complexity in this migration series comes from the
handling & synchronization of the IO thread.
I think having a separate TPM thread doesn't bring much today;
it is a workaround for the chardev API being mostly synchronous. Am I
wrong? (Yes, passthrough doesn't use chardev, but it should probably
use qio or chardev internally.)
Other kinds of devices using a separate process suffer from the same
problem. Just grep for qemu_chr_fe_write and look at the preceding
comments; it's a common flaw in the qemu code base. Code uses the
synchronous API and sometimes uses non-blocking writes
(hw/usb/redirect.c seems to do quite a bit better!).
I would like to get rid of the separate thread in TPM before we add
migration support. We should try to improve the chardev API to make it
easier to do non-blocking IO. This should considerably simplify the
above patches and benefit the rest of qemu (instead of everyone
doing their own thing with separate threads or other kinds of
continuation state).
What do you think?
I am wondering whether it will help us get rid of the
conditions/notifications, like patches 9-11 of this series. I doubt 12 and
13 will change. At some point the device state is supposed to be written
out, and in the case of the TPM we have to wait for the response to have
come back into the backend. We won't start listening on the file
descriptor for an outstanding response, so I guess we will still wait for
a notification in that case as well. So I am not sure which parts are
going to be simpler...
Why would qemu need to wait for the completion of the emulator? Couldn't
qemu & the emulator save the current state, including in-flight commands?
That's apparently what usbredir does.
The thread sends the commands to the external emulator and waits
for the response. Once the response is received, its delivery is
handed off to the bottom half. Using the bottom half avoids
synchronization primitives, since we know that no other thread is in
the device emulator when it runs (except for our thread sending to and
receiving from the emulator).
This series of patches introduces a tpm_busy flag that is set when the
device emulation hands off a command to the worker thread for
processing; this happens in tpm_backend_deliver_request(). The flag is
reset once the worker thread has scheduled the bottom half that
delivers the response to the device emulator (TIS, CRB, etc.); this in
turn happens in tpm_backend_worker_thread().
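In code, the flag's life cycle looks roughly like this (a hedged sketch
based on the description above, not the literal patch; the state_lock and
cmd_complete names are assumptions):

    /* Main-loop side: mark the backend busy, then hand the command to
     * the worker thread. */
    void tpm_backend_deliver_request(TPMBackend *s, TPMBackendCmd *cmd)
    {
        qemu_mutex_lock(&s->state_lock);
        s->tpm_busy = true;
        qemu_mutex_unlock(&s->state_lock);

        g_thread_pool_push(s->thread_pool, cmd, NULL);
    }

    /* Worker-thread side: process the command synchronously, schedule
     * the bottom half that delivers the response, then clear tpm_busy
     * and wake any waiter in tpm_backend_wait_cmd_completed(). */
    static void tpm_backend_worker_thread(gpointer data, gpointer user_data)
    {
        TPMBackend *s = TPM_BACKEND(user_data);
        TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);

        k->handle_request(s, (TPMBackendCmd *)data); /* blocks for response */

        qemu_bh_schedule(s->bh); /* delivers the response in the main loop */

        qemu_mutex_lock(&s->state_lock);
        s->tpm_busy = false;
        qemu_cond_signal(&s->cmd_complete);
        qemu_mutex_unlock(&s->state_lock);
    }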
For the migration we have the following cases once the device (frontend
or backend) is supposed to write out its state:
[Once the state of the device is supposed to be written out, we know
that the OS is paused, that no further TIS registers will be written to
after that, and that no thread is busy emulating writes to TIS
registers. The TIS serves as an example here for 'device emulation'.]
- no TPM command is currently being processed: the tpm_busy flag is
false and we can write out the state immediately. The .pre_save function
does nothing in this case.
- a TPM command is currently being processed: we have to wait for the
response to come back and for the thread to set the tpm_busy flag to
false. This is what tpm_backend_wait_cmd_completed() is for. Once that
has happened, the bottom half will have been scheduled by the thread,
but it will not run at this point, so we have to do what the bottom
half is supposed to do. All of this happens in the .pre_save function:
here we call tpm_backend_wait_cmd_completed(), which returns once the
response has been received. We then check a state variable that
indicates whether the front-end is executing a command and run the
bottom half's function. (Improvement: tpm_backend_wait_cmd_completed()
could return true when we had to wait for the response; then we could
skip the check on the state variable and just run the bottom half's
function because of the preceding wait.)
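Put concretely, a frontend's .pre_save would then look roughly like the
following sketch; tpm_tis_is_command_executing() and tpm_tis_bh() are
hypothetical names standing in for the real state-variable check and
bottom-half function:

    static int tpm_tis_pre_save(void *opaque)
    {
        TPMState *s = opaque;

        /* Case 1: tpm_busy is already false -> returns immediately.
         * Case 2: blocks until the worker thread signals completion. */
        tpm_backend_wait_cmd_completed(s->be_driver);

        /* The bottom half may have been scheduled but cannot run before
         * the state is serialized, so run its function here if the
         * frontend still shows a command in execution. */
        if (tpm_tis_is_command_executing(s)) { /* hypothetical check */
            tpm_tis_bh(s);                     /* the BH's function */
        }

        return 0;
    }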
Your refactoring has actually helped a lot in simplifying this and
pushed a lot of the code into the common tpm_backend_* functions. The
code to suspend/resume looks rather simple at this point. It mostly
comes down to refactoring the function the bottom half is supposed to
run so it can be run from the .pre_save function as well.
TIS:
https://github.com/stefanberger/qemu-tpm/commit/3b0689619be9a1f3fbbea2aa2f98802512f96099
CRB:
https://github.com/stefanberger/qemu-tpm/commit/62db4ace06e77c8e5f8274e8cb9240e5f8d663ab
SPAPR:
https://github.com/stefanberger/qemu-tpm/commit/1f31cd8b7f690f750405d911923f624b5f856a04
I may now modify tpm_backend_wait_cmd_completed() so it can be used as
'bool run_bh_function = tpm_backend_wait_cmd_completed(s)'. This should
help the cause and make the three devices' suspend code look even more
similar.
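A sketch of that modification (again assuming the state_lock and
cmd_complete names from the sketch above):

    /* Returns true if we actually had to wait, i.e. a command was still
     * in flight and the caller now has to run the BH's function. */
    bool tpm_backend_wait_cmd_completed(TPMBackend *s)
    {
        bool waited = false;

        qemu_mutex_lock(&s->state_lock);
        while (s->tpm_busy) {
            qemu_cond_wait(&s->cmd_complete, &s->state_lock);
            waited = true;
        }
        qemu_mutex_unlock(&s->state_lock);

        return waited;
    }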
Regards,
Stefan