On 12/22/2017 11:13 AM, Marc-André Lureau wrote:
Hi
On Fri, Dec 22, 2017 at 4:59 PM, Stefan Berger
<stef...@linux.vnet.ibm.com> wrote:
On 12/22/2017 07:49 AM, Marc-André Lureau wrote:
Hi
On Fri, Nov 10, 2017 at 3:11 PM, Stefan Berger
<stef...@linux.vnet.ibm.com> wrote:
This set of patches implements support for migrating the state of the
external 'swtpm' TPM emulator as well as that of the emulated device
interfaces. I have primarily tested this with the TIS and TPM 1.2 so
far, but it also seems to work with TPM 2.
The TIS is simplified first by reducing the number of buffers and read
and write offsets into these buffers. Following the state machine of the
TIS, a single buffer and r/w offset is enough for all localities since
only one locality can ever be active.
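A rough sketch of the data-structure change the patch titles describe (the
'before'/'after' type names here are hypothetical and the real layouts carry
more fields):

    /* Before: every locality carried its own buffers and offsets. */
    typedef struct TPMLocalityBefore {
        uint32_t w_offset, r_offset;
        unsigned char w_buffer[4096];
        unsigned char r_buffer[4096];
    } TPMLocalityBefore;

    /* After: a single buffer and a single r/w offset live in TPMState
     * and are shared by all localities, which suffices because the TIS
     * state machine allows only one locality to be active at a time. */
    typedef struct TPMStateAfter {
        unsigned char buffer[4096];
        uint16_t rw_offset;
    } TPMStateAfter;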
This series applies on top of my tpm-next branch.
One of the challenges addressed by this set of patches is the fact
that the TPM emulator may be processing a command while the state
serialization of the devices is supposed to happen. A necessary first
step has been implemented here that ensures that a response has been
received from the external emulator, and that the bottom half function,
which delivers the response and adjusts device registers (TIS or CRB),
has been executed, before the device's state is serialized.
A subsequent extension may need to address the live migration loop and
delay the serialization of devices until the response from the external
TPM has been received. That said, the likelihood that someone executes
a long-lasting TPM command while this is occurring is certainly low.
Stefan
Stefan Berger (13):
tpm_tis: convert uint32_t to size_t
tpm_tis: limit size of buffer from backend
tpm_tis: remove TPMSizeBuffer usage
tpm_tis: move buffers from localities into common location
tpm_tis: merge read and write buffer into single buffer
tpm_tis: move r/w_offsets to TPMState
tpm_tis: merge r/w_offset into rw_offset
tpm: Implement tpm_sized_buffer_reset
ok for the above (you could queue/pull-req this now)
tpm: Introduce condition to notify waiters of completed command
tpm: Introduce condition in TPM backend for notification
tpm: implement tpm_backend_wait_cmd_completed
tpm: extend TPM emulator with state migration support
tpm_tis: extend TPM TIS with state migration support
Much of the complexity in this migration series comes from the
handling & synchronization of the IO thread.
I think having a separate TPM thread doesn't bring much today;
it is a workaround for the chardev API being mostly synchronous. Am I
wrong? (Yes, passthrough doesn't use chardev, but it should probably
use qio or chardev internally.)
Other kinds of devices using a separate process suffer from the same
problem. Just grep for qemu_chr_fe_write and look at the preceding
comments; it's a common flaw in the qemu code base. Code uses the
synchronous API and sometimes uses non-blocking writes
(hw/usb/redirect.c seems to do quite a bit better!).
I would like to get rid of the separate thread in TPM before we add
migration support. We should try to improve the chardev API to make it
easier to do non-blocking IO. This should considerably simplify the
above patches and benefit the rest of qemu (instead of everyone
doing their own thing with separate threads or other kinds of
continuation state).
What do you think?
I am wondering whether it will help us get rid of the
conditions/notifications, like patches 9-11 of this series. I doubt 12 and
13 will change. At some point the device state is supposed to be written
out, and in the case of the TPM we have to wait for the response to have
come back into the backend. We won't start listening on the file
descriptor for an outstanding response, so I guess we will still wait for
a notification in that case as well. So I am not sure which parts are
going to be simpler...
Why would qemu need to wait for the completion of the emulator? Couldn't
qemu & the emulator save the current state, including in-flight commands?
That's apparently what usbredir does.
The thread sends the commands to the external emulator and waits
for the response. Once the response is received, its delivery is
handed off to the bottom half. Using the bottom half avoids
synchronization primitives, since we know that no other thread is in
the device emulator when it runs (except for our thread sending to and
receiving from the emulator).
This series of patches introduces a tpm_busy flag that is set when the
device emulation hands off a command to the worker thread for
processing; this happens in tpm_backend_deliver_request(). The flag is
reset once the worker thread has scheduled the bottom half that
delivers the response to the device emulator (TIS, CRB, etc.); this in
turn happens in tpm_backend_worker_thread().
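In code, the flag's life cycle looks roughly like this (a hedged sketch
based on the description above, not the literal patch; the state_lock and
cmd_complete names are assumptions):

    /* Main-loop side: mark the backend busy, then hand the command to
     * the worker thread. */
    void tpm_backend_deliver_request(TPMBackend *s, TPMBackendCmd *cmd)
    {
        qemu_mutex_lock(&s->state_lock);
        s->tpm_busy = true;
        qemu_mutex_unlock(&s->state_lock);

        g_thread_pool_push(s->thread_pool, cmd, NULL);
    }

    /* Worker-thread side: process the command synchronously, schedule
     * the bottom half that delivers the response, then clear tpm_busy
     * and wake any waiter in tpm_backend_wait_cmd_completed(). */
    static void tpm_backend_worker_thread(gpointer data, gpointer user_data)
    {
        TPMBackend *s = TPM_BACKEND(user_data);
        TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);

        k->handle_request(s, (TPMBackendCmd *)data); /* blocks for response */

        qemu_bh_schedule(s->bh); /* delivers the response in the main loop */

        qemu_mutex_lock(&s->state_lock);
        s->tpm_busy = false;
        qemu_cond_signal(&s->cmd_complete);
        qemu_mutex_unlock(&s->state_lock);
    }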
For the migration we have the following cases once the device (frontend
or backend) is supposed to write out its state:
[Once the state of the device is supposed to be written out, we know
that the OS is paused, that no further TIS registers will be written to
after that, and that no thread is busy emulating writes to TIS
registers. The TIS serves as an example here for 'device emulation'.]
- no TPM command is currently being processed: the tpm_busy flag is
false and we can write out the state immediately. The .pre_save function
does nothing in this case.
- a TPM command is currently being processed: we have to wait for the
response to come back and for the thread to set the tpm_busy flag to
false. This is what tpm_backend_wait_cmd_completed() is for. Once that
has happened, the bottom half will have been scheduled by the thread,
but it will not run at this point, so we have to do what the bottom
half is supposed to do. All of this happens in the .pre_save function:
here we call tpm_backend_wait_cmd_completed(), which returns once the
response has been received. We then check a state variable that
indicates whether the front-end is executing a command and run the
bottom half's function. (Improvement: tpm_backend_wait_cmd_completed()
could return true when we had to wait for the response; then we could
skip the check on the state variable and just run the bottom half's
function because of the preceding wait.)
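Put concretely, a frontend's .pre_save would then look roughly like the
following sketch; tpm_tis_is_command_executing() and tpm_tis_bh() are
hypothetical names standing in for the real state-variable check and
bottom-half function:

    static int tpm_tis_pre_save(void *opaque)
    {
        TPMState *s = opaque;

        /* Case 1: tpm_busy is already false -> returns immediately.
         * Case 2: blocks until the worker thread signals completion. */
        tpm_backend_wait_cmd_completed(s->be_driver);

        /* The bottom half may have been scheduled but cannot run before
         * the state is serialized, so run its function here if the
         * frontend still shows a command in execution. */
        if (tpm_tis_is_command_executing(s)) { /* hypothetical check */
            tpm_tis_bh(s);                     /* the BH's function */
        }

        return 0;
    }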
Your refactoring has actually helped a lot in simplifying this and
pushed a lot of the code into the common tpm_backend_* functions. The
code to suspend/resume looks rather simple at this point. It mostly
comes down to refactoring the function the bottom half is supposed to
run so it can be run from the .pre_save function as well.
TIS:
https://github.com/stefanberger/qemu-tpm/commit/3b0689619be9a1f3fbbea2aa2f98802512f96099
CRB:
https://github.com/stefanberger/qemu-tpm/commit/62db4ace06e77c8e5f8274e8cb9240e5f8d663ab
SPAPR:
https://github.com/stefanberger/qemu-tpm/commit/1f31cd8b7f690f750405d911923f624b5f856a04
I may now modify tpm_backend_wait_cmd_completed() so it can be used as
'bool run_bh_function = tpm_backend_wait_cmd_completed(s)'. This should
help the cause and make the three devices' suspend code look even more
similar.
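A sketch of that modification (again assuming the state_lock and
cmd_complete names from the sketch above):

    /* Returns true if we actually had to wait, i.e. a command was still
     * in flight and the caller now has to run the BH's function. */
    bool tpm_backend_wait_cmd_completed(TPMBackend *s)
    {
        bool waited = false;

        qemu_mutex_lock(&s->state_lock);
        while (s->tpm_busy) {
            qemu_cond_wait(&s->cmd_complete, &s->state_lock);
            waited = true;
        }
        qemu_mutex_unlock(&s->state_lock);

        return waited;
    }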
Regards,
Stefan