Stefan Berger <stef...@linux.vnet.ibm.com> writes:

> [argh, just posted this to qemu-trivial -- it's not trivial]
>
> Hello!
>
> I am posting this message to revive the previous discussions about the
> design of vNVRAM / blobstore, cc'ing (at least) those that participated
> in this discussion 'back then'.
>
> The first goal of the implementation is to provide a vNVRAM storage for
> a software implementation of a TPM to store its different blobs into.
> Some of the data that the TPM writes into persistent memory needs to
> survive a power-down / power-up cycle of a virtual machine, therefore
> this type of persistent storage is needed. For the vNVRAM not to become
> a roadblock for VM migration, we would make use of block device
> migration and layer the vNVRAM on top of the block device, therefore
> using virtual machine images for storing the vNVRAM data.
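>
> For illustration only, the layering amounts to something like the
> following sketch; the helper name vnvram_flush_blob and the offset
> handling are made up for this example, but the call into the block
> layer is the usual one:
>
> /* Sketch: persist one NVRAM blob through the block layer. @bs is the
>  * BlockDriverState of the drive dedicated to the vNVRAM; @offset is
>  * the fixed location of this blob inside the image. Because the data
>  * goes through the block layer, it migrates along with the image. */
> static int vnvram_flush_blob(BlockDriverState *bs, int64_t offset,
>                              const uint8_t *blob, int len)
> {
>     int ret = bdrv_pwrite(bs, offset, blob, len);
>
>     return ret < 0 ? ret : 0;
> }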
>
> Besides the TPM blobs the vNVRAM should of course also be able to
> accommodate other use cases where persistent data is stored into
> NVRAM,

Well, let's focus more on the "blob store".

What are the semantics of this?  Is there a max number of blobs?  Are
the sizes fixed or variable?  How often are new blobs added/removed?

Regards,

Anthony Liguori

> BBRAM (battery backed-up RAM) or EEPROM. As far as I know, more recent
> machines with UEFI also have such types of persistent memory. I believe
> the current design of the vNVRAM layer accommodates other use cases as
> well, though additional 'glue devices' would need to be implemented to
> interact with this vNVRAM layer.
>
> Here is a reference to the previous discussion:
>
> http://lists.gnu.org/archive/html/qemu-devel/2011-09/msg01791.html
> http://lists.gnu.org/archive/html/qemu-devel/2011-09/msg01967.html
>
> Two aspects of the vNVRAM seem of primary interest: its API and how the
> data is organized in the virtual machine image, leaving its inner
> workings to the side for now.
>
>
> API of the vNVRAM:
> ------------------
>
> The following functions and data structures are important for devices:
>
>
> enum NVRAMRWOp {
>     NVRAM_OP_READ,
>     NVRAM_OP_WRITE,
>     NVRAM_OP_FORMAT
> };
>
> /**
>  * Callback function a device must provide for reading and writing
>  * of blobs as well as for indicating to the NVRAM layer the maximum
>  * blob size of the given entry. Due to the layout of the data in the
>  * NVRAM, the device must always write a blob with the size indicated
>  * during formatting.
>  * @op: Indication of the NVRAM operation
>  * @v: Input visitor in case of a read operation, output visitor in
>  *     case of a write or format operation.
>  * @entry_name: Unique name of the NVRAM entry
>  * @opaque: Opaque data previously provided when registering the NVRAM
>  *          entry
>  * @errp: Pointer to an Error pointer for the visitor to indicate errors
>  */
> typedef int (*NVRAMRWData)(enum NVRAMRWOp op, Visitor *v,
>                            const NVRAMEntryName *entry_name, void *opaque,
>                            Error **errp);
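>
> To make the contract concrete, here is a rough sketch of how a TPM
> device might implement this callback. TPMState, the perm_state field,
> and the visit_type_TPMStateBlob() visitor are hypothetical stand-ins
> for whatever the ASN.1 visitor series ends up generating:
>
> static int tpm_nvram_rwdata(enum NVRAMRWOp op, Visitor *v,
>                             const NVRAMEntryName *entry_name,
>                             void *opaque, Error **errp)
> {
>     TPMState *s = opaque;    /* passed via nvram_register_entry() */
>
>     switch (op) {
>     case NVRAM_OP_READ:
>         /* @v is an input visitor; pull the blob out of the NVRAM */
>         visit_type_TPMStateBlob(v, &s->perm_state, NULL, errp);
>         break;
>     case NVRAM_OP_WRITE:
>         /* @v is an output visitor; the blob written here must have
>          * exactly the size announced during formatting */
>         visit_type_TPMStateBlob(v, &s->perm_state, NULL, errp);
>         break;
>     case NVRAM_OP_FORMAT:
>         /* announce the maximum blob size by writing a maximally
>          * sized (e.g., zero-filled) blob through the output visitor */
>         visit_type_TPMStateBlob(v, &s->max_sized_blob, NULL, errp);
>         break;
>     }
>     return (errp && *errp) ? -1 : 0;
> }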
>
> /**
>  * nvram_setup:
>  * @drive_id: The ID of the drive to be used as NVRAM. Following the
>  *            command line '-drive if=none,id=tpm-bs,file=<file>',
>  *            'tpm-bs' would have to be passed.
>  * @errcode: Pointer to an integer for an error code
>  * @resetfn: Device reset function
>  * @dev: The DeviceState to be passed to the device reset function @resetfn
>  *
>  * This function returns a pointer to a VNVRAM or NULL in case an error
>  * occurred.
>  */
> VNVRAM *nvram_setup(const char *drive_id, int *errcode,
>                     qdev_resetfn resetfn, DeviceState *dev);
>
> /**
>  * nvram_delete:
>  * @nvram: The NVRAM to destroy
>  *
>  * Destroy the NVRAM previously allocated using nvram_setup.
>  */
> int nvram_delete(VNVRAM *nvram);
>
> /**
>  * nvram_start:
>  * @nvram: The NVRAM to start
>  * @fail_on_encrypted_drive: Fail if the drive is encrypted but no
>  *                           key was provided so far to lower layers.
>  *
>  * After all blobs that the device intends to write have been registered
>  * with the NVRAM, this function is used to start up the NVRAM. In case
>  * no error occurred, 0 is returned, an error code otherwise.
>  */
> int nvram_start(VNVRAM *nvram, bool fail_on_encrypted_drive);
>
> /**
>  * nvram_process_requests:
>  *
>  * Have the NVRAM layer process all outstanding requests and wait
>  * for their completion.
>  */
> void nvram_process_requests(void);
>
> /**
>  * nvram_register_entry:
>  * @nvram: The NVRAM to register an entry with
>  * @entry_name: The unique name of the blob to register
>  * @rwdata_callback: Callback function for the NVRAM layer to
>  *                   invoke for asynchronous requests such as
>  *                   delivering the results of a read operation
>  *                   or requesting the maximum size of the blob
>  *                   when formatting.
>  * @opaque: Data to pass to the callback function
>  *
>  * Register an entry for the NVRAM layer to write. In case of success
>  * this function returns 0, an error code otherwise.
>  */
> int nvram_register_entry(VNVRAM *nvram, const NVRAMEntryName *entry_name,
>                          NVRAMRWData rwdata_callback, void *opaque);
>
> /**
>  * nvram_deregister_entry:
>  * @nvram: The NVRAM to deregister an entry from
>  * @entry_name: The unique name of the entry
>  *
>  * Deregister an NVRAM entry previously registered with the NVRAM layer.
>  * The function returns 0 on success, an error code on failure.
>  */
> int nvram_deregister_entry(VNVRAM *nvram, const NVRAMEntryName *entry_name);
>
> /**
>  * nvram_had_fatal_error:
>  * @nvram: The NVRAM to check
>  *
>  * Returns true in case the NVRAM had a fatal error and
>  * is unusable, false if the device can be used.
>  */
> bool nvram_had_fatal_error(VNVRAM *nvram);
>
> /**
>  * nvram_write_data:
>  * @nvram: The NVRAM to write the data to
>  * @entry_name: The name of the blob to write
>  * @flags: Flags indicating synchronous or asynchronous
>  *         operation and whether to wait for completion
>  *         of the operation.
>  * @cb: Callback to invoke for an async write
>  * @opaque: Data to pass to the callback
>  *
>  * Write data into NVRAM. This function will invoke the callback
>  * provided in nvram_register_entry, where an output visitor will be
>  * provided for writing the blob. This function returns 0 in case
>  * of success, an error code otherwise.
>  */
> int nvram_write_data(VNVRAM *nvram, const NVRAMEntryName *entry_name,
>                      int flags, NVRAMRWFinishCB cb, void *opaque);
>
> /**
>  * nvram_read_data:
>  * @nvram: The NVRAM to read the data from
>  * @entry_name: The name of the blob to read
>  * @flags: Flags indicating synchronous or asynchronous
>  *         operation and whether to wait for completion
>  *         of the operation.
>  * @cb: Callback to invoke for an async read
>  * @opaque: Data to pass to the callback
>  *
>  * Read data from NVRAM. This function will invoke the callback
>  * provided in nvram_register_entry, where an input visitor will be
>  * provided for reading the data. This function returns 0 in
>  * case of success, an error code otherwise.
>  */
> int nvram_read_data(VNVRAM *nvram, const NVRAMEntryName *entry_name,
>                     int flags, NVRAMRWFinishCB cb, void *opaque);
>
> /* flags used above */
> #define VNVRAM_ASYNC_F             (1 << 0)
> #define VNVRAM_WAIT_COMPLETION_F   (1 << 1)
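>
> Putting the calls together, a device's init path would look roughly
> like the sketch below. Error handling is abbreviated; tpm_reset,
> TPMState, and the perm_name entry name are placeholders (the exact
> shape of NVRAMEntryName is not spelled out here), while 'tpm-bs'
> matches the -drive example from nvram_setup above:
>
> static void tpm_nvram_init(TPMState *s, const NVRAMEntryName *perm_name)
> {
>     int errcode;
>     VNVRAM *nvram;
>
>     nvram = nvram_setup("tpm-bs", &errcode, tpm_reset, DEVICE(s));
>     if (!nvram) {
>         return;
>     }
>
>     /* all entries must be registered before nvram_start() */
>     if (nvram_register_entry(nvram, perm_name, tpm_nvram_rwdata, s) ||
>         nvram_start(nvram, true)) {
>         nvram_delete(nvram);
>         return;
>     }
>
>     /* later, whenever the TPM's persistent state has changed: */
>     nvram_write_data(nvram, perm_name,
>                      VNVRAM_ASYNC_F | VNVRAM_WAIT_COMPLETION_F,
>                      NULL, NULL);
> }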
>
>
> Organization of the data in the virtual machine image:
> ------------------------------------------------------
>
> All data in the VM image is written as a single ASN.1 stream with a
> header followed by the individual fixed-sized NVRAM entries. The NVRAM
> layer creates the header during an NVRAM formatting step that must be
> initiated by the user (or libvirt) through an HMP or QMP command.
>
> The fact that we are writing ASN.1 formatted data into the virtual
> machine image is also the reason for the recent posts of the ASN.1
> visitor patch series.
>
>
> /*
>  * The NVRAM on-disk format is as follows:
>  * Let '{' and '}' denote an ASN.1 sequence start and end.
>  *
>  * {
>  *   NVRAM-header: "qemu-nvram"
>  *   # a sequence of blobs:
>  *   {
>  *     1st NVRAM entry's name
>  *     1st NVRAM entry's ASN.1 blob (fixed size)
>  *   }
>  *   {
>  *     2nd NVRAM entry's name
>  *     2nd NVRAM entry's ASN.1 blob (fixed size)
>  *   }
>  *   ...
>  * }
>  */
>
> NVRAM entries are read by searching for the entry identified by its
> unique name. If it is found, the device's callback function is invoked
> with an input visitor for the device to read the data.
>
> NVRAM entries are written by searching for the entry identified by its
> unique name. If it is found, the device's callback function is invoked
> with an output visitor positioned at the location where the data needs
> to be written in the VM image. The device then uses the visitor
> directly to write the blob.
>
> The ASN.1 blobs have to be of fixed size since a growing or shrinking
> first blob would either require moving all subsequent blobs or destroy
> the integrity of the ASN.1 stream.
>
>
> One complication is the size requirement of the NVRAM combined with the
> fact that virtual machine images typically don't grow. Here users may
> need a priori knowledge of how large the NVRAM has to be for the device
> to work properly. In the case of the TPM, for example, the TPM requires
> a virtual machine image of a certain size for it to be able to write
> all its blobs into. It may be necessary for human users to start QEMU
> once to find out the required size of the NVRAM image (using an HMP
> command) or get it through documentation. In the case of libvirt, the
> required image size could be hard-coded into libvirt since it will not
> change anymore and is a property of the device. Another possibility
> would be to use QEMU APIs to resize the image before formatting (this
> at least did not work a few months ago if I recall correctly, or did
> not work with all VM image formats; the details here have faded from
> memory...)
>
> I think this is enough detail for now. Please let me know of any
> comments you may have. My primary concern for now is to get clarity on
> the layout of the data inside the virtual machine image. The ASN.1
> visitors were written for this purpose.
>
>
> Thanks and regards,
>     Stefan