On Thu, Jun 03, 2021 at 12:04:06PM +0200, Klaus Jensen wrote: > On Jun 3 10:48, Stefan Hajnoczi wrote: > > On Tue, Jun 01, 2021 at 08:07:48PM +0530, Gollu Appalanaidu wrote: > > > @@ -5546,6 +5665,47 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr > > > offset, uint64_t data, > > > NVME_GUEST_ERR(pci_nvme_ub_mmiowr_cmbsz_readonly, > > > "invalid write to read only CMBSZ, ignored"); > > > return; > > > + case 0x44: /* BPRSEL */ > > > + n->bar.bprsel = data & 0xffffffff; > > > + size_t bp_len = NVME_BPRSEL_BPRSZ(n->bar.bprsel) * 4 * KiB; > > > + int64_t bp_offset = NVME_BPRSEL_BPROF(n->bar.bprsel) * 4 * KiB; > > > + int64_t off = 0; > > > + struct nvme_bp_read_ctx *ctx; > > > + > > > + trace_pci_nvme_mmio_bprsel(data, n->bar.bprsel, > > > + NVME_BPRSEL_BPID(n->bar.bpinfo), > > > + bp_offset, bp_len); > > > + > > > + if (bp_len + bp_offset > n->bp_size) { > > > + NVME_BPINFO_CLEAR_BRS(n->bar.bpinfo); > > > + NVME_BPINFO_SET_BRS(n->bar.bpinfo, NVME_BPINFO_BRS_ERROR); > > > + return; > > > + } > > > + > > > + off = NVME_BPRSEL_BPID(n->bar.bpinfo) * n->bp_size + bp_offset; > > > + > > > + NVME_BPINFO_CLEAR_BRS(n->bar.bpinfo); > > > + NVME_BPINFO_SET_BRS(n->bar.bpinfo, NVME_BPINFO_BRS_READING); > > > + > > > + ctx = g_new(struct nvme_bp_read_ctx, 1); > > > + > > > + ctx->n = n; > > > + > > > + pci_dma_sglist_init(&ctx->qsg, &n->parent_obj, 1); > > > + > > > + qemu_sglist_add(&ctx->qsg, n->bar.bpmbl, bp_len); > > > + > > > + dma_blk_read(n->blk_bp, &ctx->qsg, off , BDRV_SECTOR_SIZE, > > > + nvme_bp_read_cb, ctx); > > > > The returned BlockAIOCB is not stored. Two questions: > > > > Thanks for these comments Stefan, I've commented below how I think they > can be resolved. > > > 1. Can the guest allocate unbounded amounts of QEMU memory (struct > > nvme_bp_read_ctx) by repeatedly writing to this register? > > > > Yeah, the QSQ should just be a member of the controller struct and then have > that as the cb_arg to dma_blk_read. Then, the QSG can be initialized and the > host address added when BPMBL is written instead of here. > > We don't want the QSG to change while the read is in progress, so the write > to BPMBL should check BPINFO.BRS and not do anything if the QSG is "in use". > The spec says that the host *shall* not modify the registers when a read is > in progress, so we can safely just ignore the write. > > > 2. What happens if the NVMe device is hot unplugged or reset while a > > boot partition read request is in flight? > > The spec also says that the host *shall* not reset or shutdown the > controller while a read is in progress, but again, we need to protect QEMU > so the aiocb must be saved on the controller struct and cancelled > appropriately.
Sounds good. There is documentation about secure coding practices in docs/devel/secure-coding-practices.rst (especially the stuff specific to device emulation is interesting). It's possible that -device nvme will become common in production virtual machines and it's hard to address stability and security after the code has been written. Thanks for taking this feedback on board even though it may not be relevant to NVMe test/bringup/prototyping use cases. Stefan
signature.asc
Description: PGP signature