On Fri, Nov 18, 2022 at 03:12, Stefan Hajnoczi <stefa...@gmail.com> wrote:
>
> Hi Sam,
> Please send a git repo URL so Thomas can fetch the commit without
> email/file size limitations.

I'll push it to the zbd branch after fixing the issues below.
https://github.com/sgzerolc/qemu-web/zbd

>
> > diff --git a/_posts/2022-11-17-zoned-emulation.md
> > b/_posts/2022-11-17-zoned-emulation.md
> > new file mode 100644
> > index 0000000..69ce4d7
> > --- /dev/null
> > +++ b/_posts/2022-11-17-zoned-emulation.md
> > @@ -0,0 +1,45 @@
> > +---
> > +layout: post
> > +title: "Introduction to Zoned Storage Emulation"
> > +date: 2022-11-17
> > +author: Sam Li
> > +categories: [storage, gsoc, outreachy, internships]
> > +---
> > +
> > +## Zoned block devices
> > +
> > +Aimed for at-scale data infrastructures,
>
> I don't know what at-scale data infrastructure is. Is it something
> readers can relate to? Otherwise there's a risk that readers will
> decide this doesn't apply to them and stop reading.

Yes, I'll remove it.

>
> > zoned block devices (ZBDs) divide the LBA space into block regions called
> > zones that are larger than the LBA size.
>
> LBA is not defined and also not used again after this sentence.
> Readers will be familiar with disks but may not know what an LBA is.
> Since the concept isn't used again I suggest dropping it:
>
> zoned block devices (ZBDs) are divided into regions called zones
> that can only be written sequentially.
>
> > By only allowing sequential writes, it can reduce write amplification in
> > SSDs,
>
> This sounds more natural:
>
> By only allowing sequential writes, SSD write amplification can be reduced
>
> It might also be nice to provide a little bit of extra context:
>
> ... reduced by eliminating the need for a <a
> href="https://en.wikipedia.org/wiki/Flash_translation_layer">Flash
> Translation Layer</a>
>
> > and potentially lead to higher throughput and increased capacity. Providing
> > new storage software stack,
>
> s/Providing new/Providing a new/
>
> > zoned storage concept is standardized as ZBC(SCSI standard), ZAC(ATA
> > standard), ZNS(NVMe).
>
> Small tweaks:
>
> zoned storage concepts are standardized in ZBC (SCSI standard), ZAC
> (ATA standard), ZNS (NVMe).
>
> There is a space before opening parentheses: hello (world) instead of
> hello(world). Please check the rest of the article for more instances
> of this.
>
> It would be nice to include links but I didn't find good pages for
> ZBC/ZAC/ZNS aside from the full standards that they are part of.
>
> This intro section would be a good place to link to https://zonedstorage.io/!

Good idea! The Zoned Storage site also has a brief introduction to those standards:
https://zonedstorage.io/docs/introduction/smr#governing-standards
https://zonedstorage.io/docs/introduction/zns

>
> > Meanwhile, the virtio protocol for block devices(virtio-blk) should also be
> > aware of ZBDs instead of taking them as regular block devices. It should be
> > able to pass such devices through to the guest. An overview of necessary
> > work is as follows:
> > +
> > +1. Virtio protocol: [extend virtio-blk protocol with main zoned storage
> > concept](https://lwn.net/Articles/914377/), Dmitry Fomichev
> > +2. Linux: [implement the virtio specification
> > extensions](https://www.spinics.net/lists/linux-block/msg91944.html),
> > Dmitry Fomichev
> > +3. QEMU: add zoned emulation support to virtio-blk, Sam Li, [Outreachy
> > 2022
> > project](https://wiki.qemu.org/Internships/ProjectIdeas/VirtIOBlkZonedBlockDevices)
>
> You could split the QEMU work into 2 points if you like:
> 3. QEMU: add zoned storage APIs to the block layer, Sam Li
> 4. QEMU: implement zoned storage support in virtio-blk emulation, Sam Li
>
> > +
> > +<img src="/screenshots/zbd.png" alt="zbd" style="zoom:50%;" />
> > +
> > +## Zoned emulation
> > +
> > +Currently, QEMU can support zoned devices by virtio-scsi or PCI device
> > passthrough. It needs to specify the device type it is talking to. While
> > storage controller emulation uses block layer APIs instead of directly
> > accessing disk images.
> > Extending virtio-blk emulation avoids code
> > duplication and simplify the support by hiding the device types under a
> > unified zoned storage interface, simplifying VM deployment for different
> > type of zoned devices.
>
> Other advantages that come to mind:
> 1. virtio-blk can be implemented in hardware. If those devices wish to
> follow the zoned storage model then the virtio-blk specification needs
> to natively support zoned storage.
> 2. Individual NVMe namespaces or anything that is a zoned Linux block
> device can be exposed to the guest without passing through a full
> device.

Thanks!

>
> > +
> > +For zoned storage emulation, zoned storage APIs support three zoned
> > models(conventional, host-managed, host-aware) , four zone management
> > commands(Report Zone, Open Zone, Close Zone, Finish Zone), and Append Zone.
> > QEMU block storage
>
> Maybe:
> s/QEMU block storage/The QEMU block layer/
>
> > has a BlockDriverState graph that propagates device information inside
> > block layer. A root pointer at BlockBackend points to the graph. There are
> > three type of block driver nodes: filter node, format node, protocol node.
> > File-posix driver is the lowest level within the graph where zoned storage
> > APIs reside.
>
> Is it possible to remove "A root pointer at BlockBackend points to the
> graph. There are three type of block driver nodes: filter node, format
> node, protocol node." so there are fewer new concepts? I didn't see
> further use of BlockBackend or filter/format nodes in the text.

Yes, it can be removed.

>
> > +
> > +<img src="/screenshots/storage_overview.png" alt="storage_overview"
> > style="zoom: 50%;" />
> > +
> > +After receiving the block driver states, Virtio-blk emulation recognizes
> > zoned devices and sends the zoned feature bit to guest. Then the guest can
> > see the zoned device in the host.
> > When the guest executes zoned operations,
> > virtio-blk driver issues corresponding requests that will be captured by
> > virito-blk
>
> s/virito/virtio/
>
> > device inside QEMU. Afterwards, virtio-blk device sends the requests to
> > file-posix driver which will perform zoned operations.
> > +
> > +Unlike zone management operations, Linux doesn't have a user API
>
> The Linux userspace API (<linux/blkzoned.h>) hasn't been mentioned
> before. Maybe the previous paragraph should explain that file-posix
> performs zoned operations using <linux/blkzoned.h> ioctls? Then this
> sentence will be easier to understand.
>
> > to issue zone append requests to zoned devices from user space. With the
> > help of write pointer emulation tracking locations of write pointer of each
> > zone, QEMU block layer performs append writes by modifying regular writes.
> > Write pointer locks guarantee the execution of requests. Upon failure it
> > must not update the write pointer location which is only got updated when
> > the request is successfully finished.
> > +
> > +Problems can always be sovled
>
> s/sovled/solved/

Thanks for your comments,
Sam
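
For illustration, the write pointer emulation described in the draft (track each zone's write pointer, turn an append into a regular write at that position, hold a lock while doing so, and advance the pointer only when the write succeeds) could be sketched roughly as below. This is a hedged Python sketch, not QEMU's actual C code: the names `EmulatedZone`, `ZoneFullError`, and `backend_write` are hypothetical, and the per-zone lock and byte-based offsets are assumptions made for the example.

```python
import threading


class ZoneFullError(Exception):
    """Raised when an append would cross the end of the zone (hypothetical)."""


class EmulatedZone:
    """Hypothetical per-zone state: start offset, capacity, write pointer (bytes)."""

    def __init__(self, start, capacity):
        self.start = start
        self.capacity = capacity
        self.wp = start               # write pointer begins at the zone start
        self.lock = threading.Lock()  # assumption: one lock per zone

    def append(self, backend_write, data):
        """Emulate a zone append as a regular write at the current write pointer.

        backend_write(offset, data) stands in for the real write path and is
        expected to raise on failure. Returns the offset the data landed at.
        """
        with self.lock:
            offset = self.wp
            if offset + len(data) > self.start + self.capacity:
                raise ZoneFullError("append crosses zone boundary")
            backend_write(offset, data)   # if this raises, wp is left untouched
            self.wp = offset + len(data)  # advance only after success
            return offset
```

With a zone starting at offset 4096, two successive 4-byte appends would land at 4096 and 4100, and a failed `backend_write` would leave the write pointer where it was, which is the behaviour the last paragraph of the draft asks for.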