On 25.09.2023 at 13:58, Dimitry Andric wrote:
Both "nvmecontrol identify nda0" and "nvmecontrol identify nvd0" (the latter after setting hw.nvme.use_nvd="1" and rebooting) give the same result:
Number of LBA Formats:       1
Current LBA Format:          LBA Format #00
LBA Format #00: Data Size:   512  Metadata Size:     0  Performance: Best
...
Optimal I/O Boundary:        0 blocks
NVM Capacity:                1000204886016 bytes
Preferred Write Granularity: 32 blocks
Preferred Write Alignment:   8 blocks
Preferred Deallocate Granul: 9600 blocks
Preferred Deallocate Align:  9600 blocks
Optimal Write Size:          256 blocks
My guess is that the "Preferred Write Granularity" is the optimal size, in this
case 32 'blocks' of 512 bytes, so 16 KiB. This also matches the stripe size reported by
geom, as you showed.

The "Preferred Write Alignment" is 8 * 512 bytes = 4 KiB, so you should align
partitions etc. to at least this. However, it cannot hurt to align everything to 16 KiB
either, which is an integer multiple of 4 KiB.
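
The arithmetic above can be checked with a small sketch. The helper names is_aligned and round_up are hypothetical, and the partition start at sector 40 used below is only an illustrative value, not taken from this system.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if offset is a multiple of align (both in bytes). */
static int
is_aligned(uint64_t offset, uint64_t align)
{
	assert(align != 0);
	return (offset % align == 0);
}

/* Hypothetical helper: round offset up to the next multiple of align (bytes). */
static uint64_t
round_up(uint64_t offset, uint64_t align)
{
	assert(align != 0);
	return ((offset + align - 1) / align * align);
}
```

For example, a start at sector 40 (40 * 512 = 20480 bytes) is aligned to 4 KiB (8 * 512) but not to 16 KiB (32 * 512); round_up(20480, 16384) gives 32768 bytes, which is sector 64 and satisfies both alignments.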

Eugene gave me a tip, so I looked into the drivers.

dev/nvme/nvme_ns.c:
uint32_t
nvme_ns_get_stripesize(struct nvme_namespace *ns)
{
        uint32_t ss;

        if (((ns->data.nsfeat >> NVME_NS_DATA_NSFEAT_NPVALID_SHIFT) &
            NVME_NS_DATA_NSFEAT_NPVALID_MASK) != 0) {
                ss = nvme_ns_get_sector_size(ns);
                if (ns->data.npwa != 0)
                        return ((ns->data.npwa + 1) * ss);
                else if (ns->data.npwg != 0)
                        return ((ns->data.npwg + 1) * ss);
        }
        return (ns->boundary);
}

cam/nvme/nvme_da.c:
        if (((nsd->nsfeat >> NVME_NS_DATA_NSFEAT_NPVALID_SHIFT) &
            NVME_NS_DATA_NSFEAT_NPVALID_MASK) != 0 && nsd->npwg != 0)
                disk->d_stripesize = ((nsd->npwg + 1) * disk->d_sectorsize);
        else
                disk->d_stripesize = nsd->noiob * disk->d_sectorsize;

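So it seems that nvd uses "sector size * Write Alignment" as the stripe size (falling back to Write Granularity only when the alignment field is zero), while nda uses "sector size * Write Granularity". For this drive that means 4 KiB with nvd versus 16 KiB with nda.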
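
To make the difference concrete, here is a minimal standalone sketch of the two calculations; it is not the driver code itself. The struct ns_fields and both function names are made up for illustration. It assumes that nvmecontrol prints the 0-based NPWG/NPWA fields plus one (so the raw values for this drive would be 31 and 7), and that ns->boundary in nvd equals noiob * sector size. Neither assumption is shown in the excerpts above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the few namespace fields used by both drivers. */
struct ns_fields {
	int      npvalid;      /* nonzero if the NPVALID bit in nsfeat is set */
	uint16_t npwg;         /* raw, 0-based Preferred Write Granularity */
	uint16_t npwa;         /* raw, 0-based Preferred Write Alignment */
	uint16_t noiob;        /* Optimal I/O Boundary, in blocks */
	uint32_t sector_size;  /* bytes per LBA */
};

/* nvd-style: prefer the alignment, fall back to the granularity. */
static uint32_t
nvd_style_stripesize(const struct ns_fields *f)
{
	assert(f->sector_size != 0);
	if (f->npvalid) {
		if (f->npwa != 0)
			return ((f->npwa + 1) * f->sector_size);
		else if (f->npwg != 0)
			return ((f->npwg + 1) * f->sector_size);
	}
	/* Assumption: ns->boundary is noiob * sector size. */
	return (f->noiob * f->sector_size);
}

/* nda-style: use the granularity only. */
static uint32_t
nda_style_stripesize(const struct ns_fields *f)
{
	assert(f->sector_size != 0);
	if (f->npvalid && f->npwg != 0)
		return ((f->npwg + 1) * f->sector_size);
	return (f->noiob * f->sector_size);
}
```

With npvalid = 1, npwg = 31, npwa = 7, noiob = 0 and 512-byte sectors, the nvd-style function returns 4096 and the nda-style function returns 16384.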

My current interpretation is that the nvd driver reports the wrong stripe size for maximum performance and reliability. I should make a backup and re-create the pool. Maybe we should note in the 14.0 release notes that the switch to nda is not a "nop".

--
Frank Behrens
Osterwieck, Germany

