On Fri, Dec 06, 2024 at 10:51:20PM +0700, Max Nikulin wrote:
> Michael, thank you for the long message. Actually I wonder what is
> "idle" that allows a drive to perform self-maintenance. I expect that
> the device should not be in some deep power saving state (I am yet to
> discover available tunables that allow a drive to "sleep"). Should it
> be some period of time (seconds? minutes?) completely without any IO
> or is it enough if read/write speed is below some threshold and
> from/to another chip?

Basically it means that the drive isn't busy doing host I/O; while it is
servicing your reads or writes, it can't also be doing its own internal
reads and writes. It doesn't need to be absolutely unused.

> As to erase block size, I am aware of it. On the other hand I am
> surprised that a drive does not allow kernel to optimize writes on a
> higher level (as uSD does):
>
> grep '' /sys/block/*/queue/discard_granularity
> ...
> /sys/block/mmcblk0/queue/discard_granularity:4194304
> /sys/block/nvme0n1/queue/discard_granularity:4096
> /sys/block/sda/queue/discard_granularity:4096 # hdd (shingled)

The discard_granularity *limits* how the kernel can tell the drive that
there are free blocks--a granularity of 4M means that the kernel can
only issue a TRIM command when it has at least 4M of empty space *and*
that empty space is aligned on a 4M boundary. (That is, you can't
discard locations 2-5M on the drive, only 0-3M, 4-7M, etc.) It's a big
number on the sd card because sd cards are pretty much junk. On a decent
NVMe drive it'll typically be 512 (i.e., you can discard any logical
block) or maybe 4096 if you're in 4k mode.
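
To make the alignment arithmetic concrete, here is a rough sketch of how a free byte range shrinks to the part that can actually be discarded at a given granularity. It only illustrates the rounding described above and is not the kernel's actual code; the function name aligned_discard_range and the example ranges are made up for illustration.

```python
MiB = 1024 * 1024


def aligned_discard_range(start, length, granularity):
    """Return (start, length) of the largest granularity-aligned sub-range
    of the free range [start, start + length), or None if there is none.

    Hypothetical helper: a simplified model of the alignment rule only.
    """
    end = start + length
    # Round the start up and the end down to a multiple of the granularity.
    first = -(-start // granularity) * granularity
    last = (end // granularity) * granularity
    if last <= first:
        return None  # no whole aligned unit fits inside the free range
    return first, last - first


if __name__ == "__main__":
    # Free space covering 2M up to 6M with a 4M granularity: nothing usable.
    print(aligned_discard_range(2 * MiB, 4 * MiB, 4 * MiB))
    # The same free space with a 4096-byte granularity: all of it is usable.
    print(aligned_discard_range(2 * MiB, 4 * MiB, 4096))
    # Free space covering 0 up to 8M with a 4M granularity: two whole units.
    print(aligned_discard_range(0, 8 * MiB, 4 * MiB))
```

With the 4M granularity the first case yields nothing to discard even though 4M is free, which is the practical cost of a coarse discard_granularity.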