Mouse <mo...@rodents-montreal.org> writes:

>> The semantics of fdiscard(2) are a bit messy, with TRIM and undefined
>> contents.
>
> The messy bits apply only when discarding space on a device, as I read
> the (9.1) manpage.  I suspect that's because that's how devices work:
> dropped blocks return, essentially, whatever the implementation finds
> most convenient.  Requiring anything else would mean it simply couldn't
> work on existing devices.

Agreed about devices, but fdiscard is a filesystem operation.  I find
it bizarre for a fs op whose basic semantics are:

  turn this region of a file into a hole, more or less as if it were not
  written

to change to

  for this part of the file, either make it a hole, or tell the
  underlying device -- whatever that means -- that you are ok with those
  data blocks having UB when read


The essence of fdiscard, as I read it, is to drop those blocks from the
inode.  What happens to those blocks after that is another matter.

>> That's surprising to me, as I see telling the hardware that blocks
>> are no longer needed seems separable from a file no longer
>> referencing those blocks.
>
> Conceptually, so do I.  But, on a device with a filesystem on it, when
> discarding part of a file, the blocks are no longer used, so they might
> as well be discarded; when discarding blocks, they'd better no longer
> be part of a file.  So tying them together makes at least some sense
> when discarding space in a filesystem file.

I agree that while separable, there can be a sequential relationship.

It is reasonable to tell a device that the OS no longer cares about
block contents, when those blocks are placed on the free list, via
fdiscard(2), just as via (the last) unlink(2).

It does not seem reasonable to tell a device the OS no longer cares
about blocks that are still allocated to a file.  I could see defining a
fmakeub(2) call that tells the OS that it's ok to return arbitrary bits
for that range, vs making them a hole -- while keeping the blocks in the
file.  But I can't see anyone wanting to use that, except maybe a
database that's essentially implementing a filesystem within a file.

I would want to understand what kinds of discard operations are
available in relatively modern SSDs and flash drives (which I suspect
differ), what they actually do, and whether there is any point in calling
them, before completing a design for fdiscard(2).

I could also see a reasonable option of implementing fdiscard(2) as a
FS-layer operation, and not making any TRIM-like calls.
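
To make the separation concrete, here is a toy sketch in Python of the
model I have in mind.  It is not NetBSD code: ToyFS, ToyDevice, and every
method name are made up for illustration, and the block size is
arbitrary.  Discarding a file block only drops it from the inode, so the
range reads back as a hole (zeros).  Telling the device is a separate,
optional step that happens when a block lands on the free list, and the
same path serves both discard and unlink.

```python
# Toy model only: ToyDevice, ToyFS and all method names are hypothetical,
# not NetBSD kernel or libc interfaces.
BLOCK = 4  # arbitrary toy block size, in bytes


class ToyDevice:
    def __init__(self, nblocks):
        self.data = [b"\0" * BLOCK for _ in range(nblocks)]
        self.trimmed = set()  # blocks the OS has said it no longer cares about

    def write(self, blkno, payload):
        self.trimmed.discard(blkno)  # a rewritten block holds defined data again
        self.data[blkno] = payload.ljust(BLOCK, b"\0")[:BLOCK]

    def trim(self, blkno):
        self.trimmed.add(blkno)  # contents are now unspecified on the device


class ToyFS:
    def __init__(self, dev, issue_trim=False):
        self.dev = dev
        self.issue_trim = issue_trim  # FS-layer only when False
        self.free = list(range(len(dev.data)))
        self.inode = {}  # file block index -> device block number

    def write(self, idx, payload):
        if idx not in self.inode:
            self.inode[idx] = self.free.pop(0)
        self.dev.write(self.inode[idx], payload)

    def read(self, idx):
        blkno = self.inode.get(idx)
        if blkno is None:
            return b"\0" * BLOCK  # a hole always reads as zeros
        return self.dev.data[blkno]

    def _release(self, blkno):
        # The only place the device is told anything: the block is
        # already on the free list and no file references it.
        self.free.append(blkno)
        if self.issue_trim:
            self.dev.trim(blkno)

    def discard(self, idx):
        # FS-layer operation: drop the block from the inode.
        blkno = self.inode.pop(idx, None)
        if blkno is not None:
            self._release(blkno)

    def unlink(self):
        # Last unlink frees every block through the same path.
        for idx in list(self.inode):
            self.discard(idx)
```

With issue_trim=False the file-level behaviour is identical; the device
simply never hears about the freed blocks.  In neither case does a block
that is still in the inode get marked as don't-care.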
