Hello,

Thanks for working on this!

Contiguous writes are indeed one of the things we want to coalesce.
(We'd also want to prefetch coalesced reads; there's a whole GSoC
project idea about it on the wiki.)

Milos Nikic, on Mon, 25 Aug 2025 15:00:27 -0700, wrote:
> No stalls with 2-block chunks; larger chunks can starve writers, so the ext2
> default is 2.

2 looks like a very small aggregation...

I'm thinking that you'd probably want to avoid having pending_blocks_add
call pending_blocks_write, i.e. check for consecutive block numbers
yourself and break the for(b) loop when they are not consecutive, i.e.
use non-consecutive block allocations as unlock hints, so that
store_write gets called exactly once per lock acquisition. You'd
probably still want to set an upper limit on the number of consecutive
blocks, but it'd probably be fine with larger values, and it avoids
making a lot of store_write calls when block numbers are not consecutive
at all.
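
Roughly something like the following, as a standalone toy rather than
against your actual patch (fake_store_write, MAX_RUN and the lock here
are stand-ins for the real store_write / find_block / alloc_lock
machinery):

/* Toy: extend a run while block numbers stay consecutive, and issue
   exactly one write per rdlock acquisition.  */
#include <stdio.h>
#include <pthread.h>

#define MAX_RUN 64   /* a generous cap is fine, we flush at most once
                        per lock acquisition anyway */

static pthread_rwlock_t alloc_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Stand-in for store_write on the device.  */
static void
fake_store_write (size_t start_block, size_t nblocks)
{
  printf ("store_write: blocks %zu..%zu in one call\n",
          start_block, start_block + nblocks - 1);
}

int
main (void)
{
  /* Pretend these are the device blocks backing consecutive file
     blocks; note the discontinuity after 102.  */
  size_t blocks[] = { 100, 101, 102, 200, 201, 202, 203 };
  size_t n = sizeof blocks / sizeof blocks[0];
  size_t i = 0;

  while (i < n)
    {
      pthread_rwlock_rdlock (&alloc_lock);

      size_t start = blocks[i];
      size_t run = 1;
      /* Extend the run while block numbers stay consecutive; a
         discontinuity is the unlock hint.  */
      while (i + run < n && blocks[i + run] == start + run
             && run < MAX_RUN)
        run++;

      fake_store_write (start, run);      /* one store_write per lock */
      pthread_rwlock_unlock (&alloc_lock);
      i += run;
    }
  return 0;
}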

> Only FILE_DATA uses pager_write_pages(). DISK (swap, etc.) remains per-page
> for now.
> 
> If this approach looks good, I can follow up with:
> 
>   • an adaptive chunk size (back off when the rdlock is contended),

I don't know how you'd test whether the rdlock is contended? Just by
using a tryrdlock to determine whether you can acquire it immediately
or not?
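
If so, I mean something like this (compile-only sketch; chunk,
MIN_CHUNK, MAX_CHUNK and the growth policy are made up, only
pthread_rwlock_tryrdlock is real):

#include <pthread.h>

#define MIN_CHUNK 2
#define MAX_CHUNK 64

static pthread_rwlock_t alloc_lock = PTHREAD_RWLOCK_INITIALIZER;
/* Blocks flushed per lock acquisition; it's only a heuristic, so an
   unsynchronized update is good enough.  */
static unsigned int chunk = MAX_CHUNK;

void
take_rdlock_adaptively (void)
{
  if (pthread_rwlock_tryrdlock (&alloc_lock) == 0)
    {
      /* Uncontended: grow back toward the maximum.  */
      if (chunk < MAX_CHUNK)
        chunk *= 2;
    }
  else
    {
      /* A writer holds (or is queued for) the lock: back off, then
         wait for it normally.  */
      if (chunk > MIN_CHUNK)
        chunk /= 2;
      pthread_rwlock_rdlock (&alloc_lock);
    }
}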

About the yield() call, does it actually help? Thread wake-ups will
already try to switch to the awakened thread if dynamic priorities
warrant it; yield() won't do more than that.

>   • optional support for other filesystems,

Why optional?

>   • a DISK bulk path if we decide it’s worthwhile.

I don't see how it could not be worthwhile?


> @@ -378,6 +379,121 @@ pending_blocks_add (struct pending_blocks *pb, block_t block)
>    pb->num++;
>    return 0;
>  }
> +
> +/* Keep per-chunk work small to avoid starving writers on alloc_lock. */
> +#define EXT2_BULK_CHUNK_BLOCKS  2  /* 2 * block_size per flush; */

Rather put it at the top of the file for better visibility.


> +  if (offset >= node->allocsize)
> +    left = 0;
> +  else if (offset + left > node->allocsize)
> +    left = node->allocsize - offset;

Better put these at the top of the while loop so they're factorized.
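
I.e. something like this (compile-only sketch; only allocsize comes
from your hunk, struct node and do_chunk_write here are trimmed-down
stand-ins):

#include <stddef.h>
#include <sys/types.h>

struct node { off_t allocsize; };

/* Stand-in for the existing per-chunk store_write logic.  */
static size_t
do_chunk_write (struct node *node, off_t offset, size_t amount)
{
  (void) node; (void) offset;
  return amount;
}

void
write_loop (struct node *node, off_t offset, size_t left)
{
  while (left > 0)
    {
      /* Clamp against allocsize once here, at the top of the loop,
         instead of repeating the checks in each branch below.  */
      if (offset >= node->allocsize)
        break;
      if (offset + (off_t) left > node->allocsize)
        left = node->allocsize - offset;

      size_t done = do_chunk_write (node, offset, left);
      offset += done;
      left -= done;
    }
}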

Samuel
