On Tue, Apr 29, 2025 at 01:09:33AM +0200, Max Kellermann wrote:
> If bio_add_folio() fails (because it is full),
> erofs_fileio_scan_folio() needs to submit the I/O request via
> erofs_fileio_rq_submit() and allocate a new I/O request with an empty
> `struct bio`. Then it retries the bio_add_folio() call.
>
> However, at this point, erofs_onlinefolio_split() has already been
> called which increments `folio->private`; the retry will call
> erofs_onlinefolio_split() again, but there will never be a matching
> erofs_onlinefolio_end() call. This leaves the folio locked forever
> and all waiters will be stuck in folio_wait_bit_common().
>
> This bug has been added by commit ce63cb62d794 ("erofs: support
> unencoded inodes for fileio"), but was practically unreachable because
> there was room for 256 folios in the `struct bio` - until commit
> 9f74ae8c9ac9 ("erofs: shorten bvecs[] for file-backed mounts") which
> reduced the array capacity to 16 folios.
>
> It was now trivial to trigger the bug by manually invoking readahead
> from userspace, e.g.:
>
>   posix_fadvise(fd, 0, st.st_size, POSIX_FADV_WILLNEED);
>
> This should be fixed by invoking erofs_onlinefolio_split() only after
> bio_add_folio() has succeeded. This is safe: asynchronous completions
> invoking erofs_onlinefolio_end() will not unlock the folio because
> erofs_fileio_scan_folio() is still holding a reference to be released
> by erofs_onlinefolio_end() at the end.
>
> Fixes: ce63cb62d794 ("erofs: support unencoded inodes for fileio")
> Fixes: 9f74ae8c9ac9 ("erofs: shorten bvecs[] for file-backed mounts")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Max Kellermann <max.kellerm...@ionos.com>

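For anyone who wants to try the userspace trigger quoted above, a minimal
sketch of it could look like the following. The helper name
readahead_whole_file() is made up for illustration and is not part of the
patch; it only wraps the fstat() + posix_fadvise() pair from the commit
message. It is an assumption here that the file has to live on a
file-backed EROFS mount, on a kernel with both commits named above and
without the fix, for the hang to show up.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L	/* expose posix_fadvise() in <fcntl.h> */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not from the patch): ask the kernel to read
 * ahead the whole file behind `fd`, as in the quoted one-liner.
 * Returns 0 on success or an errno-style error number on failure.
 */
int readahead_whole_file(int fd)
{
	struct stat st;

	if (fstat(fd, &st) != 0)
		return errno;	/* fstat() reports errors through errno */

	/* posix_fadvise() returns the error number directly, not via errno. */
	return posix_fadvise(fd, 0, st.st_size, POSIX_FADV_WILLNEED);
}
```

The caller would open() a file on the mount read-only and pass the
descriptor in. The call itself only requests readahead, so a later read of
the same file is what would end up waiting on a stuck folio.
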
Thanks for catching this! LGTM:

Reviewed-by: Gao Xiang <xi...@kernel.org>

Thanks,
Gao Xiang