On Wed, Apr 29, 2026 at 11:35:36AM -0400, Zi Yan wrote:
>collapse_file() is capable of collapsing pagecache folios from writable
>files to PMD folios. Now enable clean pagecache folio collapse in addition
>to read-only pagecache folio collapse by removing the
>inode_is_open_for_write() from file_thp_enabled() and only performing
>filemap_flush() if the file is read-only.
>
>This means userspace needs to explicitly flush the content of pagecache
>folios before khugepaged can collapse the folios, or use
>madvise(MADV_COLLAPSE), which does the flush in the retry. The reason is
>that blindly enabling dirty pagecache folio from writable files collapse
>makes khugepaged flush these folios all the time. It is undesirable to
>cause system level pagecache flushes.
>
>To properly support dirty pagecache folio collapse, filemap_flush() needs
>to be avoided. Potentially, merging associated buffer instead of dropping
>it with filemap_release_folio() might be needed.
>
>NOTE: this breaks khugepaged selftests for writable file pagecache
>collapse, which is set to fail all the time. The next commit fix it.
>
>Signed-off-by: Zi Yan <[email protected]>
>---
> mm/huge_memory.c | 2 +-
> mm/khugepaged.c  | 9 ++++++++-
> 2 files changed, 9 insertions(+), 2 deletions(-)
>
>diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>index 9b3abb98a7e51..e1e9d59db6e70 100644
>--- a/mm/huge_memory.c
>+++ b/mm/huge_memory.c
>@@ -97,7 +97,7 @@ static inline bool file_thp_enabled(struct vm_area_struct
>*vma)
> 	if (!mapping_pmd_folio_support(vma->vm_file->f_mapping))
> 		return false;
>
>-	return !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
>+	return S_ISREG(inode->i_mode);
> }
>
> /* If returns true, we are unable to access the VMA's folios. */
>diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>index 1ee15b48962a3..fb7ff643973cc 100644
>--- a/mm/khugepaged.c
>+++ b/mm/khugepaged.c
>@@ -2345,7 +2345,14 @@ static enum scan_result collapse_file(struct mm_struct
>*mm, unsigned long addr,
> 				 * forcing writeback in loop.
> 				 */

Nit: the comment above now looks stale. It still says:

"There won't be new dirty pages."

That was true when file_thp_enabled() rejected writable-open files, but
not after this patch ;)

Otherwise, LGTM.

Reviewed-by: Lance Yang <[email protected]>

> 				xas_unlock_irq(&xas);
>-				filemap_flush(mapping);
>+				/*
>+				 * Only flush for read-only files. Writable
>+				 * files can have their folios dirty at any
>+				 * time; blindly flushing them would cause
>+				 * undesirable system-wide writeback.
>+				 */
>+				if (!inode_is_open_for_write(mapping->host))
>+					filemap_flush(mapping);
> 				result = SCAN_PAGE_DIRTY_OR_WRITEBACK;
> 				goto xa_unlocked;
> 			} else if (folio_test_writeback(folio)) {
>--
>2.53.0
>
>

