On 4/29/26 17:35, Zi Yan wrote: > collapse_file() is capable of collapsing pagecache folios from writable > files to PMD folios. Now enable clean pagecache folio collapse in addition > to read-only pagecache folio collapse by removing the > inode_is_open_for_write() from file_thp_enabled() and only performing > filemap_flush() if the file is read-only. > > This means userspace needs to explicitly flush the content of pagecache > folios before khugepaged can collapse the folios, or use > madvise(MADV_COLLAPSE), which does the flush in the retry. The reason is > that blindly enabling dirty pagecache folio from writable files collapse > makes khugepaged flush these folios all the time. It is undesirable to > cause system level pagecache flushes. > > To properly support dirty pagecache folio collapse, filemap_flush() needs > to be avoided. Potentially, merging associated buffer instead of dropping > it with filemap_release_folio() might be needed. > > NOTE: this breaks khugepaged selftests for writable file pagecache > collapse, which is set to fail all the time. The next commit fix it. > > Signed-off-by: Zi Yan <[email protected]> > --- > mm/huge_memory.c | 2 +- > mm/khugepaged.c | 9 ++++++++- > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index 9b3abb98a7e51..e1e9d59db6e70 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -97,7 +97,7 @@ static inline bool file_thp_enabled(struct vm_area_struct > *vma) > if (!mapping_pmd_folio_support(vma->vm_file->f_mapping)) > return false; > > - return !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode); > + return S_ISREG(inode->i_mode); > } > > /* If returns true, we are unable to access the VMA's folios. */ > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > index 1ee15b48962a3..fb7ff643973cc 100644 > --- a/mm/khugepaged.c > +++ b/mm/khugepaged.c > @@ -2345,7 +2345,14 @@ static enum scan_result collapse_file(struct mm_struct > *mm, unsigned long addr, > * forcing writeback in loop. > */ > xas_unlock_irq(&xas); > - filemap_flush(mapping); > + /* > + * Only flush for read-only files. Writable > + * files can have their folios dirty at any > + * time; blindly flushing them would cause > + * undesirable system-wide writeback. > + */
That comment should really be merged in the comment above. Also, there we say "khugepaged only works on read-only fd" ... which is now just wrong? Please revise that whole comment as you incorporate your comment. Apart from that I guess this is fine ... or we'll learn rather quickly, haha. -- Cheers, David

