On Mon, Jan 28, 2019 at 10:16:27AM +0100, Jan Kara wrote:
> On Sun 27-01-19 16:36:34, valdis.kletni...@vt.edu wrote:
> > On Sun, 27 Jan 2019 17:00:27 +0100, Pavel Machek said:
> >
> > > > > I've noticed this as well on earlier kernels (next-20181224 to
> > > > > 20190115)
> > > > > Some more info:
> > > > > 1) echo 3 > /proc/sys/vm/drop_caches unwedges kcompactd in 1-3
> > > > > seconds.
> > > > This aspect is curious as it indicates that kcompactd could potentially
> > > > be infinite looping but it's not something I've experienced myself. By
> > > > any chance is there a preditable reproduction case for this?
> > >
> > > I seen it exactly once, so not sure how reproducible this is. x86-32
> > > machine, running chromium browser, so yes, there was some swapping
> > > involved.
> >
> > I don't have a surefire replicator, but my laptop (x86_64, so it's not a
> > 32-bit only issue) triggers it fairly often, up to multiple times a day.
> > Doesn't seem to be just the Chrome browser that triggers it - usually I'm
> > doing other stuff as well, like a compile or similar. The fact that
> > 'drop_caches' clears it makes me wonder if we're hitting a corner case
> > where cache data isn't being automatically cleared and clogging
> > something up.
>
> So my buffer_migrate_page_norefs() is certainly buggy in its current
> incarnation (as a result block device page cache is not migratable at all).
> I've sent Andrew a patch over week ago but so far it got ignored. The patch
> is attached, can you give it a try whether it changes something for you?
> Thanks!
>

Definitely worth trying, and hopefully both the migration and compaction
patches sync up soon. In the event this patch does not help, I would
appreciate the following:

1) A trace while kcompactd is pegged at 100%

   trace-cmd record -a -e compaction -e migrate \
       -e kmem:mm_page_alloc \
       -e vmscan:mm_vmscan_kswapd_wake \
       -e vmscan:mm_vmscan_kswapd_sleep \
       sleep 10

   Compress the resulting trace.dat and email it to me. If it's too big for
   a reasonable email, drop "-e kmem:mm_page_alloc" from the command line
   and it should be a more reasonable size. If not, reduce the sleep time
   to gather a shorter interval.

2) Sample stack traces of kcompactd while pegged at 100%

   echo -n > /tmp/kcompactd-stack; for i in `seq 1 100`; do echo sample $i >> /tmp/kcompactd-stack; cat /proc/`pidof kcompactd0`/stack >> /tmp/kcompactd-stack; done; gzip -f /tmp/kcompactd-stack

   And mail me the resulting /tmp/kcompactd-stack.gz

Thanks.

-- 
Mel Gorman
SUSE Labs
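
For anyone who wants a quick look at the stack samples before mailing them, a rough sketch along these lines could tally which function is at the top of each sample. It is not part of the request above. The helper name top_frames is hypothetical, and it assumes each frame line looks like "[<0>] compact_zone+0x1a2/0x9b0" and that samples are separated by the "sample N" lines written by the loop in step 2. Samples that contain no frames are skipped.

```python
import re
from collections import Counter

# Assumed frame format: "[<0>] function_name+0xoffset/0xsize".
# Only the function name (up to '+' or whitespace) is captured.
FRAME_RE = re.compile(r"^\[<[0-9a-fx]+>\]\s+([^\s+]+)")


def top_frames(text):
    """Count the first (innermost) frame of every sample in the log text."""
    counts = Counter()
    want_frame = False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("sample "):
            # A new sample starts; the next frame line is its top frame.
            want_frame = True
            continue
        match = FRAME_RE.match(line)
        if want_frame and match:
            counts[match.group(1)] += 1
            want_frame = False
    return counts
```

To use it on the compressed file, something like top_frames(gzip.open("/tmp/kcompactd-stack.gz", "rt").read()).most_common(5) should print the most frequent top frames, assuming the file was produced by the loop in step 2.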