Hi,
Adding top/free output captured while our application pipeline is running.
> Hi,
>
> A question about the workqueue for percpu.
>
> > > >
> > > > And a question about this,
> > > > > > > > upper caller:
> > > > > > > > nofs_flag = memalloc_nofs_save();
> > > > > > > > ret = btrfs_drew_lock_init(&root->snapshot_lock);
> > > > > > > > memalloc_nofs_restore(nofs_flag);
> > > > >
> > > > > The issue is here. nofs is set which means percpu attempts an atomic
> > > > > allocation. If it cannot find anything already allocated it isn't
> > > > > happy.
> > > > > This was done before memalloc_nofs_{save/restore}() were pervasive.
> > > > >
> > > > > Percpu should probably try to allocate some pages if possible even if
> > > > > nofs is set.
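A minimal userspace sketch of the behaviour described above, for readers following the thread. It assumes the percpu allocator treats any request whose effective flags lack part of GFP_KERNEL as atomic, and that a nofs scope strips the FS bit from every allocation made inside it. All MODEL_* names, model_* functions and bit values are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical bit values for illustration; the kernel's real values differ. */
#define MODEL_GFP_RECLAIM 0x1u
#define MODEL_GFP_IO      0x2u
#define MODEL_GFP_FS      0x4u
#define MODEL_GFP_KERNEL  (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)

/* Models the per-task nofs scope that memalloc_nofs_save() opens. */
static bool task_nofs;

/* Enter the nofs scope and return the previous state, like the save call. */
static unsigned int model_nofs_save(void)
{
    unsigned int old = task_nofs ? 1u : 0u;

    task_nofs = true;
    return old;
}

/* Leave the nofs scope by restoring the state returned by the save call. */
static void model_nofs_restore(unsigned int old)
{
    task_nofs = (old != 0u);
}

/* Inside a nofs scope the FS bit is masked out of the caller's flags. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
    return task_nofs ? (gfp & ~MODEL_GFP_FS) : gfp;
}

/* Assumed rule: anything short of full GFP_KERNEL is handled as atomic. */
static bool model_alloc_is_atomic(unsigned int gfp)
{
    gfp = model_effective_gfp(gfp);
    return (gfp & MODEL_GFP_KERNEL) != MODEL_GFP_KERNEL;
}
```

Under these assumptions, a caller that passes full GFP_KERNEL flags is still treated as atomic whenever it runs between the save and restore calls, which matches the btrfs_drew_lock_init() call path quoted above.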
> > > >
> > > > Should we check and pre-allocate memory inside memalloc_nofs_restore()?
> > > > Another memalloc_nofs_save() may come soon.
> > > >
> > > > Or something like this in memalloc_nofs_save()?
> > > > if (pcpu_nr_empty_pop_pages[type] < PCPU_EMPTY_POP_PAGES_LOW)
> > > >         pcpu_schedule_balance_work();
> > > >
> > >
> > > Percpu does do this via a workqueue item. The issue is in v5.9 we
> > > introduced 2 types of chunks. However, the free float page number was
> > > for the total. So even if 1 chunk type dropped below, the other chunk
> > > type might have enough pages. I'm queuing this for 5.12 and will send it
> > > out assuming it does fix your problem.
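A minimal userspace sketch of why a check against the total can miss a shortage in one chunk type. It is only a model of the explanation above, not the queued patch: the enum, the counter array and the watermark value are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; names echo the thread but this is not kernel code. */
enum chunk_type { CHUNK_ROOT, CHUNK_MEMCG, NR_CHUNK_TYPES };

/* Hypothetical low watermark of empty populated pages. */
#define EMPTY_POP_PAGES_LOW 2

/* Empty populated pages usable by atomic allocations, per chunk type. */
static int nr_empty_pop_pages[NR_CHUNK_TYPES];

/* Check against the sum over all types: one type can hide another. */
static bool needs_balance_total(void)
{
    int total = 0;

    for (int t = 0; t < NR_CHUNK_TYPES; t++)
        total += nr_empty_pop_pages[t];
    return total < EMPTY_POP_PAGES_LOW;
}

/* Per-type check: a shortage in one type is not masked by the other. */
static bool needs_balance_type(enum chunk_type type)
{
    assert((int)type >= 0 && (int)type < NR_CHUNK_TYPES);
    return nr_empty_pop_pages[type] < EMPTY_POP_PAGES_LOW;
}
```

With zero empty pages in one type and eight in the other, the total-based check reports that no rebalancing is needed, while the per-type check flags the empty type.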
>
> Is the percpu workqueue perhaps not strong enough (its work item not
> scheduled?) under high CPU load?
>
> This is our application pipeline:
> file_pre_process |
> bwa.nipt xx |
> samtools.nipt sort xx |
> file_post_process
>
> file_pre_process and file_post_process are fast, so they are often blocked
> on pipe input/output.
>
> 'bwa.nipt xx' is a high-CPU load; it uses almost all of the CPU cores.
>
> 'samtools.nipt sort xx' is a high-memory load; it keeps the input in memory.
> If there is not enough memory, it saves the whole buffer to a temp file,
> so it is sometimes a high-IO load too (writing 60G or more to file).
>
>
> xfstests (generic/476) is just a high-IO load; its CPU/memory load is NOT high.
> So xfstests (generic/476) may be an easier case than our application pipeline.
# nproc
40
# top
top - 15:43:06 up 10:16, 1 user, load average: 41.39, 37.90, 35.98
Tasks: 488 total, 3 running, 485 sleeping, 0 stopped, 0 zombie
%Cpu(s): 99.6 us, 0.1 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.3 hi, 0.0 si, 0.0 st
MiB Mem : 58.3/193384.1 [||||||||||||||||||||||||||||||||||||||||||||||||||||||                    ]
MiB Swap:  0.0/0.0      [                                                                          ]
# free -h
              total        used        free      shared  buff/cache   available
Mem:          188Gi        98Gi       5.8Gi        17Mi        84Gi        78Gi
Swap:            0B          0B          0B
Memory reclaim from 'buff/cache' can easily happen here.
Best Regards
Wang Yugui ([email protected])
2021/04/09
> Although there is not yet a simple reproducer for the other problem
> that happened here, there is a fairly high chance that something is wrong
> in btrfs/mm/fs-buffer.
> > But another problem (the OS froze without a call trace; PANIC without OOPS?
> > The reason is still unknown) still happens.