On Wed, 2 Aug 2017 22:46:30 -0700 Cong Wang <xiyou.wangc...@gmail.com> wrote:
> We saw many list corruption warnings on shmem shrinklist:
>
> ...
>
> The problem is that shmem_unused_huge_shrink() moves entries
> from the global sbinfo->shrinklist to its local lists and then
> releases the spinlock. However, a parallel shmem_setattr()
> could access one of these entries directly and add it back to
> the global shrinklist if it is removed, with the spinlock held.
>
> The logic itself looks solid since an entry could be either
> in a local list or the global list, otherwise it is removed
> from one of them by list_del_init(). So probably the race
> condition is that, one CPU is in the middle of INIT_LIST_HEAD()

Where is this INIT_LIST_HEAD()?

> but the other CPU calls list_empty() which returns true
> too early then the following list_add_tail() sees a corrupted
> entry.
>
> list_empty_careful() is designed to fix this situation.
>

I'm not sure I'm understanding this.  AFAICT all the list operations
to which you refer are synchronized under
spin_lock(&sbinfo->shrinklist_lock)?
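
For anyone following along, here is a minimal sketch of the scenario the
changelog describes.  This is not the actual mm/shmem.c code; the bodies
are condensed and error handling is dropped, only the struct, field and
lock names follow the kernel's:

        LIST_HEAD(list);                /* local list */
        struct list_head *pos, *next;
        struct shmem_inode_info *info;

        /* shmem_unused_huge_shrink(): splice entries off the global
         * list under the lock ... */
        spin_lock(&sbinfo->shrinklist_lock);
        list_for_each_safe(pos, next, &sbinfo->shrinklist) {
                info = list_entry(pos, struct shmem_inode_info, shrinklist);
                list_move(&info->shrinklist, &list);
        }
        spin_unlock(&sbinfo->shrinklist_lock);

        /* ... then walk the local list.  list_del_init() ends with
         * INIT_LIST_HEAD(&info->shrinklist), i.e. ->next and ->prev
         * are rewritten as two separate stores. */
        list_for_each_safe(pos, next, &list) {
                info = list_entry(pos, struct shmem_inode_info, shrinklist);
                /* shrink/split work on the inode ... */
                list_del_init(&info->shrinklist);
        }

        /* shmem_setattr(), possibly in parallel: re-add the entry. */
        spin_lock(&sbinfo->shrinklist_lock);
        if (list_empty(&info->shrinklist)) {
                list_add_tail(&info->shrinklist, &sbinfo->shrinklist);
                sbinfo->shrinklist_len++;
        }
        spin_unlock(&sbinfo->shrinklist_lock);

And the two helpers being compared, roughly as they read in
include/linux/list.h (exact details vary a little between kernel
versions):

        static inline int list_empty(const struct list_head *head)
        {
                return READ_ONCE(head->next) == head;
        }

        /* Checks both pointers; documented as safe without locking only
         * when the racing operation on the entry is list_del_init(). */
        static inline int list_empty_careful(const struct list_head *head)
        {
                struct list_head *next = head->next;
                return (next == head) && (next == head->prev);
        }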