On Thu, Feb 11, 2021 at 10:46 PM Theodore Ts'o <ty...@mit.edu> wrote:
>
> On Thu, Feb 11, 2021 at 12:47:18PM +0100, Dmitry Vyukov wrote:
> > > This actually looks problematic: We acquired &ei->i_data_sem/2 (i.e.,
> > > the I_DATA_SEM_QUOTA subclass) in ext4_map_blocks() called from
> > > ext4_block_write_begin(). This suggests that the write has been happening
> > > directly to the quota file (or that the lockdep annotation of the inode
> > > went wrong somewhere). Now we normally protect quota files with the
> > > IMMUTABLE flag, so writing to them should not be possible. We also don't
> > > allow clearing this flag on a quota file in use. Finally, I checked the
> > > lockdep annotation and everything looks correct. So at this point the
> > > best theory I have is that a filesystem has been suitably corrupted and
> > > a quota file supposed to be inaccessible from userspace got exposed, but
> > > I'd expect other problems to hit first in that case. Anyway, without a
> > > reproducer I have no more ideas...
> >
> > There is a reproducer for 4.19 available on the dashboard. Maybe it will
> > help.
> > I don't know why it has not popped up upstream yet; there are lots of
> > potential reasons for this.
>
> The 4.19 version of the syzbot report has a very different stack
> trace. Instead of being related to an apparent write to the quota
> file, it is apparently caused by a call to rmdir:
>
> dump_stack+0x22c/0x33e lib/dump_stack.c:118
> print_circular_bug.constprop.0.cold+0x2d7/0x41e kernel/locking/lockdep.c:1221
> ...
> __mutex_lock+0xd7/0x13f0 kernel/locking/mutex.c:1072
> dquot_commit+0x4d/0x400 fs/quota/dquot.c:469
> ext4_write_dquot+0x1f2/0x2a0 fs/ext4/super.c:5644
> ...
> ext4_evict_inode+0x933/0x1830 fs/ext4/inode.c:298
> evict+0x2ed/0x780 fs/inode.c:559
> iput_final fs/inode.c:1555 [inline]
> ...
> vfs_rmdir fs/namei.c:3865 [inline]
> do_rmdir+0x3af/0x420 fs/namei.c:3943
> __do_sys_unlinkat fs/namei.c:4105 [inline]
> __se_sys_unlinkat fs/namei.c:4099 [inline]
> __x64_sys_unlinkat+0xdf/0x120 fs/namei.c:4099
> do_syscall_64+0xf9/0x670 arch/x86/entry/common.c:293
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> Which leads me to another apparent contradiction. Looking at the C
> reproducer source code, and running the C reproducer under "strace
> -ff", there is never any attempt to run rmdir() on the corrupted file
> system that is mounted -- neither as observed by my running the C
> reproducer, nor by looking at the C reproducer source code.
>
> Looking at the code, I did see a number of things which seemed to be
> bugs: procid never gets incremented, so all of the threads only
> operate on /dev/loop0, and each call to the execute() function tries
> to set up two file systems on /dev/loop0. So each thread that runs
> creates a temp file, binds it to /dev/loop0, then creates another
> temp file, tries to bind it to /dev/loop0 (which will fail), and
> tries to mount /dev/loop0 (again) on the same mount point (which will
> succeed).
>
> I'm not sure if this is just some insanity that was consed up by the
> fuzzer... or whether this was an unfaithful translation of the syzbot
> repro to C. Am I correct in understanding that when syzbot is
> running, it uses the syzbot repro, and not the C repro?
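For reference, the double setup described above amounts to roughly the
following sequence. This is a minimal sketch, assuming the standard
loop-device ioctls; the image path, mount point, and filesystem type
are illustrative placeholders, not the reproducer's exact values:

    /* Sketch of one execute() pass: bind an image to /dev/loop0 and
     * mount it; paths and fs type below are illustrative assumptions. */
    #include <fcntl.h>
    #include <linux/loop.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <sys/mount.h>
    #include <unistd.h>

    static void setup_once(const char *image, const char *mnt)
    {
            int img = open(image, O_RDWR);
            int loop = open("/dev/loop0", O_RDWR);

            /* Bind the image file to /dev/loop0. On the second pass the
             * device is already bound, so this fails with EBUSY. */
            if (ioctl(loop, LOOP_SET_FD, img))
                    perror("LOOP_SET_FD");

            /* Mounting /dev/loop0 again on the same mount point still
             * succeeds; it simply stacks a second mount on top. */
            if (mount("/dev/loop0", mnt, "ext4", 0, NULL))
                    perror("mount");

            close(loop);
            close(img);
    }

So the second temp file is effectively dead weight: only the first
image ever reaches /dev/loop0, and the same (corrupted) filesystem
gets mounted twice.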
Hi Ted,

The 4.19 reproducer may reproduce something else; you know better. I
just want to answer the points regarding syzkaller reproducers. FTR,
the 4.19 report/reproducer is here:
https://syzkaller.appspot.com/bug?id=b6cacc9fa48fea07154b8797236727de981c1e02

> there is never any attempt to run rmdir() on the corrupted file system that
> is mounted.

A recursive rmdir happens implicitly as part of test cleanup; you can
see the rmdir call in the remove_dir function in the C reproducer:
https://syzkaller.appspot.com/text?tag=ReproC&x=12caea37900000

> procid never gets incremented, so all of the threads only operate on
> /dev/loop0

This is intentional. procid is supposed to "isolate" parallel test
processes (if any). This reproducer does not use parallel test
processes, so procid has a constant value.

> Am I correct in understanding that when syzbot is running, it uses the syzbot
> repro, and not the C repro?

It tries both. It first tries to interpret the "syzkaller program" as
was done when the bug was triggered during fuzzing. But then it tries
to convert it to a corresponding stand-alone C program and confirms
that it still triggers the bug. If it provides a C reproducer, it
means that it did trigger the bug using this exact C program on a
freshly booted kernel (and the provided kernel oops is the
corresponding oops obtained with this exact program). If it fails to
reproduce the bug with the C reproducer, it provides only the
"syzkaller program" so as not to mislead developers.
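The implicit cleanup mentioned above is a depth-first traversal that
ends in rmdir() on each directory. A minimal sketch of that shape
(simplified; the generated reproducer's remove_dir also handles
errors, retries, and busy mounts):

    /* Depth-first removal of a test directory tree; this is where the
     * directory removal seen as do_rmdir in the 4.19 trace comes from. */
    #include <dirent.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    static void remove_dir_sketch(const char *dir)
    {
            DIR *dp = opendir(dir);
            struct dirent *ent;

            if (!dp)
                    return;
            while ((ent = readdir(dp))) {
                    char path[1024];

                    if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
                            continue;
                    snprintf(path, sizeof(path), "%s/%s", dir, ent->d_name);
                    if (ent->d_type == DT_DIR)
                            remove_dir_sketch(path); /* descend into subdirs */
                    else
                            unlink(path); /* regular files, symlinks, ... */
            }
            closedir(dp);
            rmdir(dir); /* removes the mount-point directory itself */
    }

Since the cleanup walks into the temp directory that contains the
mount point, the rmdir() hits the mounted (corrupted) filesystem even
though the test body itself never calls rmdir.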