Hi,

Unfortunately, the patch does not seem to solve the problem. On
Ubuntu 18.04 LTS with kernel 4.15.0-64, the crash still occurs with the
same signature:

Oct 14 08:00:01 uzorldsp01 kernel: [109699.415372] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
Oct 14 08:00:01 uzorldsp01 kernel: [109699.420042] IP: smb2_push_mandatory_locks+0x10d/0x3c0 [cifs]
Oct 14 08:00:01 uzorldsp01 kernel: [109699.420042] PGD 0 P4D 0
Oct 14 08:00:01 uzorldsp01 kernel: [109699.420042] Oops: 0000 [#1] SMP PTI
Oct 14 08:00:01 uzorldsp01 kernel: [109699.420042] Modules linked in: btrfs zstd_compress xor raid6_pq ufs qnx4 minix ntfs msdos jfs xfs dm_snapshot dm_bufio cmac arc4 md4 nls_utf8 cifs ccm fscache nf_conntrack_ipv4 nf_defrag_ipv4 xt_owner xt_conntrack nf_conntrack libcrc32c iptable_security sb_edac kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_rapl_perf input_leds serio_raw hyperv_fb hv_balloon joydev mac_hid sch_fq_codel nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 hid_generic hid_hyperv hv_utils hv_storvsc ptp hyperv_keyboard hv_netvsc hid scsi_transport_fc pps_core psmouse i2c_piix4 pata_acpi hv_vmbus floppy
Oct 14 08:00:01 uzorldsp01 kernel: [109699.472261] CPU: 0 PID: 50766 Comm: kworker/0:0 Not tainted 4.15.0-64-generic #73-Ubuntu
Oct 14 08:00:01 uzorldsp01 kernel: [109699.472261] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007  06/02/2017
Oct 14 08:00:01 uzorldsp01 kernel: [109699.472261] Workqueue: cifsoplockd cifs_oplock_break [cifs]
Oct 14 08:00:01 uzorldsp01 kernel: [109699.472261] RIP: 0010:smb2_push_mandatory_locks+0x10d/0x3c0 [cifs]
Oct 14 08:00:01 uzorldsp01 kernel: [109699.472261] RSP: 0000:ffffab360da2bdd8 EFLAGS: 00010246
Oct 14 08:00:01 uzorldsp01 kernel: [109699.472261] RAX: 0000000000000000 RBX: ffff9887646617d8 RCX: 0000000000000000
Oct 14 08:00:01 uzorldsp01 kernel: [109699.472261] RDX: 0000000000001000 RSI: 0000000000000000 RDI: ffff98876d006b80
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] RBP: ffffab360da2be28 R08: ffff988568596000 R09: ffff98876d006b80
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] R10: ffffab360da2bd98 R11: ffff988568596000 R12: 00000000000000aa
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] R13: ffff9887646617d8 R14: ffff9887646617c0 R15: ffff988764d7f200
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] FS:  0000000000000000(0000) GS:ffff98876d600000(0000) knlGS:0000000000000000
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] CR2: 0000000000000038 CR3: 000000046d80a003 CR4: 00000000001606f0
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] Call Trace:
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261]  cifs_oplock_break+0x131/0x410 [cifs]
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261]  process_one_work+0x1de/0x420
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261]  worker_thread+0x32/0x410
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261]  kthread+0x121/0x140
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261]  ? process_one_work+0x420/0x420
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261]  ? kthread_create_worker_on_cpu+0x70/0x70
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261]  ret_from_fork+0x35/0x40
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] Code: c8 00 00 00 00 48 89 45 b0 49 39 c6 0f 84 e5 00 00 00 4d 89 fb 4d 8b 7e 10 49 8b 5e 18 4d 8d 6e 18 49 8b 87 90 00 00 00 4c 39 eb <48> 8b 40 38 48 89 45 d0 0f 84 ae 00 00 00 45 31 d2 4c 89 75 b8
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] RIP: smb2_push_mandatory_locks+0x10d/0x3c0 [cifs] RSP: ffffab360da2bdd8
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] CR2: 0000000000000038
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] ---[ end trace ee6628b4e2b5174b ]---

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1795659

Title:
  kernel panic using CIFS share in smb2_push_mandatory_locks()

Status in linux package in Ubuntu:
  Confirmed
Status in linux-azure package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Invalid
Status in linux-azure source package in Xenial:
  Fix Released
Status in linux source package in Bionic:
  Fix Released
Status in linux-azure source package in Bionic:
  Invalid
Status in linux source package in Cosmic:
  Won't Fix
Status in linux-azure source package in Cosmic:
  Invalid
Status in linux source package in Disco:
  Fix Released
Status in linux-azure source package in Disco:
  Invalid

Bug description:
  [Impact]

  * We got reports of a kernel crash in the cifs module with the following
  signature:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
  IP: smb2_push_mandatory_locks+0x10e/0x3b0 [cifs]
  PGD 0 P4D 0
  RIP: 0010:smb2_push_mandatory_locks+0x10e/0x3b0 [cifs]
  Call Trace:
   cifs_oplock_break+0x12f/0x3d0 [cifs]
   process_one_work+0x14d/0x410
   worker_thread+0x4b/0x460
   kthread+0x105/0x140
  [...]

  * Low-level analysis (decodecode script output and the objdump of the
  function) revealed that we are crashing on a NULL pointer dereference:
  "cfile->tlink" is NULL when the code tries to read through it; below,
  a snippet of the objdump of function smb2_push_mandatory_locks():

  [...]
  mov    0x10(%r14),%r15   # %r15 = cifsFileInfo *cfile
  mov    0x18(%r14),%rbx   # %rbx = cifsLockInfo *li = (fdlocks->locks)
  lea    0x18(%r14),%r12
  mov    0x90(%r15),%rax   # %rax = struct tcon_link *tlink (cfile->tlink)
  cmp    %r12,%rbx
  mov    0x38(%rax),%rax   # <--- TRAP [trying to get cifs_tcon *tl_tcon]
  [...]
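
  The fault address can be cross-checked against the objdump above: with
  %rax holding a NULL cfile->tlink, the load at 0x38(%rax) touches address
  0x38, which matches CR2 in the oops. A minimal Python sketch of that
  arithmetic follows; the TconLink layout is a hypothetical stand-in (opaque
  padding followed by tl_tcon at the 0x38 offset taken from the objdump),
  not the real struct tcon_link definition.

```python
import ctypes


# Hypothetical stand-in for struct tcon_link: only the offset of tl_tcon
# (0x38, taken from the objdump in the bug description) is modeled; the
# fields before it are replaced by opaque padding.
class TconLink(ctypes.Structure):
    _fields_ = [
        ("_opaque", ctypes.c_ubyte * 0x38),
        ("tl_tcon", ctypes.c_void_p),
    ]


def fault_address(base_pointer: int, field_offset: int) -> int:
    """Address touched when loading a field through base_pointer."""
    return base_pointer + field_offset


# cfile->tlink was NULL (RAX is 0 in the register dump), so the load of
# tlink->tl_tcon faults at the bare field offset.
tlink = 0x0
cr2 = fault_address(tlink, TconLink.tl_tcon.offset)
print(hex(cr2))
```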

  * After we discussed the issue with the CIFS maintainers (Steve French
  and Pavel Shilovsky), they suggested commit b98749cac4a6 ("CIFS: keep
  FileInfo handle live during oplock break")
  [http://git.kernel.org/linus/b98749cac4a6] as a fix for multiple
  reports of this kind of crash.

  * The fix was sent to the stable kernels and is present in Ubuntu kernels
  5.0 and newer. We are requesting the SRU for this patch here in order
  to fix the crashes, given the reports of successful testing with the patch
  (see the test case section below) and since the patch is restricted to the
  cifs module scope and was accepted in linux stable.

  * Alternatively, the issue is known to be avoided when oplocks are
  disabled with the "cifs.enable_oplocks=N" module parameter.
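
  To confirm whether the workaround is active on a given system, the
  current value of the parameter can be read back from sysfs. The sketch
  below assumes the cifs module is loaded and exposes the parameter at
  /sys/module/cifs/parameters/enable_oplocks, printing it as Y or N; the
  helper name oplocks_enabled is ours, not part of any existing tool.

```python
from pathlib import Path

# Assumed sysfs location of the parameter once the cifs module is loaded.
PARAM_PATH = Path("/sys/module/cifs/parameters/enable_oplocks")


def oplocks_enabled(param_path: Path = PARAM_PATH):
    """Return True/False for Y/N, or None if the parameter is not readable
    (for example, because the cifs module is not loaded)."""
    try:
        value = param_path.read_text().strip()
    except OSError:
        return None
    if value in ("Y", "1"):
        return True
    if value in ("N", "0"):
        return False
    return None


if __name__ == "__main__":
    state = oplocks_enabled()
    if state is None:
        print("enable_oplocks not readable (is the cifs module loaded?)")
    elif state:
        print("oplocks enabled")
    else:
        print("oplocks disabled (workaround active)")
```

  A return value of None only means the file could not be read or parsed;
  it says nothing about whether the running kernel carries the fix.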

  [Test case]

  * Unfortunately, we cannot reproduce the issue. The patch proposed here was
  validated by us with xfstests (instructions followed from
  https://wiki.samba.org/index.php/Xfstesting-cifs) and fio. Also, we
  have a user report of test validation using LISA
  (https://github.com/LIS/LISAv2).

  * Using xfstests with the exclusions proposed in the link above, we
  got the same results as with a non-patched kernel, i.e., the same
  tests failed in both kernels and the patch did not make the results
  worse. Fio also didn't show a noticeable performance regression with
  the patch.

  [Regression potential]

  * The patch was validated by the cifs filesystem maintainers (in fact
  they suggested its inclusion in Ubuntu) and by the aforementioned
  tests; also, the scope is restricted to cifs only so the likelihood of
  regressions is considered low.

  * Due to the nature of the code modification (taking a new reference on
  a file handle and manipulating it in different places), I consider that
  if we have a regression it'll manifest as deadlock/blocked tasks, not
  something more serious like crashes or data corruption.
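
  The general idea of holding a reference across deferred work, which the
  commit title describes as keeping the handle live during the oplock
  break, can be illustrated with a toy model. This Python sketch is only
  an analogy: FileHandle, get, put and oplock_break_work are hypothetical
  names and do not mirror the kernel's cifsFileInfo code.

```python
class FileHandle:
    """Toy refcounted handle; 'tlink' is cleared when the last ref goes."""

    def __init__(self):
        self.refcount = 1          # reference held by the opener
        self.tlink = "tlink"       # stands in for cfile->tlink

    def get(self):
        self.refcount += 1
        return self

    def put(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.tlink = None      # teardown: later users would see NULL


def oplock_break_work(handle):
    """Deferred work: uses the handle, then drops the reference it owns."""
    seen = handle.tlink
    handle.put()
    return seen


# Without an extra reference: close runs first and tears the handle down.
h = FileHandle()
h.put()                            # close() drops the only reference
unsafe = h.tlink                   # what unreferenced work would observe

# With an extra reference taken when the work is queued.
h = FileHandle()
queued = h.get()                   # reference owned by the queued work
h.put()                            # close() runs first
safe = oplock_break_work(queued)   # handle is still live here

print(unsafe, safe)
```

  In such a scheme a missed put() leaves the count above zero and the
  handle is never torn down, which is consistent with the expectation above
  that a regression would show up as stuck tasks rather than as a crash.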
