I currently have a user running the test kernel (ppa:slashd/sf185584) in both of his environments (VMware 5.5 and VMware 6.5), to see if it fixes the situation on 6.5 with v3.13 guests and to make sure everything still looks good on VMware 5.5 (to avoid a potential regression on versions < 6.5).
https://bugs.launchpad.net/bugs/1780470

Title: BUG: scheduling while atomic (Kernel: Ubuntu-3.13 + VMware 6.0 and later)

Status in linux package in Ubuntu: Fix Released
Status in linux source package in Trusty: In Progress

Bug description:

[Impact]

It has been brought to my attention that VMware guests[1] randomly crash after moving the VMs from a VMware 5.5 environment to a VMware 6.5 environment.

Notes:
* The crashes weren't present on VMware 5.5 (with the same VMs); they only started to happen on VMware 6.5.
* The Trusty HWE kernel (Ubuntu-4.4.0-X) doesn't exhibit the problem on VMware 6.5.

Here's the stack trace taken from the .vmss, converted to be readable by a Linux debugger:

[17007961.187411] BUG: scheduling while atomic: swapper/3/0/0x00000100
[17007961.189794] Modules linked in: arc4 md4 nls_utf8 cifs nfsv3 nfs_acl nfsv4 nfs lockd sunrpc fscache veth ipt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack bridge stp llc aufs vmw_vsock_vmci_transport vsock ppdev vmwgfx serio_raw coretemp ttm drm vmw_balloon vmw_vmci shpchp i2c_piix4 parport_pc mac_hid xfs lp libcrc32c parport psmouse floppy vmw_pvscsi vmxnet3 pata_acpi
[17007961.189856] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.13.0-135-generic #184-Ubuntu
[17007961.189862] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 10/22/2013
[17007961.189867] 0000000000000000 ffff88042f263b90 ffffffff8172d959 ffff88042f263d30
[17007961.189874] ffff88042f273180 ffff88042f263ba0 ffffffff81726d8c ffff88042f263c00
[17007961.189879] ffffffff81731c8f ffff880428c29800 0000000000013180 ffff880428c25fd8
[17007961.189885] Call Trace:
[17007961.189889] <IRQ> [<ffffffff8172d959>] dump_stack+0x64/0x82
[17007961.189913] [<ffffffff81726d8c>] __schedule_bug+0x4c/0x5a
[17007961.189922] [<ffffffff81731c8f>] __schedule+0x6af/0x7f0
[17007961.189929] [<ffffffff81731df9>] schedule+0x29/0x70
[17007961.189935] [<ffffffff81731049>] schedule_timeout+0x279/0x310
[17007961.189947] [<ffffffff810a357b>] ? select_task_rq_fair+0x56b/0x6f0
[17007961.189955] [<ffffffff810a9852>] ? enqueue_task_fair+0x422/0x6d0
[17007961.189962] [<ffffffff810a0de5>] ? sched_clock_cpu+0xb5/0x100
[17007961.189971] [<ffffffff81732906>] wait_for_completion+0xa6/0x150
[17007961.189977] [<ffffffff8109e2a0>] ? wake_up_state+0x20/0x20
[17007961.189987] [<ffffffff810ccce0>] ? __call_rcu+0x2d0/0x2d0
[17007961.189993] [<ffffffff810ca2eb>] wait_rcu_gp+0x4b/0x60
[17007961.189999] [<ffffffff810ca280>] ? ftrace_raw_output_rcu_utilization+0x50/0x50
[17007961.190006] [<ffffffff810cc45a>] synchronize_sched+0x3a/0x50
[17007961.190047] [<ffffffffa01a8936>] vmci_event_unsubscribe+0x76/0xb0 [vmw_vmci]
[17007961.190063] [<ffffffffa01895f1>] vmci_transport_destruct+0x21/0xe0 [vmw_vsock_vmci_transport]
[17007961.190078] [<ffffffffa017f837>] vsock_sk_destruct+0x17/0x60 [vsock]
[17007961.190087] [<ffffffff8161a9df>] __sk_free+0x1f/0x180
[17007961.190092] [<ffffffff8161ab59>] sk_free+0x19/0x20
[17007961.190102] [<ffffffffa018a2c0>] vmci_transport_recv_stream_cb+0x200/0x2f0 [vmw_vsock_vmci_transport]
[17007961.190114] [<ffffffffa01a7efc>] vmci_datagram_invoke_guest_handler+0xbc/0xf0 [vmw_vmci]
[17007961.190126] [<ffffffffa01a8dbf>] vmci_dispatch_dgs+0xcf/0x230 [vmw_vmci]
[17007961.190138] [<ffffffff8106f8ee>] tasklet_action+0x11e/0x130
[17007961.190145] [<ffffffff8106fd8c>] __do_softirq+0xfc/0x310
[17007961.190153] [<ffffffff81070315>] irq_exit+0x105/0x110
[17007961.190161] [<ffffffff817407e6>] do_IRQ+0x56/0xc0
[17007961.190170] [<ffffffff81735e6d>] common_interrupt+0x6d/0x6d
[17007961.190173] <EOI> [<ffffffff81051586>] ? native_safe_halt+0x6/0x10
[17007961.190190] [<ffffffff8101db7f>] default_idle+0x1f/0x100
[17007961.190197] [<ffffffff8101e496>] arch_cpu_idle+0x26/0x30
[17007961.190205] [<ffffffff810c2b91>] cpu_startup_entry+0xc1/0x2b0
[17007961.190214] [<ffffffff810427fd>] start_secondary+0x21d/0x2d0
[17007961.190221] bad: scheduling from the idle thread!

[Other info]

I have identified an upstream patch series[2] which seems to fix exactly this situation. A full discussion can be found on patchwork[3], pointing at the patch series[2] authored by VMware.

[1] VM details:
Release: Trusty
Kernel: Ubuntu-3.13.0-135

[2] Upstream patch series:
8ab18d7 VSOCK: Detach QP check should filter out non matching QPs.
8566b86 VSOCK: Fix lockdep issue.
4ef7ea9 VSOCK: sock_put wasn't safe to call in interrupt context

----------------
commit 4ef7ea9
Author: Jorgen Hansen <jhan...@vmware.com>
Date: Wed Oct 21 04:53:56 2015 -0700

    VSOCK: sock_put wasn't safe to call in interrupt context
    ...
    Multiple customers have been hitting this issue when using VMware tools on vSphere 2015.
    ...
----------------
vSphere 2015 == VMware 6.0 (released in 2015) and later.

[3] https://patchwork.kernel.org/patch/9948741/
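For anyone reading the trace above: the socket destruct path reached from the vmw_vmci tasklet (vmci_dispatch_dgs -> vmci_transport_recv_stream_cb -> sk_free -> ... -> vmci_event_unsubscribe) ends in synchronize_sched(), which can sleep, and sleeping is not allowed in softirq/atomic context, hence the "scheduling while atomic" BUG. Below is a minimal, hypothetical sketch (against the 3.13-era tasklet API, not the actual vmw_vmci code) that triggers the same class of BUG, just to illustrate the mechanism:

#include <linux/module.h>
#include <linux/interrupt.h>
#include <linux/rcupdate.h>

/* Hypothetical demo module: calling a sleeping RCU primitive from a
 * tasklet (softirq/atomic context) provokes the same
 * "BUG: scheduling while atomic" seen in the trace above. */
static void bad_tasklet_fn(unsigned long data)
{
	/* synchronize_sched() blocks waiting for a grace period;
	 * blocking here, inside a softirq, is exactly what
	 * __schedule_bug() complains about. */
	synchronize_sched();
}

static DECLARE_TASKLET(bad_tasklet, bad_tasklet_fn, 0);

static int __init demo_init(void)
{
	tasklet_schedule(&bad_tasklet);
	return 0;
}

static void __exit demo_exit(void)
{
	tasklet_kill(&bad_tasklet);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");

If I read the commit message for 4ef7ea9 correctly, the upstream series addresses exactly this: it avoids reaching the sleeping vsock/VMCI cleanup through sock_put() from interrupt context.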