On 2017-02-09 18:30, Roger Pau Monné wrote:
On Mon, Feb 06, 2017 at 12:31:20AM +0100, Håkon Alstadheim wrote:
I get the BUG below in dom0 when trying to start a Windows 10 domU (HVM,
with some PV drivers installed). Below is "xl info", then the dmesg
output, and finally the domU config attached at the end.
This domain is started very rarely, so it may have been broken for some
time. All my other domains are Linux. This message is just a data point
for whoever is interested; I can provide more data if anybody wants to
ask me anything. NOT expecting a quick resolution of this :-/ .
The domain boots part of the way, the screen resolution gets changed, and
then it keeps spinning for ~5 seconds before stopping.
[...]
[339809.663061] br0: port 12(vif7.0) entered blocking state
[339809.663063] br0: port 12(vif7.0) entered disabled state
[339809.663123] device vif7.0 entered promiscuous mode
[339809.664885] IPv6: ADDRCONF(NETDEV_UP): vif7.0: link is not ready
[339809.742522] br0: port 13(vif7.0-emu) entered blocking state
[339809.742523] br0: port 13(vif7.0-emu) entered disabled state
[339809.742573] device vif7.0-emu entered promiscuous mode
[339809.744386] br0: port 13(vif7.0-emu) entered blocking state
[339809.744388] br0: port 13(vif7.0-emu) entered forwarding state
[339864.059095] xen-blkback: backend/vbd/7/768: prepare for reconnect
[339864.138002] xen-blkback: backend/vbd/7/768: using 1 queues, protocol 1 (x86_64-abi)
[339864.241039] xen-blkback: backend/vbd/7/832: prepare for reconnect
[339864.337997] xen-blkback: backend/vbd/7/832: using 1 queues, protocol 1 (x86_64-abi)
[339875.245306] vif vif-7-0 vif7.0: Guest Rx ready
[339875.245345] IPv6: ADDRCONF(NETDEV_CHANGE): vif7.0: link becomes ready
[339875.245391] br0: port 12(vif7.0) entered blocking state
[339875.245395] br0: port 12(vif7.0) entered forwarding state
[339894.122151] ------------[ cut here ]------------
[339894.122169] kernel BUG at block/bio.c:1786!
[339894.122173] invalid opcode: 0000 [#1] SMP
[339894.122176] Modules linked in: xt_physdev iptable_filter ip_tables
x_tables nfsd auth_rpcgss oid_registry nfsv4 dns_resolver nfsv3 nfs_acl
binfmt_misc intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp
crc32c_intel pcspkr serio_raw i2c_i801 i2c_smbus iTCO_wdt
iTCO_vendor_support amdgpu drm_kms_helper syscopyarea bcache input_leds
sysfillrect sysimgblt fb_sys_fops ttm drm uas shpchp ipmi_ssif rtc_cmos
acpi_power_meter wmi tun snd_hda_codec_realtek snd_hda_codec_generic
snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd
usbip_host usbip_core pktcdvd tmem lpc_ich xen_wdt nct6775 hwmon_vid
dm_zero dm_thin_pool dm_persistent_data dm_bio_prison dm_service_time
dm_round_robin dm_queue_length dm_multipath dm_log_userspace cn
virtio_pci virtio_scsi virtio_blk virtio_console virtio_balloon
[339894.122233] xts gf128mul aes_x86_64 cbc sha512_generic
sha256_generic sha1_generic libiscsi scsi_transport_iscsi virtio_net
virtio_ring virtio tg3 libphy e1000 fuse overlay nfs lockd grace sunrpc
jfs multipath linear raid10 raid1 raid0 dm_raid raid456
async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq
dm_snapshot dm_bufio dm_crypt dm_mirror dm_region_hash dm_log dm_mod
hid_sunplus hid_sony hid_samsung hid_pl hid_petalynx hid_monterey
hid_microsoft hid_logitech ff_memless hid_gyration hid_ezkey hid_cypress
hid_chicony hid_cherry hid_a4tech sl811_hcd xhci_plat_hcd ohci_pci
ohci_hcd uhci_hcd aic94xx lpfc qla2xxx aacraid sx8 DAC960 hpsa cciss
3w_9xxx 3w_xxxx mptsas mptfc scsi_transport_fc mptspi mptscsih mptbase
atp870u dc395x qla1280 imm parport dmx3191d sym53c8xx gdth initio BusLogic
[339894.122325] arcmsr aic7xxx aic79xx sg pdc_adma sata_inic162x
sata_mv sata_qstor sata_vsc sata_uli sata_sis sata_sx4 sata_nv sata_via
sata_svw sata_sil24 sata_sil sata_promise pata_sis usbhid led_class igb
ptp dca i2c_algo_bit ehci_pci ehci_hcd xhci_pci megaraid_sas xhci_hcd
[339894.122350] CPU: 3 PID: 23514 Comm: 7.hda-0 Tainted: G W 4.9.8-gentoo #1
[339894.122353] Hardware name: ASUSTeK COMPUTER INC. Z10PE-D8 WS/Z10PE-D8 WS, BIOS 3304 06/22/2016
[339894.122358] task: ffff880244b55b00 task.stack: ffffc90042fcc000
[339894.122361] RIP: e030:[<ffffffff813c6af7>] [<ffffffff813c6af7>] bio_split+0x9/0x89
[339894.122370] RSP: e02b:ffffc90042fcfb18 EFLAGS: 00010246
[339894.122373] RAX: 00000000000000a8 RBX: ffff8802433ee900 RCX: ffff88023f537080
[339894.122377] RDX: 0000000002400000 RSI: 0000000000000000 RDI: ffff8801fc8b7890
[339894.122380] RBP: ffffc90042fcfba8 R08: 0000000000000000 R09: 00000000000052da
[339894.122383] R10: 0000000000000002 R11: 0005803fffffffff R12: ffff8801fc8b7890
[339894.122387] R13: 00000000000000a8 R14: ffffc90042fcfbb8 R15: 0000000000000000
[339894.122394] FS: 0000000000000000(0000) GS:ffff8802498c0000(0000) knlGS:ffff8802498c0000
[339894.122398] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[339894.122401] CR2: 00007f99b78e3349 CR3: 0000000216d43000 CR4: 0000000000042660
[339894.122405] Stack:
[339894.122407] ffffffff813d1bce 0000000000000002 ffffc90042fcfb50
ffff88023f537080
[339894.122413] 0000000000000002 0000000100000000 0000000000000000
0000000100000000
[339894.122419] 0000000000000000 000000000d2ee022 0000000200006fec
0000000000000000
[339894.122424] Call Trace:
[339894.122429] [<ffffffff813d1bce>] ? blk_queue_split+0x448/0x48b
[339894.122435] [<ffffffff813cd7f3>] blk_queue_bio+0x44/0x289
[339894.122439] [<ffffffff813cc226>] generic_make_request+0xbd/0x160
[339894.122443] [<ffffffff813cc3c9>] submit_bio+0x100/0x11d
[339894.122446] [<ffffffff813d2b8a>] ? next_bio+0x1d/0x40
[339894.122450] [<ffffffff813c4d10>] submit_bio_wait+0x4e/0x62
[339894.122454] [<ffffffff813d2df3>] blkdev_issue_discard+0x71/0xa9
[339894.122459] [<ffffffff81534fd4>] __do_block_io_op+0x4f0/0x579
[339894.122463] [<ffffffff81534fd4>] ? __do_block_io_op+0x4f0/0x579
[339894.122469] [<ffffffff81770005>] ? sha_transform+0xf47/0x1069
[339894.122474] [<ffffffff81535544>] xen_blkif_schedule+0x318/0x63c
[339894.122478] [<ffffffff81777498>] ? __schedule+0x32e/0x4e8
[339894.122484] [<ffffffff81088f9b>] ? wake_up_atomic_t+0x2c/0x2c
[339894.122488] [<ffffffff8153522c>] ? xen_blkif_be_int+0x2c/0x2c
[339894.122492] [<ffffffff810742aa>] kthread+0xa6/0xae
[339894.122496] [<ffffffff81074204>] ? init_completion+0x24/0x24
[339894.122501] [<ffffffff8177a335>] ret_from_fork+0x25/0x30
Are you using some kind of software RAID or similar backend for the disk images?
Yes, I'm using bcache on top of an LVM volume on top of md-raid (RAID-6)
on some SAS drives attached to an LSI MegaRAID 2008 in JBOD mode, all in
dom0; the bcache volume gets passed to the VM.
It looks like someone (not blkback) is trying to split a discard bio (or
maybe even a discard bio with 0 sectors), and that's causing a BUG to
trigger. TBH, I would expect blkdev_issue_discard to either ignore or
reject such requests, but it doesn't seem to do so (or at least I cannot
find where it does).
Could you try the patch below and report back what output you get?
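For reference, the check that fires at block/bio.c:1786 is one of the
sanity BUG_ONs at the top of bio_split(). The sketch below is
reconstructed from memory rather than copied from the exact 4.9.8 tree,
so treat the wording and line placement as approximate; a discard that
the splitting path reduces to zero sectors would trip the first check:

    struct bio *bio_split(struct bio *bio, int sectors,
                          gfp_t gfp, struct bio_set *bs)
    {
            struct bio *split;

            BUG_ON(sectors <= 0);                 /* nothing left to split off */
            BUG_ON(sectors >= bio_sectors(bio));  /* split must not cover the whole bio */

            /* ... the rest allocates and returns the front @sectors of @bio ... */
    }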
I have since found that disabling tmem seems to work around this issue,
as well as another issue, namely no YouTube playback on my Linux desktop
VM, which has its display on an AMD graphics card that gets passed
through.
I'll apply the patch and try to get it tested in the next couple of days.
Thanks, Roger.
Thank you for taking an interest :-) .
---8<---
diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 726c32e..1964e9c 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -1027,6 +1027,8 @@ static int dispatch_discard_io(struct xen_blkif_ring *ring,
 		 (req->u.discard.flag & BLKIF_DISCARD_SECURE)) ?
 		 BLKDEV_DISCARD_SECURE : 0;
 
+	pr_info("Sending discard, sector %llu nr %llu\n",
+		req->u.discard.sector_number, req->u.discard.nr_sectors);
 	err = blkdev_issue_discard(bdev, req->u.discard.sector_number,
 				   req->u.discard.nr_sectors,
 				   GFP_KERNEL, secure);
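If the pr_info output ends up showing zero-length (or otherwise
malformed) discard requests coming from the frontend, one conceivable
follow-up, purely a sketch and not part of the patch above, would be to
reject such requests in dispatch_discard_io() before they ever reach
blkdev_issue_discard(). The error-handling names used here (err,
fail_response) are assumed from the surrounding function:

    /* Hypothetical guard, for illustration only: refuse discards that
     * carry no sectors instead of letting the block layer try to split
     * an empty bio. */
    if (unlikely(req->u.discard.nr_sectors == 0)) {
            pr_warn("empty discard request from frontend, rejecting\n");
            err = -EINVAL;
            goto fail_response;
    }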
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel