Package: linux-doc-2.6.18
Severity: important

Hi!

The subject already says it: Debian kernel 2.6.18-3-xen-vserver-amd64 causes kernel oopses (and therefore probably data loss or broken systems) when using LVM snapshots with Xen on a dual-core amd64 machine.

More details:

The machine is an AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ system with 2 GB RAM and two identical 250 GB hard disks set up as software RAID1. One RAID1 partition is set up for LVM to host the disks for the Xen domUs. It runs Etch.

I'm using LVM snapshots to back up the running Xen domains. When removing the snapshots again, the kernel oopses like this:

Jan 11 11:36:40 wbs-euserv kernel: ----------- [cut here ] --------- [please bite here ] ---------
Jan 11 11:36:40 wbs-euserv kernel: Kernel BUG at mm/slab.c:595
Jan 11 11:36:40 wbs-euserv kernel: invalid opcode: 0000 [1] SMP
Jan 11 11:36:40 wbs-euserv kernel: CPU 0
Jan 11 11:36:40 wbs-euserv kernel: Modules linked in: xfrm4_mode_tunnel esp4 netloop tun ipv6 ipt_LOG ipt_iprange xt_physdev ipt_ULOG ipt_recent ipt_REJECT xt_tcpudp xt_state ip_conntrack nfnetlink iptable_filter iptable_mangle ip_tables x_tables bridge deflate zlib_deflate twofish serpent aes blowfish des sha256 sha1 crypto_null af_key ext3 jbd mbcache loop snd_mpu401 snd_mpu401_uart snd_rawmidi snd_seq_device snd analog irtty_sir i2c_nforce2 sir_dev parport_pc gameport soundcore psmouse irda floppy parport serial_core serio_raw crc_ccitt evdev i2c_core pcspkr xfs dm_mirror dm_snapshot dm_mod raid1 md_mod ide_generic sd_mod sata_nv libata scsi_mod ehci_hcd amd74xx forcedeth generic ide_core ohci_hcd fan
Jan 11 11:36:40 wbs-euserv kernel: Pid: 7579, comm: lvremove Not tainted 2.6.18-3-xen-vserver-amd64 #1
Jan 11 11:36:40 wbs-euserv kernel: RIP: e030:[<ffffffff80207119>] [<ffffffff80207119>] kmem_cache_free+0x58/0xca
Jan 11 11:36:40 wbs-euserv kernel: RSP: e02b:ffff88006f79fc68 EFLAGS: 00010202
Jan 11 11:36:40 wbs-euserv kernel: RAX: 0000000000000068 RBX: 0000000000000000 RCX: 0000000000000000
Jan 11 11:36:40 wbs-euserv kernel: RDX: ffff8800035b0e70 RSI: ffff880070c42588 RDI: ffff88007fc45840
Jan 11 11:36:40 wbs-euserv kernel: RBP: ffff880070c42588 R08: ffff88006f79e000 R09: 0000000000000000
Jan 11 11:36:40 wbs-euserv kernel: R10: ffff8800720ab200 R11: ffff88007386f880 R12: ffff88007fc45840
Jan 11 11:36:40 wbs-euserv kernel: R13: 0000000000003020 R14: 0000000000000302 R15: 0000000000000800
Jan 11 11:36:40 wbs-euserv kernel: FS: 00002ae26846bc80(0000) GS:ffffffff804d3000(0000) knlGS:0000000000000000
Jan 11 11:36:40 wbs-euserv kernel: CS: e033 DS: 0000 ES: 0000
Jan 11 11:36:40 wbs-euserv kernel: Process lvremove (pid: 7579[#0], threadinfo ffff88006f79e000, task ffff88007386f880)
Jan 11 11:36:40 wbs-euserv kernel: Stack: 0000000000000000 0000000000000000 ffff880072233ee8 ffffc200001ef020
Jan 11 11:36:40 wbs-euserv kernel: 0000000000003020 ffffffff880dc9d4 ffff88007fc45840 ffff880072233e80
Jan 11 11:36:40 wbs-euserv kernel: ffffc200001a4080 0000000000000000
Jan 11 11:36:40 wbs-euserv kernel: Call Trace:
Jan 11 11:36:40 wbs-euserv kernel: [<ffffffff880dc9d4>] :dm_snapshot:exit_exception_table+0x3c/0x67
Jan 11 11:36:40 wbs-euserv kernel: [<ffffffff880cef4a>] :dm_mod:dev_remove+0x0/0xb5
Jan 11 11:36:40 wbs-euserv kernel: [<ffffffff880dcaa5>] :dm_snapshot:snapshot_dtr+0xa6/0xe6
Jan 11 11:36:40 wbs-euserv kernel: [<ffffffff880cc856>] :dm_mod:dm_table_put+0x58/0xc5
Jan 11 11:36:40 wbs-euserv kernel: [<ffffffff880cb92a>] :dm_mod:dm_put+0x90/0x151
Jan 11 11:36:40 wbs-euserv kernel: [<ffffffff880cefec>] :dm_mod:dev_remove+0xa2/0xb5
Jan 11 11:36:40 wbs-euserv kernel: [<ffffffff880cf52d>] :dm_mod:ctl_ioctl+0x213/0x25e
Jan 11 11:36:40 wbs-euserv kernel: [<ffffffff802431fa>] do_ioctl+0x55/0x6b
Jan 11 11:36:40 wbs-euserv kernel: [<ffffffff80231e5a>] vfs_ioctl+0x364/0x38b
Jan 11 11:36:40 wbs-euserv kernel: [<ffffffff8024dacf>] sys_ioctl+0x59/0x78
Jan 11 11:36:40 wbs-euserv kernel: [<ffffffff8025ed5e>] system_call+0x86/0x8b
Jan 11 11:36:40 wbs-euserv kernel: [<ffffffff8025ecd8>] system_call+0x0/0x8b
Jan 11 11:36:40 wbs-euserv kernel:
Jan 11 11:36:40 wbs-euserv kernel:
Jan 11 11:36:40 wbs-euserv kernel: Code: 0f 0b 68 05 a6 41 80 c2 53 02 4c 39 62 28 74 0a 0f 0b 68 05
Jan 11 11:36:40 wbs-euserv kernel: RIP [<ffffffff80207119>] kmem_cache_free+0x58/0xca
Jan 11 11:36:40 wbs-euserv kernel: RSP <ffff88006f79fc68>

The funny thing is: this happens only on the AMD64 X2 machine. We have a second machine which is nearly identical to the one above: an AMD Athlon(tm) 64 Processor 3500+, 2 GB RAM, 2x 120 GB disks, software RAID1 + LVM. The second machine runs the same procedure for its Xen domains (snapshot, backup, remove snapshot), but it doesn't show *any* kernel oopses in its logfiles.

This makes me guess that it could have something to do with the dual-core CPU on the AMD64 X2 machine. (A very short test on a dual Xeon i386 machine didn't show any oopses, but that was not tested in depth.)

Regards,
Ingo

-- System Information:
Debian Release: 4.0
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-3-xen-vserver-amd64
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)