On Thu, Feb 08, 2007 at 05:11:32PM +0200, Sami Liedes wrote:
> XFS, but that triggers it easily and often. A fix was merged upstream
> in 2.6.18.6 ("[PATCH] dm crypt: Fix data corruption with dm-crypt over
> RAID5"), but is not apparently included in the Debian kernel (or at
> least I ran into this with a very similar backtrace). See:

Hmm, seems it (the entire 2.6.18.6) IS included in the Debian kernel. I
wonder which fix is missing then, or if the bug is still in the vanilla
kernel tree.

Here's the oops:

------------------------------------------------------------
Feb 8 04:43:08 lh kernel: Filesystem "dm-7": Disabling barriers, not supported by the underlying device
Feb 8 04:43:08 lh kernel: XFS mounting filesystem dm-7
Feb 8 04:43:08 lh kernel: Ending clean XFS mount for filesystem: dm-7
Feb 8 04:46:10 lh kernel: Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
Feb 8 04:46:10 lh kernel: [<ffffffff802a749a>] page_to_pfn+0x0/0x33
Feb 8 04:46:10 lh kernel: PGD 24a6c067 PUD 1da31067 PMD 0
Feb 8 04:46:10 lh kernel: Oops: 0000 [1] SMP
Feb 8 04:46:10 lh kernel: CPU 0
Feb 8 04:46:10 lh kernel: Modules linked in: sha256 aes dm_crypt snd_intel8x0 xfs ipt_owner ipt_REJECT xt_state xt_tcpudp iptable_filter ipt_MASQUERADE iptable_nat ip_nat ip_conntrack nfnetlink ip_tables x_tables radeon drm binfmt_misc freq_table ppdev lp button ac battery ipv6 nls_iso8859_1 nls_cp437 vfat fat ext2 it87 hwmon_vid i2c_isa eeprom usbmouse ide_cd cdrom tsdev snd_ac97_codec snd_ac97_bus snd_opl3_lib snd_pcm_oss snd_mixer_oss snd_hwdep snd_mpu401 snd_mpu401_uart i2c_nforce2 snd_rawmidi snd_seq_device analog i2c_core parport_pc parport snd_pcm snd_timer psmouse serio_raw snd snd_page_alloc gameport evdev floppy soundcore pcspkr ext3 jbd mbcache dm_mirror dm_snapshot dm_mod ide_generic sd_mod ide_disk sata_nv libata scsi_mod 3c59x mii forcedeth generic amd74xx ide_core ehci_hcd ohci_hcd thermal processor fan
Feb 8 04:46:10 lh kernel: Pid: 198, comm: pdflush Not tainted 2.6.18-4-amd64 #1
Feb 8 04:46:10 lh kernel: RIP: 0010:[<ffffffff802a749a>] [<ffffffff802a749a>] page_to_pfn+0x0/0x33
Feb 8 04:46:10 lh kernel: RSP: 0018:ffff81003e7e97d8 EFLAGS: 00010297
Feb 8 04:46:10 lh kernel: RAX: 0000000000000000 RBX: ffff81000bce2640 RCX: 0000000000000000
Feb 8 04:46:10 lh kernel: RDX: 0000000000000056 RSI: ffff81000bce2640 RDI: 0000000000000000
Feb 8 04:46:10 lh kernel: RBP: ffff81003b3c8000 R08: 0000000000000000 R09: ffff810037ade870
Feb 8 04:46:10 lh kernel: R10: 0000000000000000 R11: ffff81000c1a1ec0 R12: ffff81000bce2640
Feb 8 04:46:10 lh kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffff81003e8f8088
Feb 8 04:46:10 lh kernel: FS: 00002b4d40df3d20(0000) GS:ffffffff80521000(0000) knlGS:00000000f7b446c0
Feb 8 04:46:10 lh kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Feb 8 04:46:10 lh kernel: CR2: 0000000000000000 CR3: 000000001e0c6000 CR4: 00000000000006e0
Feb 8 04:46:10 lh kernel: Process pdflush (pid: 198, threadinfo ffff81003e7e8000, task ffff810037ade870)
Feb 8 04:46:10 lh kernel: Stack: ffffffff8022bf96 ffff810037ade870 000000000000d400 0000000000000000
Feb 8 04:46:10 lh kernel: ffff810000000001 0000000000000001 ffff81000bce2640 ffff81003e8f8088
Feb 8 04:46:10 lh kernel: ffff8100192517c0 ffff810007f997a8 0000000000000056 000000000002a000
Feb 8 04:46:10 lh kernel: Call Trace:
Feb 8 04:46:10 lh kernel: [<ffffffff8022bf96>] blk_recount_segments+0x7e/0x21b
Feb 8 04:46:10 lh kernel: [<ffffffff802bb9ae>] __bio_clone+0x71/0x8a
Feb 8 04:46:10 lh kernel: [<ffffffff802bb9fc>] bio_clone+0x35/0x3d
Feb 8 04:46:10 lh kernel: [<ffffffff8822776a>] :dm_crypt:crypt_map+0xcd/0x304
Feb 8 04:46:10 lh kernel: [<ffffffff880d92bf>] :dm_mod:__map_bio+0x47/0x9b
Feb 8 04:46:10 lh kernel: [<ffffffff880d9c1f>] :dm_mod:__split_bio+0x172/0x37d
Feb 8 04:46:10 lh kernel: [<ffffffff880da432>] :dm_mod:dm_request+0x101/0x110
Feb 8 04:46:10 lh kernel: [<ffffffff80219f55>] generic_make_request+0x13a/0x14d
Feb 8 04:46:10 lh kernel: [<ffffffff80231028>] submit_bio+0xcb/0xd2
Feb 8 04:46:10 lh kernel: [<ffffffff8022aaa5>] __bio_add_page+0x188/0x1ce
Feb 8 04:46:10 lh kernel: [<ffffffff883ccd8b>] :xfs:xfs_submit_ioend_bio+0x1e/0x27
Feb 8 04:46:10 lh kernel: [<ffffffff883cd7c3>] :xfs:xfs_page_state_convert+0xa2f/0xb6e
Feb 8 04:46:10 lh kernel: [<ffffffff883cdb30>] :xfs:xfs_vm_writepage+0xa7/0xdd
Feb 8 04:46:10 lh kernel: [<ffffffff8021ac61>] mpage_writepages+0x1a6/0x34d
Feb 8 04:46:10 lh kernel: [<ffffffff883cda89>] :xfs:xfs_vm_writepage+0x0/0xdd
Feb 8 04:46:10 lh kernel: [<ffffffff80256d07>] do_writepages+0x20/0x2f
Feb 8 04:46:10 lh kernel: [<ffffffff8022dbd7>] __writeback_single_inode+0x1b4/0x38b
Feb 8 04:46:10 lh kernel: [<ffffffff880d9a46>] :dm_mod:dm_any_congested+0x38/0x3f
Feb 8 04:46:10 lh kernel: [<ffffffff880db58a>] :dm_mod:dm_table_any_congested+0x46/0x63
Feb 8 04:46:10 lh kernel: [<ffffffff8021edb1>] sync_sb_inodes+0x1d1/0x2b5
Feb 8 04:46:10 lh kernel: [<ffffffff802901be>] keventd_create_kthread+0x0/0x61
Feb 8 04:46:10 lh kernel: [<ffffffff8024c991>] writeback_inodes+0x7d/0xd3
Feb 8 04:46:10 lh kernel: [<ffffffff802a894a>] background_writeout+0x82/0xb5
Feb 8 04:46:10 lh kernel: [<ffffffff8025242d>] pdflush+0x0/0x1ed
Feb 8 04:46:10 lh kernel: [<ffffffff80252570>] pdflush+0x143/0x1ed
Feb 8 04:46:10 lh kernel: [<ffffffff802a88c8>] background_writeout+0x0/0xb5
Feb 8 04:46:10 lh kernel: [<ffffffff8023055a>] kthread+0xd4/0x107
Feb 8 04:46:10 lh kernel: [<ffffffff80259360>] child_rip+0xa/0x12
Feb 8 04:46:10 lh kernel: [<ffffffff802901be>] keventd_create_kthread+0x0/0x61
Feb 8 04:46:10 lh kernel: [<ffffffff80230486>] kthread+0x0/0x107
Feb 8 04:46:10 lh kernel: [<ffffffff80259356>] child_rip+0x0/0x12
Feb 8 04:46:10 lh kernel:
Feb 8 04:46:10 lh kernel:
Feb 8 04:46:10 lh kernel: Code: 48 8b 07 48 c1 e8 3a 48 8b 14 c5 20 d0 52 80 48 b8 b7 6d db
Feb 8 04:46:10 lh kernel: RIP [<ffffffff802a749a>] page_to_pfn+0x0/0x33
Feb 8 04:46:10 lh kernel: RSP <ffff81003e7e97d8>
Feb 8 04:46:10 lh kernel: CR2: 0000000000000000
------------------------------------------------------------

This happened in the beginning of copying a large amount of data (/ and
/home) to an empty XFS filesystem in a dm-crypted EVMS partition.
Specifically, I have /dev/evms/XFS1-crypted, which is mapped rather
directly to a single hard disk: it resides in an LVM2 volume group that
spans sda5 and sda6, but XFS1-crypted itself lies entirely within the
sda5 area. From XFS1-crypted, a decrypted volume XFS1-decrypted has been
dm-crypt-mapped using "cryptsetup luksOpen". This was formatted with
mkfs.xfs and mounted, and I was copying data to it when it oopsed.

> 2. http://bugzilla.kernel.org/show_bug.cgi?id=7799
>
> (esp. the last comment:
> "Bug in dmcrypt. There's been several bugs in dmcrypt that
> only XFS has triggered and the last of these that I know about
> was fixed in 2.6.19.")

I wonder also which fix this refers to. The changes in the 2.6.19 branch
which are clearly bug fixes seem to have been fixes for low-memory
situations, which this was not.

I can try to reproduce and debug this if that's helpful; just tell me
what I can do to help (I have some, but not too much, kernel experience,
mainly some drivers for the 2.4 series).

	Sami
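
P.S. In case it helps with reading the trace: a minimal Python sketch
(purely illustrative, not existing kernel tooling) that pulls the frames
out of oops text in the format shown in this report, so the path from
xfs through dm_mod and dm_crypt into bio_clone is easier to see. The
names FRAME_RE, parse_frames and SAMPLE are made up for this example,
and labelling symbols without a module prefix as "kernel" is just an
assumption of the sketch.

```python
import re

# Matches one frame such as
#   [<ffffffff8822776a>] :dm_crypt:crypt_map+0xcd/0x304
# The ":module:" prefix is optional (symbols without it get no module).
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?::(?P<module>\w+):)?"
    r"(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(text):
    """Return a list of (module, symbol, offset) for each frame in text.

    Frames without a module prefix are labelled "kernel"; that label is
    an assumption of this sketch, not something the oops itself states.
    """
    frames = []
    for m in FRAME_RE.finditer(text):
        module = m.group("module") or "kernel"
        frames.append((module, m.group("symbol"), int(m.group("offset"), 16)))
    return frames


# A few frames copied from the call trace in this report.
SAMPLE = """\
[<ffffffff8022bf96>] blk_recount_segments+0x7e/0x21b
[<ffffffff802bb9ae>] __bio_clone+0x71/0x8a
[<ffffffff802bb9fc>] bio_clone+0x35/0x3d
[<ffffffff8822776a>] :dm_crypt:crypt_map+0xcd/0x304
[<ffffffff880d92bf>] :dm_mod:__map_bio+0x47/0x9b
"""

if __name__ == "__main__":
    for module, symbol, offset in parse_frames(SAMPLE):
        print(f"{module:10s} {symbol}+{offset:#x}")
```

The regex is applied to the whole text rather than line by line, so it
should also cope with a log whose line breaks were lost in transit.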