device mapper not reporting no-barrier-support?
Hi,

I'm currently stuck between kernel LVM and DRBD: I'm using kernel 2.6.24.2 with DRBD 8.2.5 on top of an LVM2 device (LV).

-LVM2/device mapper doesn't support write barriers.
-DRBD uses blkdev_issue_flush() to flush its metadata to disk.

On a device without barrier support, DRBD should receive EOPNOTSUPP, but it actually receives EIO. As a result, DRBD logs the error message "drbd0: local disk flush failed with status -5".

The physical disk (in LVM speak) is a RAID1 on a 3ware 9650SE-2LP controller; the 3w-9xxx driver supports barriers, and after moving my DRBD device from the LV to a single partition on the same RAID1, the error messages from DRBD vanished.

I've posted a lengthy summary of my findings to
http://lists.linbit.com/pipermail/drbd-user/2008-February/008665.html
... where Lars Ellenberg from DRBD basically responded in
http://lists.linbit.com/pipermail/drbd-user/2008-February/008666.html
... that DRBD does catch EOPNOTSUPP for blkdev_issue_flush and BIO_RW_BARRIER, but the LVM implementation of blkdev_issue_flush in 2.6.24.2 apparently returns EIO for blkdev_issue_flush.

So, simply, the question: how should a top-layer driver check whether a lower device supports barriers? md-raid checks this differently than e.g. XFS does, and DRBD adds a third way of checking.

Or is this "merely" a bug in drivers/md/dm.c?

Anders
--
1&1 Internet AG              System Architect
Brauerstrasse 48             v://49.721.91374.50
D-76135 Karlsruhe            f://49.721.91374.225

Amtsgericht Montabaur HRB 6484
Management board: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger, Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Markus Huhn, Achim Weiss
Chairman of the supervisory board: Michael Scheeren
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
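The question in the message above (how an upper layer can tell "flush not supported" apart from "flush failed") can be illustrated with a small userspace sketch. This is not DRBD, md or XFS code: struct lower_dev, sync_metadata() and the three flush_* helpers are hypothetical names made up for the illustration, and the only assumption taken from the thread is that a lower device reports a missing barrier capability as -EOPNOTSUPP and a real failure as some other negative errno such as -EIO.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a lower device; flush() returns 0 or a negative errno. */
struct lower_dev {
    int (*flush)(void);
    bool barriers_ok;   /* cleared once the device reports "not supported" */
    int hard_errors;    /* counts flush failures that are real errors */
};

/*
 * Try to flush; treat -EOPNOTSUPP as "no barrier support, stop asking"
 * and every other non-zero status as a genuine failure.
 */
static int sync_metadata(struct lower_dev *dev)
{
    int rc;

    if (!dev->barriers_ok)
        return 0;               /* already known to be unsupported */

    rc = dev->flush();
    if (rc == -EOPNOTSUPP) {
        dev->barriers_ok = false;
        return 0;               /* not an I/O error, just a missing feature */
    }
    if (rc)
        dev->hard_errors++;
    return rc;
}

/* Hypothetical lower devices for the illustration. */
static int flush_unsupported(void) { return -EOPNOTSUPP; }
static int flush_broken(void)      { return -EIO; }
static int flush_ok(void)          { return 0; }
```

With this pattern, a lower layer that answers -EIO where it means -EOPNOTSUPP ends up on the hard-error path on every flush, which matches the repeated "status -5" messages described in the report.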
Re: [dm-devel] Re: device mapper not reporting no-barrier-support?
On Tue, Feb 26 2008 Jens Axboe wrote:
> On Tue, Feb 26 2008, Alasdair G Kergon wrote:
> > On Mon, Feb 25, 2008 at 03:20:50PM -0800, Andrew Morton wrote:
> > > On Mon, 25 Feb 2008 14:26:15 +0100 Anders Henke <[EMAIL PROTECTED]> wrote:
> > > > I'm currently stuck between Kernel LVM and DRBD, as I'm using Kernel
> > > > 2.6.24.2 with DRBD 8.2.5 on top of an LVM2 device (LV).
> > > > -LVM2/device mapper doesn't support write barriers
> >
> > That's right.
> >
> > > > -DRBD uses blkdev_issue_flush() to flush its metadata to disk.
> >
> > Which won't work if device-mapper is underneath.
> >
> > > > On a no-barrier-device, DRBD should receive EOPNOTSUPP, but
> > > > it really does receive an EIO. Promptly, DRBD gives the
> > > > error message "drbd0: local disk flush failed with status -5".
> > > > I've posted a lengthy summary of my findings to
> > > > http://lists.linbit.com/pipermail/drbd-user/2008-February/008665.html
> > > > ... that DRBD does catch the EOPNOTSUPP for blkdev_issue_flush and
> > > > BIO_RW_BARRIER, but the lvm implementation of blkdev_issue_flush in
> > > > 2.6.24.2 apparently does return EIO for blkdev_issue_flush.
> > > I'd say it's a DM bug.
> >
> > The dm code is unchanged, but look at the limited endio handling in
> > ll_rw_blk.c:
> >
> > static void bio_end_empty_barrier(struct bio *bio, int err)
> > {
> >         if (err)
> >                 clear_bit(BIO_UPTODATE, &bio->bi_flags);
> >
> >         complete(bio->bi_private);
> > }
> >
> > int blkdev_issue_flush(struct block_device *bdev, sector_t *error_sector)
> > {
> > ...
> >         wait_for_completion(&wait);
> >         if (error_sector)
> >                 *error_sector = bio->bi_sector;
> >         ret = 0;
> >         if (!bio_flagged(bio, BIO_UPTODATE))
> >                 ret = -EIO;
>
> You are right, the return value got broken there. Does this make it
> return -EOPNOTSUPP properly for you?

No, it doesn't. I've applied your patch manually, as 2.6.24.2 doesn't have a "blk-barrier.c":

---cut
--- linux-2.6.24.2/block/ll_rw_blk.c.prepatch 2008-02-11 06:51:11.0 +0100
+++ linux-2.6.24.2/block/ll_rw_blk.c 2008-02-26 20:02:28.514641620 +0100
@@ -2667,8 +2667,11 @@
 
 static void bio_end_empty_barrier(struct bio *bio, int err)
 {
-        if (err)
+        if (err) {
+                if (err == -EOPNOTSUPP)
+                        set_bit(BIO_EOPNOTSUPP, &bio->bi_flags);
                 clear_bit(BIO_UPTODATE, &bio->bi_flags);
+        }
 
         complete(bio->bi_private);
 }
---cut

... and the resulting kernel shows exactly the same behaviour as before:

[ 752.301388] drbd0: Writing meta data super block now.
[ 752.349713] drbd0: local disk flush failed with status -5
[ 752.416256] drbd0: local disk flush failed with status -5
[ 753.419254] drbd0: local disk flush failed with status -5
[ 753.925726] drbd0: local disk flush failed with status -5
[ 754.551176] drbd0: local disk flush failed with status -5
[ 754.806052] drbd0: local disk flush failed with status -5
[ 755.327988] drbd0: local disk flush failed with status -5
[ 755.781863] drbd0: local disk flush failed with status -5
[ 756.266694] drbd0: local disk flush failed with status -5

Anders

> diff --git a/block/blk-barrier.c b/block/blk-barrier.c
> index 6901eed..55c5f1f 100644
> --- a/block/blk-barrier.c
> +++ b/block/blk-barrier.c
> @@ -259,8 +259,11 @@ int blk_do_ordered(struct request_queue *q, struct request **rqp)
>  
>  static void bio_end_empty_barrier(struct bio *bio, int err)
>  {
> -        if (err)
> +        if (err) {
> +                if (err == -EOPNOTSUPP)
> +                        set_bit(BIO_EOPNOTSUPP, &bio->bi_flags);
>                  clear_bit(BIO_UPTODATE, &bio->bi_flags);
> +        }
>  
>          complete(bio->bi_private);
>  }
> @@ -309,7 +312,9 @@ int blkdev_issue_flush(struct block_device *bdev, sector_t *error_sector)
>                  *error_sector = bio->bi_sector;
>  
>          ret = 0;
> -        if (!bio_flagged(bio, BIO_UPTODATE))
> +        if (bio_flagged(bio, BIO_EOPNOTSUPP))
> +                ret = -EOPNOTSUPP;
> +        else if (!bio_flagged(bio, BIO_UPTODATE))
>                  ret = -EIO;
>  
>          bio_put(bio);
>
> --
> Jens Axboe
>
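The two hunks of the quoted patch work as a pair: the completion handler records -EOPNOTSUPP in a flag, and blkdev_issue_flush() has to look at that flag before falling back to -EIO. The hand-applied diff in the message above shows only the completion-handler hunk; the message does not say whether the second hunk was applied as well. The following userspace sketch models the flag logic only. struct fake_bio, the FAKE_* bits and simulate_flush() are hypothetical stand-ins, not kernel definitions, and the sketch assumes the lower layer completes the empty barrier with -EOPNOTSUPP.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel's bio flag bits. */
enum { FAKE_UPTODATE = 1 << 0, FAKE_EOPNOTSUPP = 1 << 1 };

struct fake_bio { unsigned int flags; };

/* Completion handler before the patch: every error only clears UPTODATE. */
static void end_barrier_old(struct fake_bio *bio, int err)
{
    if (err)
        bio->flags &= ~FAKE_UPTODATE;
}

/* Completion handler with the first hunk: -EOPNOTSUPP is remembered in a flag. */
static void end_barrier_new(struct fake_bio *bio, int err)
{
    if (err) {
        if (err == -EOPNOTSUPP)
            bio->flags |= FAKE_EOPNOTSUPP;
        bio->flags &= ~FAKE_UPTODATE;
    }
}

/* Status check before the patch: anything not up to date becomes -EIO. */
static int flush_status_old(const struct fake_bio *bio)
{
    return (bio->flags & FAKE_UPTODATE) ? 0 : -EIO;
}

/* Status check with the second hunk: the "not supported" flag wins over -EIO. */
static int flush_status_new(const struct fake_bio *bio)
{
    if (bio->flags & FAKE_EOPNOTSUPP)
        return -EOPNOTSUPP;
    if (!(bio->flags & FAKE_UPTODATE))
        return -EIO;
    return 0;
}

typedef void (*endio_fn)(struct fake_bio *, int);
typedef int (*status_fn)(const struct fake_bio *);

/* Simulate one empty-barrier flush that the lower layer completes with `err`. */
static int simulate_flush(endio_fn endio, status_fn status, int err)
{
    struct fake_bio bio = { .flags = FAKE_UPTODATE };

    assert(err <= 0);   /* completion status is 0 or a negative errno */
    endio(&bio, err);
    return status(&bio);
}
```

In this model, pairing end_barrier_new() with flush_status_old() still yields -EIO for a -EOPNOTSUPP completion; only the combination of both new functions passes -EOPNOTSUPP through to the caller.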
Re: broken dpt_i2o in 2.6.23 (was: ext2 check page: bad entry in directory) (fwd)
Hi,

I'd like to let you know that my boxes are running a 32-bit kernel, so the 64-bit uncleanliness shouldn't apply to them; however,

http://www.miquels.cistron.nl/linux/dpt_i2o-64bit-2.6.23.patch

fixed the issue on my testbox.

I took a clean 2.6.23, applied the patch, recompiled the kernel, rebooted: works.

Regards,
Anders

PS: Sorry for breaking the threading; I'm not a regular subscriber to linux-kernel and haven't received Miquel's message by mail.
Re: broken dpt_i2o in 2.6.23 (was: ext2 check page: bad entry in directory) (fwd)
On 12.12.2007, Miquel van Smoorenburg wrote:
> On Wed, 2007-12-12 at 03:38 -0800, Andrew Morton wrote:
> > On Wed, 12 Dec 2007 11:58:41 +0100 Anders Henke <[EMAIL PROTECTED]> wrote:
> >
> > > Hi,
> > >
> > > I'd like to let you know that my boxes are running a 32-bit kernel, so
> > > the 64-bit-uncleanliness shouldn't apply to my boxes; however,
> > >
> > > http://www.miquels.cistron.nl/linux/dpt_i2o-64bit-2.6.23.patch
> > >
> > > fixed the issue on my testbox.
> > >
> > > I took a clean 2.6.23, applied patch, recompiled the kernel, reboot:
> > > works.
> >
> > What a huge patch :(
> >
> > We already reverted the offending patch so I assume that 2.6.24-rc5 is
> > working for you?
> >
> > I guess we need to look at restoring "dpt_i2o: convert to SCSI hotplug
> > model" and then absorbing what Miquel has done there.
>
> This was just a patch I had lying around, if it worked it would confirm
> my suspicion, which it has.
>
> The minimal patch which is suitable for 2.6.23-stable and 2.6.24 would
> be the attached one-liner. The "dpt_i2o: convert to SCSI hotplug model"
> patch could be restored then.
>
> (if the list eats the attachment, it's also available here:
> http://www.miquels.cistron.nl/linux/linux-2.6.23+24-dpt_i2o-dma64.patch
> )
>
> Anders, does this one-liner patch work for you ?

Got it - and it works!

I took a clean 2.6.23, applied the patch, recompiled the kernel and rebooted my testbox: it came up with the freshly compiled kernel (verified by "uname -a").

Regards,
Anders
Re: broken dpt_i2o in 2.6.23 (was: ext2 check page: bad entry in directory) (fwd)
On 12.12.2007, Andrew Morton wrote:
> On Wed, 12 Dec 2007 11:58:41 +0100 Anders Henke <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > I'd like to let you know that my boxes are running a 32-bit kernel, so
> > the 64-bit-uncleanliness shouldn't apply to my boxes; however,
> >
> > http://www.miquels.cistron.nl/linux/dpt_i2o-64bit-2.6.23.patch
> >
> > fixed the issue on my testbox.
> >
> > I took a clean 2.6.23, applied patch, recompiled the kernel, reboot: works.
>
> What a huge patch :(
>
> We already reverted the offending patch so I assume that 2.6.24-rc5 is
> working for you?

Yes, the vanilla 2.6.24-rc5 works fine (at least it's booting :-).

Linux rdb140 2.6.24-rc5 #1 SMP Wed Dec 12 15:06:05 CET 2007 i686 GNU/Linux

> I guess we need to look at restoring "dpt_i2o: convert to SCSI hotplug
> model" and then absorbing what Miquel has done there.

I've tried 2.6.23 with
http://www.miquels.cistron.nl/linux/linux-2.6.23+24-dpt_i2o-dma64.patch
... and that's enough to make my boxes boot again.

Regards,
Anders
broken dpt_i2o (was: ext2_check_page: bad entry in directory)
Hi,

I've been bitten by the problem noted in the lkml message of roughly the same subject, dated Oct 24, 2007. My boxes were running 2.6.19 and have been upgraded to 2.6.23.1, but their bootup failed when trying to mount the root (ext2) filesystem:

---cut
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
00:08: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:09: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
Loading Adaptec I2O RAID: Version 2.4 Build 5go
Detecting Adaptec I2O RAID controllers...
ACPI: PCI Interrupt :04:08.0[A] -> GSI 48 (level, low) -> IRQ 16
Adaptec I2O RAID controller 0 irq=16
BAR0 f888 - size= 10
BAR1 f8a0 - size= 100
dpti: If you have a lot of devices this could take a few minutes.
dpti0: Reading the hardware resource table.
TID 008 Vendor: ADAPTEC Device: AIC-7902 Rev: 0001
TID 009 Vendor: ADAPTEC Device: AIC-7902 Rev: 0001
TID 515 Vendor: ESG-SHV SDevice: SCA HSBP M21 Rev: 0.080
TID 518 Vendor: ADAPTEC RDevice: RAID-1 Rev: 3B0AD
scsi0 : Vendor: Adaptec Model: 2010SFW:3B0A
scsi 0:1:0:0: Direct-Access ADAPTEC RAID-1 3B0A PQ: 0 ANSI: 2
scsi 0:1:6:0: Processor ESG-SHV SCA HSBP M21 0.08 PQ: 0 ANSI: 2
Adaptec aacraid driver 1.1-5[2449]-ms
GDT-HA: Storage RAID Controller Driver. Version: 3.05
GDT-HA: Found 0 PCI Storage RAID Controllers
3ware Storage Controller device driver for Linux v1.26.02.002.
3ware 9000 Storage Controller device driver for Linux v2.26.02.010.
sd 0:1:0:0: [sda] 143374336 512-byte hardware sectors (73408 MB)
sd 0:1:0:0: [sda] Write Protect is off
sd 0:1:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA
sd 0:1:0:0: [sda] 143374336 512-byte hardware sectors (73408 MB)
sd 0:1:0:0: [sda] Write Protect is off
sd 0:1:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 >
sd 0:1:0:0: [sda] Attached SCSI disk
PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1
PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
serio: i8042 KBD port at 0x60,0x64 irq 1
mice: PS/2 mouse device common for all mice
md: raid1 personality registered for level 1
EDAC MC: Ver: 2.1.0 Oct 23 2007
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Starting balanced_irq
Using IPI Shortcut mode
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 264k freed
EXT2-fs error (device sda1): ext2_check_page: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Warning: unable to open an initial console.
Kernel panic - not syncing: No init found. Try passing init= option to kernel.
Rebooting in 30 seconds..
---cut

Rebooting the box into 2.6.19 works without any problems. I've checked the changelogs for 2.6.24-rc*, but haven't come across a solution for this issue; maybe I've simply overlooked it.

This bug has been reported earlier: http://lkml.org/lkml/2007/10/24/224. I've contacted Jan Kara off-list; as booting into 2.6.19 works and e2fsck on an e2image file doesn't show any errors, we assumed that the ext2 filesystem itself is fine.

As "everything is reported as being zero" is quite odd and Jan guessed that it might be block-layer or driver-related, I assumed that the driver is responsible; just out of curiosity, I manually replaced the dpt_i2o driver with the 2.6.19 one by copying drivers/scsi/dpt_i2o.c, drivers/scsi/dpti.h and drivers/scsi/dpt/ into a vanilla 2.6.23.1 kernel; using this kernel fixed the issue for me.

I haven't yet narrowed down from which kernel release on the dpt_i2o driver behaves like this and returns zeroed blocks when trying to mount the rootfs. Maybe this is just some timing issue.

For some strange reason, this doesn't affect all boxes running the dpt_i2o driver.

Affected (verified on 6 out of 6 tested boxes so far): Intel SE7501WV2S using an Adaptec 2010S with the following "lspci -vn" section:

:04:08.0 0104: 1044:a511 (rev 01)
        Subsystem: 1044:c035
        Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 16
        BIST result: 00
        Memory at fe90 (32-bit, non-prefetchable) [size=1M]
        Memory at fb00 (32-bit, prefetchable) [size=16M]
        Memory at f800 (32-bit, prefetchable) [size=32M]
        Expansion ROM at f620 [disabled] [size=32K]
        Capabilities: [44] Power Management version 2

Not affected is e.g. a box with a Supermicro X5DPR using an Adaptec 2015S and the following "lspci -vn" section:

:03:03.0 0104: 1044:a511 (rev 01)
        Subsystem: 1044:c034
        Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 16
        BIST result: 00
        Memory at f830 (32-bit, non-prefetchable) [size=1M]
        Memory at fb00 (32-bit,
Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory)
On November 28 2007, Anders Henke wrote:
> As "everything is reported as being zero" is quite odd and Jan took a
> guess that it might be block-layer or driver-related, I've assumed
> that the driver is responsible for this; just out of curiosity,
> I've manually replaced the dpt_i2o driver by the 2.6.19 one by copying
> drivers/scsi/dpt_i2o.c drivers/scsi/dpti.h and drivers/scsi/dpt/ into a
> vanilla 2.6.23.1 kernel; using this kernel fixed the issue for me.
>
> I haven't yet fine-tested from which kernel release on the dpt_i2o driver
> behaves like this and spews out zeroed blocks when trying to mount
> the rootfs. Maybe this is just some timing issue.

I've started narrowing this down and can say so far that dpt_i2o from 2.6.22 is still fine. The test is simple:

[EMAIL PROTECTED]:/usr/src/linux-2.6.22/drivers/scsi/dpt$ cp -r dpt/ dpt_i2o.c dpti.h /usr/src/linux-2.6.23.1/drivers/scsi/

... recompile the kernel, reboot: works.

2.6.22 and 2.6.23 differ in terms of the dpt_i2o driver by three patch sets:
-one small 2 KB set of patches from 2.6.22 to 2.6.23-rc1
-one 7 KB set of patches from 2.6.23-rc2 to 2.6.23-rc3
-one 162 KB set of patches from 2.6.23-rc9 to 2.6.23-rc10.

When applying the 2.6.23-rc1-based driver to "my" 2.6.23.1 kernel, the "zero blocks" symptom shows up, so it's the "lucky" situation that the smallest patch actually seems to be the broken one.

According to the 2.6.23-rc1 short-form changelog, there is one major edit on the dpt_i2o driver:

FUJITA Tomonori
      [SCSI] dpt_i2o: convert to use the data buffer accessors

Stephen Rothwell
      dpt_i2o depends on virt_to_bus

Fujita, would you please take a look at this?

I think that something's broken in there, leading to dpt_i2o sending out blocks of zeroes right after initialization, at least on some specific controllers (in this case, Adaptec 2010S on Intel SE7501WV2S-based boxes).

I don't have in-depth knowledge of kernel driver development, so I'm at a loss right now.
Nevertheless, I'll add the diff from 2.6.22 to 2.6.23-rc1 in terms of dpt_i2o:

---cut
diff -Nur linux-2.6.22/drivers/scsi/dpt_i2o.c linux-2.6.23-rc1/drivers/scsi/dpt_i2o.c
--- linux-2.6.22/drivers/scsi/dpt_i2o.c 2007-07-09 01:32:17.0 +0200
+++ linux-2.6.23-rc1/drivers/scsi/dpt_i2o.c 2007-07-22 22:41:00.0 +0200
@@ -2078,12 +2078,13 @@
 	u32 *lenptr;
 	int direction;
 	int scsidir;
+	int nseg;
 	u32 len;
 	u32 reqlen;
 	s32 rcode;
 	memset(msg, 0 , sizeof(msg));
-	len = cmd->request_bufflen;
+	len = scsi_bufflen(cmd);
 	direction = 0x;
 	scsidir = 0x; // DATA NO XFER
@@ -2140,21 +2141,21 @@
 	lenptr=mptr++; /* Remember me - fill in when we know */
 	reqlen = 14; // SINGLE SGE
 	/* Now fill in the SGList and command */
-	if(cmd->use_sg) {
-		struct scatterlist *sg = (struct scatterlist *)cmd->request_buffer;
-		int sg_count = pci_map_sg(pHba->pDev, sg, cmd->use_sg,
-				cmd->sc_data_direction);
+	nseg = scsi_dma_map(cmd);
+	BUG_ON(nseg < 0);
+	if (nseg) {
+		struct scatterlist *sg;
 		len = 0;
-		for(i = 0 ; i < sg_count; i++) {
+		scsi_for_each_sg(cmd, sg, nseg, i) {
 			*mptr++ = direction|0x1000|sg_dma_len(sg);
 			len+=sg_dma_len(sg);
 			*mptr++ = sg_dma_address(sg);
-			sg++;
+			/* Make this an end of list */
+			if (i == nseg - 1)
+				mptr[-2] = direction|0xD000|sg_dma_len(sg);
 		}
-		/* Make this an end of list */
-		mptr[-2] = direction|0xD000|sg_dma_len(sg-1);
 		reqlen = mptr - msg;
 		*lenptr = len;
@@ -2163,16 +2164,8 @@
 				len, cmd->underflow);
 		}
 	} else {
-		*lenptr = len = cmd->request_bufflen;
-		if(len == 0) {
-			reqlen = 12;
-		} else {
-			*mptr++ = 0xD000|direction|cmd->request_bufflen;
-			*mptr++ = pci_map_single(pHba->pDev,
-				cmd->request_buffer,
-				cmd->request_bufflen,
-				cmd->sc_data_direction);
-		}
+		*lenptr = len = 0;
+		reqlen = 12;
 	}
 	/* Stick the headers on */
@@ -2232,7 +2225,7 @@
 	hba_status = detailed_status >> 8;
 	// calculate resid for sg
-	cmd->resid = cmd->request_bu
Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory) (fwd)
On Nov 29 2007, FUJITA Tomonori wrote:
> On Thu, 29 Nov 2007 14:03:19 +0100
> Jan Kara <[EMAIL PROTECTED]> wrote:
>
> > Adding relevant people and lists to CC...
> >
> > Honza
> >
> > - Forwarded message from Anders Henke <[EMAIL PROTECTED]> -
> >
> > Date: Thu, 29 Nov 2007 13:31:50 +0100
> > From: Anders Henke <[EMAIL PROTECTED]>
> > To: linux-kernel@vger.kernel.org
> > Subject: Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory)
> > User-Agent: Mutt/1.5.13 (2006-08-11)
> >
> > On November 28 2007, Anders Henke wrote:
> > > As "everything is reported as being zero" is quite odd and Jan took a
> > > guess that it might be block-layer or driver-related, I've assumed
> > > that the driver is responsible for this; just out of curiosity,
> > > I've manually replaced the dpt_i2o driver by the 2.6.19 one by copying
> > > drivers/scsi/dpt_i2o.c drivers/scsi/dpti.h and drivers/scsi/dpt/ into a
> > > vanilla 2.6.23.1 kernel; using this kernel fixed the issue for me.
> > >
> > > I haven't yet fine-tested from which kernel release on the dpt_i2o driver
> > > behaves like this and spews out zeroed blocks when trying to mount
> > > the rootfs. Maybe this is just some timing issue.
> >
> > I've started the fine-tests and can say so far that dpt_i2o from
> > 2.6.22 is still fine. Test is simple:
> >
> > [EMAIL PROTECTED]:/usr/src/linux-2.6.22/drivers/scsi/dpt$ cp -r dpt/ dpt_i2o.c dpti.h /usr/src/linux-2.6.23.1/drivers/scsi/
> >
> > ... recompile the kernel, reboot: works.
> >
> > 2.6.22 and 2.6.23 differ in terms of the dpt_i2o driver by three
> > patch sets:
> > -one 2 Kb small set of patches from 2.6.22 to 2.6.23-rc1
> > -one 7 Kb set of patches from 2.6.23-rc2 to 2.6.23-rc3
> > -one 162 Kb set of patches from 2.6.23-rc9 to 2.6.23-rc10.
> >
> > When applying the 2.6.23-rc1-based driver to "my" 2.6.23.1 kernel,
> > the "zero blocks"-symptom shows up, so it's the "lucky" situation
> > that the smallest patch actually seems to be the broken one.
> >
> > According to the 2.6.23-rc1 short-form changelog, there is
> > one major edit on the dpt_i2o driver:
> >
> > FUJITA Tomonori
> >       [SCSI] dpt_i2o: convert to use the data buffer accessors
> >
> > Stephen Rothwell
> >       dpt_i2o depends on virt_to_bus
> >
> > Fujita, would you please take a look at this?
>
> Sorry about the bug. Can you try this?
>
>
> diff --git a/drivers/scsi/dpt_i2o.c b/drivers/scsi/dpt_i2o.c
> index 8258506..1255b26 100644
> --- a/drivers/scsi/dpt_i2o.c
> +++ b/drivers/scsi/dpt_i2o.c
> @@ -3295,7 +3295,7 @@ static struct scsi_host_template adpt_template = {
>  	.this_id		= 7,
>  	.cmd_per_lun		= 1,
>  	.use_clustering		= ENABLE_CLUSTERING,
> -	.use_sg_chaining	= ENABLE_SG_CHAINING,
> +	.use_sg_chaining	= DISABLE_SG_CHAINING,
>  };
>  
>  static s32 adpt_scsi_register(adpt_hba* pHba)

The structure to patch looks different here and doesn't include a "use_sg_chaining" field:

---cut
static struct scsi_host_template adpt_template = {
	.name			= "dpt_i2o",
	.proc_name		= "dpt_i2o",
	.proc_info		= adpt_proc_info,
	.info			= adpt_info,
	.queuecommand		= adpt_queue,
	.eh_abort_handler	= adpt_abort,
	.eh_device_reset_handler = adpt_device_reset,
	.eh_bus_reset_handler	= adpt_bus_reset,
	.eh_host_reset_handler	= adpt_reset,
	.bios_param		= adpt_bios_param,
	.slave_configure	= adpt_slave_configure,
	.can_queue		= MAX_TO_IOP_MESSAGES,
	.this_id		= 7,
	.cmd_per_lun		= 1,
	.use_clustering		= ENABLE_CLUSTERING,
};

static s32 adpt_scsi_register(adpt_hba* pHba)
---cut

Anders
Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory) (fwd)
On 29.11.2007, Matthew Wilcox wrote:
> On Thu, Nov 29, 2007 at 05:45:57PM +0100, Anders Henke wrote:
> > On Nov 29 2007, FUJITA Tomonori wrote:
> > > @@ -3295,7 +3295,7 @@ static struct scsi_host_template adpt_template = {
> > >  	.this_id		= 7,
> > >  	.cmd_per_lun		= 1,
> > >  	.use_clustering		= ENABLE_CLUSTERING,
> > > -	.use_sg_chaining	= ENABLE_SG_CHAINING,
> > > +	.use_sg_chaining	= DISABLE_SG_CHAINING,
> > >  };
> > >
> > >  static s32 adpt_scsi_register(adpt_hba* pHba)
> >
> > The structure to patch does look different and doesn't include an
> > tag "use_sg_chaining":
> >
> > 	.this_id		= 7,
> > 	.cmd_per_lun		= 1,
> > 	.use_clustering		= ENABLE_CLUSTERING,
>
> Just add the line
> 	.use_sg_chaining	= DISABLE_SG_CHAINING,
>
> > };

Just out of curiosity, I've tried 2.6.24-rc3 and patched the kernel accordingly (DISABLE_SG_CHAINING): it doesn't boot successfully, same error as usual:

EXT2-fs error (device sda1): ext2_check_page: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Warning: unable to open an initial console.
Kernel panic - not syncing: No init found. Try passing init= option to kernel.

As sent in a parallel mail, I've found out that 2.6.23-rc2 works and 2.6.23-rc3 shows the same problems, so the cause must lie in the dpt_i2o changes made for 2.6.23-rc3.

Anders
Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory) (fwd)
On 30.11.2007, FUJITA Tomonori wrote:
> > > > According to the 2.6.23-rc1 short-form changelog, there is
> > > > one major edit on the dpt_i2o driver:
> > > >
> > > > FUJITA Tomonori
> > > >       [SCSI] dpt_i2o: convert to use the data buffer accessors
> > > >
> > > > Stephen Rothwell
> > > >       dpt_i2o depends on virt_to_bus
> > > >
> > > > Fujita, would you please take a look at this?
> > >
> > > Sorry about the bug. Can you try this?
> > >
> > >
> > > diff --git a/drivers/scsi/dpt_i2o.c b/drivers/scsi/dpt_i2o.c
> > > index 8258506..1255b26 100644
> > > --- a/drivers/scsi/dpt_i2o.c
> > > +++ b/drivers/scsi/dpt_i2o.c
> > > @@ -3295,7 +3295,7 @@ static struct scsi_host_template adpt_template = {
> > >  	.this_id		= 7,
> > >  	.cmd_per_lun		= 1,
> > >  	.use_clustering		= ENABLE_CLUSTERING,
> > > -	.use_sg_chaining	= ENABLE_SG_CHAINING,
> > > +	.use_sg_chaining	= DISABLE_SG_CHAINING,
> > >  };
> > >
> > >  static s32 adpt_scsi_register(adpt_hba* pHba)
> >
> > The structure to patch does look different and doesn't include an
> > tag "use_sg_chaining":
>
> Sorry, I misread your bug report. If you use 2.6.23, the sg chaining
> is unrelated.
>
> What architecture do you use?

"Mainstream" 32-bit x86; the affected boxes are running Intel Xeons (P4) at 2.66 or 2.8 GHz.

In the meantime, I've ruled out the static assignment as the source of the problem. And due to a manually run "make clean" which didn't clean enough, I also pointed at the wrong patch. Sorry, Fujita: the patch which definitely breaks my boxes is the dpt_i2o patch from 2.6.23-rc2 to 2.6.23-rc3 (7 KB in size) from Matthew Wilcox.

commit 55d9fcf57ba5ec427544fca7abc335cf3da78160
Author: Matthew Wilcox <[EMAIL PROTECTED]>
Date:   Mon Jul 30 15:19:18 2007 -0600

    [SCSI] dpt_i2o: convert to SCSI hotplug model

     - Delete refereces to HOSTS_C
     - Switch to module_init/module_exit instead of detect/release
     - Don't pass around the host template and rename it to adpt_template
     - Switch from scsi_register/scsi_unregister to scsi_host_alloc,
       scsi_add_host, scsi_scan_host and scsi_host_put.

    Signed-off-by: Matthew Wilcox <[EMAIL PROTECTED]>
    Acked-by: "Salyzyn, Mark" <[EMAIL PROTECTED]>
    Signed-off-by: James Bottomley <[EMAIL PROTECTED]>

"Definitely" as in:
-Applied the diff onto 2.6.23.1 with a dpt_i2o from 2.6.23-rc1 to verify that booting fails.
-Recompiled a cleanly unpacked 2.6.23-rc2 to verify that the driver from -rc2 still works.
-Recompiled a cleanly unpacked 2.6.23-rc3 to verify that the driver from -rc3 breaks booting on my boxes.

So sorry for wrongly attributing the bug to you; it's an issue for Matthew.

Anders
Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory)
On Tue, 4 Dec 2007 Andrew Morton wrote:
> On Wed, 05 Dec 2007 10:30:54 +0900 FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
>
> > On Tue, 4 Dec 2007 17:11:55 -0800
> > Andrew Morton <[EMAIL PROTECTED]> wrote:
> >
> > > On Wed, 05 Dec 2007 10:04:03 +0900
> > > FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> > >
> > > > On Tue, 4 Dec 2007 16:57:38 -0800
> > > > Andrew Morton <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > On Thu, 29 Nov 2007 13:31:50 +0100
> > > > > Anders Henke <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > On November 28 2007, Anders Henke wrote:
> > > > > > > As "everything is reported as being zero" is quite odd and Jan took a
> > > > > > > guess that it might be block-layer or driver-related, I've assumed
> > > > > > > that the driver is responsible for this; just out of curiosity,
> > > > > > > I've manually replaced the dpt_i2o driver by the 2.6.19 one by copying
> > > > > > > drivers/scsi/dpt_i2o.c drivers/scsi/dpti.h and drivers/scsi/dpt/ into a
> > > > > > > vanilla 2.6.23.1 kernel; using this kernel fixed the issue for me.
> > > > > > >
> > > > > > > I haven't yet fine-tested from which kernel release on the dpt_i2o driver
> > > > > > > behaves like this and spews out zeroed blocks when trying to mount
> > > > > > > the rootfs. Maybe this is just some timing issue.
> > > > > >
> > > > > > I've started the fine-tests and can say so far that dpt_i2o from
> > > > > > 2.6.22 is still fine. Test is simple:
> > > > > >
> > > > > > [EMAIL PROTECTED]:/usr/src/linux-2.6.22/drivers/scsi/dpt$ cp -r dpt/ dpt_i2o.c dpti.h /usr/src/linux-2.6.23.1/drivers/scsi/
> > > > > >
> > > > > > ... recompile the kernel, reboot: works.
> > > > > >
> > > > > > 2.6.22 and 2.6.23 differ in terms of the dpt_i2o driver by three
> > > > > > patch sets:
> > > > > > -one 2 Kb small set of patches from 2.6.22 to 2.6.23-rc1
> > > > > > -one 7 Kb set of patches from 2.6.23-rc2 to 2.6.23-rc3
> > > > > > -one 162 Kb set of patches from 2.6.23-rc9 to 2.6.23-rc10.
> > > > > >
> > > > > > When applying the 2.6.23-rc1-based driver to "my" 2.6.23.1 kernel,
> > > > > > the "zero blocks"-symptom shows up, so it's the "lucky" situation
> > > > > > that the smallest patch actually seems to be the broken one.
> > > > > >
> > > > > > According to the 2.6.23-rc1 short-form changelog, there is
> > > > > > one major edit on the dpt_i2o driver:
> > > > > >
> > > > > > FUJITA Tomonori
> > > > > >       [SCSI] dpt_i2o: convert to use the data buffer accessors
> > > > > >
> > > > > > Stephen Rothwell
> > > > > >       dpt_i2o depends on virt_to_bus
> > > > > >
> > > > > > Fujita, would you please take a look at this?
> > > > >
> > > > > He won't have seen this. cc's added.
> > > > >
> > > > > > I think that something's broken in there, leading to the dpt_i2o
> > > > > > sending out blocks of zeroes right after initialization, at least on
> > > > > > some specific controllers (in this case, Adaptec 2010S on Intel
> > > > > > SE7501WV2S-based boxes).
> > > > > >
> > > > > > I don't have insight kernel driver development knowledge, so I'm
> > > > > > quite out of help right now. Nevertheless, I'll add the diff
> > > > > > from 2.6.22 to 2.6.23-rc1 in terms of dpt_i2o:
> > > > > >
> > > > >
> > > > > Can you please confirm that this revert (against 2.6.24-rc4) fixes the data
> > > > > corruption problems?
> > > >
> > > > Anders said