device mapper not reporting no-barrier-support?

2008-02-25 Thread Anders Henke
Hi,

I'm currently stuck between Kernel LVM and DRBD, as I'm using Kernel
2.6.24.2 with DRBD 8.2.5 on top of an LVM2 device (LV).

-LVM2/device mapper doesn't support write barriers
-DRBD uses blkdev_issue_flush() to flush its metadata to disk.
 On a no-barrier-device, DRBD should receive EOPNOTSUPP, but
 it really does receive an EIO. Promptly, DRBD gives the
 error message "drbd0: local disk flush failed with status -5".

The physical disk (in LVM speak) is a RAID1 on a 3ware 9650SE-2LP
controller; the driver 3w-9xxx supports barriers, and after moving my
DRBD device from the LV to a single partition on the same RAID1, the
error messages from DRBD vanished.

I've posted a lengthy summary of my findings to

http://lists.linbit.com/pipermail/drbd-user/2008-February/008665.html

... where Lars Ellenberg from DRBD basically responded in

http://lists.linbit.com/pipermail/drbd-user/2008-February/008666.html

... that DRBD does catch the EOPNOTSUPP for blkdev_issue_flush and
BIO_RW_BARRIER, but the LVM (device mapper) path in 2.6.24.2 apparently
makes blkdev_issue_flush return EIO instead.

So the question is simply: how should a top-layer driver check whether a lower
device supports barriers? md-raid checks this differently than
e.g. XFS does, and DRBD adds a third way of checking.
Or is this "merely" a bug in drivers/md/dm.c?
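
To make the distinction concrete, here is a minimal user-space sketch of how an
upper layer could classify the return value of a flush request. The names
classify_flush and enum flush_outcome are hypothetical and are not taken from
DRBD, md or XFS; the only assumption carried over from the report is that
-EOPNOTSUPP means "no barrier support" while any other error is a real failure.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome categories for an upper-layer driver. */
enum flush_outcome { FLUSH_OK, FLUSH_UNSUPPORTED, FLUSH_FAILED };

/*
 * Classify the return value of a flush request.
 * 0           -> the flush succeeded
 * -EOPNOTSUPP -> the lower device has no barrier/flush support
 * other       -> treat as a genuine I/O error
 */
static enum flush_outcome classify_flush(int ret)
{
    /* Kernel-style convention: 0 on success, negative errno on failure. */
    assert(ret <= 0);

    if (ret == 0)
        return FLUSH_OK;
    if (ret == -EOPNOTSUPP)
        return FLUSH_UNSUPPORTED;
    return FLUSH_FAILED;
}
```

With this split, a caller can stop issuing flushes after FLUSH_UNSUPPORTED
instead of logging an error on every attempt, which is only possible if the
lower layer reports -EOPNOTSUPP rather than -EIO.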


Anders
-- 
1&1 Internet AG  System Architect
Brauerstrasse 48 v://49.721.91374.50
D-76135 Karlsruhe f://49.721.91374.225

Amtsgericht Montabaur HRB 6484
Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger,
Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Markus Huhn, Achim Weiss
Aufsichtsratsvorsitzender: Michael Scheeren
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [dm-devel] Re: device mapper not reporting no-barrier-support?

2008-02-26 Thread Anders Henke
On Tue, Feb 26 2008 Jens Axboe wrote:
> On Tue, Feb 26 2008, Alasdair G Kergon wrote:
> > On Mon, Feb 25, 2008 at 03:20:50PM -0800, Andrew Morton wrote:
> > > On Mon, 25 Feb 2008 14:26:15 +0100 Anders Henke <[EMAIL PROTECTED]> wrote:
> > > > I'm currently stuck between Kernel LVM and DRBD, as I'm using Kernel
> > > > 2.6.24.2 with DRBD 8.2.5 on top of an LVM2 device (LV).
> > > > -LVM2/device mapper doesn't support write barriers
> > 
> > That's right.
> > 
> > > > -DRBD uses blkdev_issue_flush() to flush its metadata to disk.
> > 
> > Which won't work if device-mapper is underneath.
> > 
> > > >  On a no-barrier-device, DRBD should receive EOPNOTSUPP, but
> > > >  it really does receive an EIO. Promptly, DRBD gives the
> > > >  error message "drbd0: local disk flush failed with status -5".
> > > > I've posted a lengty summary of my findings to
> > > > http://lists.linbit.com/pipermail/drbd-user/2008-February/008665.html
> > > > ... that DRBD does catch the EOPNOTSUPP for blkdev_issue_flush and
> > > > BIO_RW_BARRIER, but the lvm implementation of blkdev_issue_flush in
> > > > 2.6.24.2 aparently does return EIO for blkdev_issue_flush.
> > > I'd say it's a DM bug.
> > 
> > The dm code is unchanged, but look at the limited endio handling in
> > ll_rw_blk.c:
> > 
> > static void bio_end_empty_barrier(struct bio *bio, int err)
> > {
> >         if (err)
> >                 clear_bit(BIO_UPTODATE, &bio->bi_flags);
> > 
> >         complete(bio->bi_private);
> > }
> > 
> > int blkdev_issue_flush(struct block_device *bdev, sector_t *error_sector)
> > {
> >         ...
> >         wait_for_completion(&wait);
> >         if (error_sector)
> >                 *error_sector = bio->bi_sector;
> >         ret = 0;
> >         if (!bio_flagged(bio, BIO_UPTODATE))
> >                 ret = -EIO;
> 
> You are right, the return value got broken there. Does this make it
> return -EOPNOTSUPP properly for you?


No, it doesn't.



I've applied your patch manually, as 2.6.24.2 doesn't have a "blk-barrier.c":

---cut
--- linux-2.6.24.2/block/ll_rw_blk.c.prepatch   2008-02-11 06:51:11.0 +0100
+++ linux-2.6.24.2/block/ll_rw_blk.c    2008-02-26 20:02:28.514641620 +0100
@@ -2667,8 +2667,11 @@
 
 static void bio_end_empty_barrier(struct bio *bio, int err)
 {
-       if (err)
+       if (err) {
+               if (err == -EOPNOTSUPP)
+                       set_bit(BIO_EOPNOTSUPP, &bio->bi_flags);
                clear_bit(BIO_UPTODATE, &bio->bi_flags);
+       }
 
        complete(bio->bi_private);
 }
---cut

... and the resulting kernel shows exactly the same behaviour as before:

[  752.301388] drbd0: Writing meta data super block now.
[  752.349713] drbd0: local disk flush failed with status -5
[  752.416256] drbd0: local disk flush failed with status -5
[  753.419254] drbd0: local disk flush failed with status -5
[  753.925726] drbd0: local disk flush failed with status -5
[  754.551176] drbd0: local disk flush failed with status -5
[  754.806052] drbd0: local disk flush failed with status -5
[  755.327988] drbd0: local disk flush failed with status -5
[  755.781863] drbd0: local disk flush failed with status -5
[  756.266694] drbd0: local disk flush failed with status -5
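
The hunk quoted above changes only bio_end_empty_barrier(), whereas the patch
from Jens also changes the result check in blkdev_issue_flush(). The thread
does not say whether that second hunk was applied as well, so the following is
only a user-space model of the flag handling, not a diagnosis. All model_*
names and MODEL_* flag values are hypothetical stand-ins for the kernel's bio
flags.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel's BIO_UPTODATE / BIO_EOPNOTSUPP bits. */
enum { MODEL_UPTODATE = 1 << 0, MODEL_EOPNOTSUPP = 1 << 1 };

struct model_bio { unsigned flags; };

/* Completion handler, modelled on the patched bio_end_empty_barrier():
 * remember -EOPNOTSUPP in a flag, then clear UPTODATE on any error. */
static void model_end_barrier(struct model_bio *bio, int err)
{
    if (err) {
        if (err == -EOPNOTSUPP)
            bio->flags |= MODEL_EOPNOTSUPP;
        bio->flags &= ~(unsigned)MODEL_UPTODATE;
    }
}

/* Result check as in the unpatched blkdev_issue_flush(): any error is -EIO. */
static int model_result_old(const struct model_bio *bio)
{
    return (bio->flags & MODEL_UPTODATE) ? 0 : -EIO;
}

/* Result check with the second hunk applied: EOPNOTSUPP is reported first. */
static int model_result_new(const struct model_bio *bio)
{
    if (bio->flags & MODEL_EOPNOTSUPP)
        return -EOPNOTSUPP;
    if (!(bio->flags & MODEL_UPTODATE))
        return -EIO;
    return 0;
}

/* Simulate one flush whose completion reports 'err'. */
static int model_flush(int err, int second_hunk_applied)
{
    struct model_bio bio = { MODEL_UPTODATE };

    model_end_barrier(&bio, err);
    /* Any error must have cleared the UPTODATE bit. */
    assert(err == 0 || !(bio.flags & MODEL_UPTODATE));

    return second_hunk_applied ? model_result_new(&bio)
                               : model_result_old(&bio);
}
```

In this model, a completion with -EOPNOTSUPP still surfaces as -EIO (status -5)
as long as only the completion handler is changed; it surfaces as -EOPNOTSUPP
only once the result check looks at the new flag too.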





Anders

> diff --git a/block/blk-barrier.c b/block/blk-barrier.c
> index 6901eed..55c5f1f 100644
> --- a/block/blk-barrier.c
> +++ b/block/blk-barrier.c
> @@ -259,8 +259,11 @@ int blk_do_ordered(struct request_queue *q, struct request **rqp)
>  
>  static void bio_end_empty_barrier(struct bio *bio, int err)
>  {
> - if (err)
> + if (err) {
> + if (err == -EOPNOTSUPP)
> + set_bit(BIO_EOPNOTSUPP, &bio->bi_flags);
>   clear_bit(BIO_UPTODATE, &bio->bi_flags);
> + }
>  
>   complete(bio->bi_private);
>  }
> @@ -309,7 +312,9 @@ int blkdev_issue_flush(struct block_device *bdev, sector_t *error_sector)
>   *error_sector = bio->bi_sector;
>  
>   ret = 0;
> - if (!bio_flagged(bio, BIO_UPTODATE))
> + if (bio_flagged(bio, BIO_EOPNOTSUPP))
> + ret = -EOPNOTSUPP;
> + else if (!bio_flagged(bio, BIO_UPTODATE))
>   ret = -EIO;
>  
>   bio_put(bio);
> 
> -- 
> Jens Axboe
> 
-- 
1&1 Internet AG  "Use the --force, Luke"
Brauerstrasse 48 v://49.721.91374.50
D-76135 Karlsruhe f://49.721.91374.225

Amtsgericht Montabaur HRB 6484
Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger,
Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Markus Huhn, Achim Weiss
Aufsichtsratsvorsitzender: Michael Scheeren


Re: broken dpt_i2o in 2.6.23 (was: ext2 check page: bad entry in directory) (fwd)

2007-12-12 Thread Anders Henke
Hi,

I'd like to let you know that my boxes are running a 32-bit kernel, so
the 64-bit-uncleanliness shouldn't apply to my boxes; however,

http://www.miquels.cistron.nl/linux/dpt_i2o-64bit-2.6.23.patch

fixed the issue on my testbox.

I took a clean 2.6.23, applied the patch, recompiled the kernel, rebooted: works.



Regards,

Anders

PS: Sorry for breaking the threading, I'm not a regular subscriber to
linux-kernel and haven't received Miquel's message by mail.
-- 
1&1 Internet AG  System Design
Brauerstrasse 48 v://49.721.91374.50
D-76135 Karlsruhe f://49.721.91374.225

Amtsgericht Montabaur HRB 6484
Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger,
Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Norbert Lang, Achim Weiss
Aufsichtsratsvorsitzender: Michael Scheeren


Re: broken dpt_i2o in 2.6.23 (was: ext2 check page: bad entry in directory) (fwd)

2007-12-12 Thread Anders Henke
On 12.12.2007, Miquel van Smoorenburg wrote:
> On Wed, 2007-12-12 at 03:38 -0800, Andrew Morton wrote:
> > On Wed, 12 Dec 2007 11:58:41 +0100 Anders Henke <[EMAIL PROTECTED]> wrote:
> > 
> > > Hi,
> > > 
> > > I'd like to let you now that my boxes are running a 32-bit kernel, so
> > > the 64-bit-uncleanliness shouldn't apply to my boxes; however,
> > > 
> > > http://www.miquels.cistron.nl/linux/dpt_i2o-64bit-2.6.23.patch
> > > 
> > > fixed the issue on my testbox.
> > > 
> > > I took a clean 2.6.23, applied patch, recompiled the kernel, reboot: 
> > > works.
> > 
> > What a huge patch :(
> > 
> > We already reverted the offening patch so I assume that 2.6.24-rc5 is
> > working for you?
> > 
> > I guess we need to look at restoring "dpt_i2o: convert to SCSI hotplug
> > model" and then absorbing what Miquel has done there.
> 
> This was just a patch I had lying around, if it worked it would confirm
> my suspicion, which it has.
> 
> The minimal patch which is suitable for 2.6.23-stable and 2.6.24 would
> be the attached one-liner. The "dpt_i2o: convert to SCSI hotplug model"
> patch could be restored then.
> 
> (if the list eats the attachment, it's also available here:
> http://www.miquels.cistron.nl/linux/linux-2.6.23+24-dpt_i2o-dma64.patch 
> )
> 
> Anders, does this one-liner patch work for you ?

Got it - and it works!

I took a clean 2.6.23, applied the patch, recompiled the kernel and
rebooted my testbox: came up with the fresh-compiled kernel 
(verified by "uname -a").


Regards,

Anders
-- 
1&1 Internet AG  System Design
Brauerstrasse 48 v://49.721.91374.50
D-76135 Karlsruhe f://49.721.91374.225

Amtsgericht Montabaur HRB 6484
Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger,
Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Norbert Lang, Achim Weiss
Aufsichtsratsvorsitzender: Michael Scheeren


Re: broken dpt_i2o in 2.6.23 (was: ext2 check page: bad entry in directory) (fwd)

2007-12-12 Thread Anders Henke
On 12.12.2007, Andrew Morton wrote:
> On Wed, 12 Dec 2007 11:58:41 +0100 Anders Henke <[EMAIL PROTECTED]> wrote:
> 
> > Hi,
> > 
> > I'd like to let you now that my boxes are running a 32-bit kernel, so
> > the 64-bit-uncleanliness shouldn't apply to my boxes; however,
> > 
> > http://www.miquels.cistron.nl/linux/dpt_i2o-64bit-2.6.23.patch
> > 
> > fixed the issue on my testbox.
> > 
> > I took a clean 2.6.23, applied patch, recompiled the kernel, reboot: works.
> 
> What a huge patch :(
> 
> We already reverted the offening patch so I assume that 2.6.24-rc5 is
> working for you?

Yes, the vanilla 2.6.24-rc5 works fine (at least it's booting :-).

Linux rdb140 2.6.24-rc5 #1 SMP Wed Dec 12 15:06:05 CET 2007 i686 GNU/Linux

> I guess we need to look at restoring "dpt_i2o: convert to SCSI hotplug
> model" and then absorbing what Miquel has done there.


I've tried 2.6.23 with

http://www.miquels.cistron.nl/linux/linux-2.6.23+24-dpt_i2o-dma64.patch

... and that's enough to make my boxes boot again.


Regards,

Anders
-- 
1&1 Internet AG    Enter any 11-digit prime number to continue.
Brauerstrasse 48   v://49.721.91374.50
D-76135 Karlsruhe  f://49.721.91374.225

Amtsgericht Montabaur HRB 6484
Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger,
Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Norbert Lang, Achim Weiss
Aufsichtsratsvorsitzender: Michael Scheeren


broken dpt_i2o (was: ext2_check_page: bad entry in directory)

2007-11-28 Thread Anders Henke
Hi,

I've been bitten by the problem described in the lkml message with roughly the
same subject, dated Oct 24, 2007.
My boxes were running 2.6.19 and have been upgraded to 2.6.23.1, but their
bootup failed when trying to mount the root (ext2) filesystem:

---cut
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
00:08: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:09: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
Loading Adaptec I2O RAID: Version 2.4 Build 5go
Detecting Adaptec I2O RAID controllers...
ACPI: PCI Interrupt :04:08.0[A] -> GSI 48 (level, low) -> IRQ 16
Adaptec I2O RAID controller 0 irq=16
 BAR0 f888 - size= 10
 BAR1 f8a0 - size= 100
dpti: If you have a lot of devices this could take a few minutes.
dpti0: Reading the hardware resource table.
TID 008  Vendor: ADAPTEC  Device: AIC-7902 Rev: 0001
TID 009  Vendor: ADAPTEC  Device: AIC-7902 Rev: 0001
TID 515  Vendor: ESG-SHV SDevice: SCA HSBP M21 Rev: 0.080   
TID 518  Vendor: ADAPTEC RDevice: RAID-1   Rev: 3B0AD   
scsi0 : Vendor: Adaptec  Model: 2010SFW:3B0A
scsi 0:1:0:0: Direct-Access ADAPTEC  RAID-1   3B0A PQ: 0 ANSI: 2
scsi 0:1:6:0: Processor ESG-SHV  SCA HSBP M21 0.08 PQ: 0 ANSI: 2
Adaptec aacraid driver 1.1-5[2449]-ms
GDT-HA: Storage RAID Controller Driver. Version: 3.05
GDT-HA: Found 0 PCI Storage RAID Controllers
3ware Storage Controller device driver for Linux v1.26.02.002.
3ware 9000 Storage Controller device driver for Linux v2.26.02.010.
sd 0:1:0:0: [sda] 143374336 512-byte hardware sectors (73408 MB)
sd 0:1:0:0: [sda] Write Protect is off
sd 0:1:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA
sd 0:1:0:0: [sda] 143374336 512-byte hardware sectors (73408 MB)
sd 0:1:0:0: [sda] Write Protect is off
sd 0:1:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 >
sd 0:1:0:0: [sda] Attached SCSI disk
PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1
PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
serio: i8042 KBD port at 0x60,0x64 irq 1
mice: PS/2 mouse device common for all mice
md: raid1 personality registered for level 1
EDAC MC: Ver: 2.1.0 Oct 23 2007
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Starting balanced_irq
Using IPI Shortcut mode
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 264k freed
EXT2-fs error (device sda1): ext2_check_page: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Warning: unable to open an initial console.
Kernel panic - not syncing: No init found.  Try passing init= option to kernel.
Rebooting in 30 seconds..
---cut

Rebooting the box into 2.6.19 works without any problems.

I've checked the changelogs for 2.6.24-rc*, but haven't come across a
solution for this issue; maybe I've simply overlooked it.

This bug has been reported earlier: http://lkml.org/lkml/2007/10/24/224

I've contacted Jan Kara off-list; as booting into 2.6.19 works and e2fsck
on an e2image file doesn't show any errors, we assumed that the ext2
filesystem itself is fine.

As "everything is reported as being zero" is quite odd and Jan took a
guess that it might be block-layer or driver-related, I've assumed
that the driver is responsible for this; just out of curiosity,
I've manually replaced the dpt_i2o driver with the 2.6.19 one by copying
drivers/scsi/dpt_i2o.c, drivers/scsi/dpti.h and drivers/scsi/dpt/ into a
vanilla 2.6.23.1 kernel; using this kernel fixed the issue for me.

I haven't yet narrowed down from which kernel release onwards the dpt_i2o
driver behaves like this and spews out zeroed blocks when trying to mount
the rootfs. Maybe this is just some timing issue.


For some strange reason, this doesn't affect all boxes running the
dpt_i2o driver.

Affected (verified on 6 out of 6 tested boxes so far):

Intel SE7501WV2S using an Adaptec 2010S with the following "lspci -vn"-section:

:04:08.0 0104: 1044:a511 (rev 01)
Subsystem: 1044:c035
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 16
BIST result: 00
Memory at fe90 (32-bit, non-prefetchable) [size=1M]
Memory at fb00 (32-bit, prefetchable) [size=16M]
Memory at f800 (32-bit, prefetchable) [size=32M]
Expansion ROM at f620 [disabled] [size=32K]
Capabilities: [44] Power Management version 2

Not affected are e.g. a box with a Supermicro X5DPR using an Adaptec 2015S
and the following "lspci -vn"-section:

:03:03.0 0104: 1044:a511 (rev 01)
Subsystem: 1044:c034
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 16
BIST result: 00
Memory at f830 (32-bit, non-prefetchable) [size=1M]
Memory at fb00 (32-bit,

Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory)

2007-11-29 Thread Anders Henke
On November 28 2007, Anders Henke wrote:
> As "everything is reported as being zero" is quite odd an Jan took a
> guess that it might be block-layer or driver-related, I've assumed
> that the driver is responsible for this; just out of the curiousity, 
> I've manually replaced the dpt_i2o driver by the 2.6.19 one by copying 
> driver/scsi/dpt_i2o.c driver/scsi/dpti.h and driver/scsi/dpt/ into a 
> vanilla 2.6.23.1. kernel; using this kernel fixed the issue for me.
> 
> I haven't yet fine-tested from which kernel release on the dpt_i2o driver 
> behaves like this and spews out zeroed blocks when trying to mount
> the rootfs. Maybe this is just some timing issue.

I've started narrowing this down and can say so far that dpt_i2o from
2.6.22 is still fine. The test is simple:

[EMAIL PROTECTED]:/usr/src/linux-2.6.22/drivers/scsi/dpt$ cp -r dpt/ dpt_i2o.c dpti.h /usr/src/linux-2.6.23.1/drivers/scsi/

... recompile the kernel, reboot: works.

2.6.22 and 2.6.23 differ in terms of the dpt_i2o driver by three
patch sets:
-one small 2 KB set of patches from 2.6.22 to 2.6.23-rc1
-one 7 KB set of patches from 2.6.23-rc2 to 2.6.23-rc3
-one 162 KB set of patches from 2.6.23-rc9 to 2.6.23-rc10.

When applying the 2.6.23-rc1-based driver to "my" 2.6.23.1 kernel,
the "zero blocks" symptom shows up, so it's the "lucky" situation
that the smallest patch actually seems to be the broken one.

According to the 2.6.23-rc1 short-form changelog, there is
one major edit on the dpt_i2o driver:

FUJITA Tomonori 

  [SCSI] dpt_i2o: convert to use the data buffer accessors

Stephen Rothwell 
  dpt_i2o depends on virt_to_bus

Fujita, would you please take a look at this?

I think that something's broken in there, leading to the dpt_i2o 
sending out blocks of zeroes right after initialization, at least on
some specific controllers (in this case, Adaptec 2010S on Intel
SE7501WV2S-based boxes).

I don't have in-depth knowledge of kernel driver development, so I
can't help much further right now. Nevertheless, here is the diff
from 2.6.22 to 2.6.23-rc1 for dpt_i2o:

---cut
diff -Nur linux-2.6.22/drivers/scsi/dpt_i2o.c linux-2.6.23-rc1/drivers/scsi/dpt_i2o.c
--- linux-2.6.22/drivers/scsi/dpt_i2o.c 2007-07-09 01:32:17.0 +0200
+++ linux-2.6.23-rc1/drivers/scsi/dpt_i2o.c 2007-07-22 22:41:00.0 +0200
@@ -2078,12 +2078,13 @@
u32 *lenptr;
int direction;
int scsidir;
+   int nseg;
u32 len;
u32 reqlen;
s32 rcode;
 
memset(msg, 0 , sizeof(msg));
-   len = cmd->request_bufflen;
+   len = scsi_bufflen(cmd);
direction = 0x; 

scsidir = 0x;   // DATA NO XFER
@@ -2140,21 +2141,21 @@
lenptr=mptr++;  /* Remember me - fill in when we know */
reqlen = 14;// SINGLE SGE
/* Now fill in the SGList and command */
-   if(cmd->use_sg) {
-   struct scatterlist *sg = (struct scatterlist *)cmd->request_buffer;
-   int sg_count = pci_map_sg(pHba->pDev, sg, cmd->use_sg,
-   cmd->sc_data_direction);
 
+   nseg = scsi_dma_map(cmd);
+   BUG_ON(nseg < 0);
+   if (nseg) {
+   struct scatterlist *sg;
 
len = 0;
-   for(i = 0 ; i < sg_count; i++) {
+   scsi_for_each_sg(cmd, sg, nseg, i) {
*mptr++ = direction|0x1000|sg_dma_len(sg);
len+=sg_dma_len(sg);
*mptr++ = sg_dma_address(sg);
-   sg++;
+   /* Make this an end of list */
+   if (i == nseg - 1)
+   mptr[-2] = direction|0xD000|sg_dma_len(sg);
}
-   /* Make this an end of list */
-   mptr[-2] = direction|0xD000|sg_dma_len(sg-1);
reqlen = mptr - msg;
*lenptr = len;

@@ -2163,16 +2164,8 @@
len, cmd->underflow);
}
} else {
-   *lenptr = len = cmd->request_bufflen;
-   if(len == 0) {
-   reqlen = 12;
-   } else {
-   *mptr++ = 0xD000|direction|cmd->request_bufflen;
-   *mptr++ = pci_map_single(pHba->pDev,
-   cmd->request_buffer,
-   cmd->request_bufflen,
-   cmd->sc_data_direction);
-   }
+   *lenptr = len = 0;
+   reqlen = 12;
}

/* Stick the headers on */
@@ -2232,7 +2225,7 @@
hba_status = detailed_status >> 8;
 
// calculate resid for sg 
-   cmd->resid = cmd->request_bu
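
The scatter-gather hunk above moves the "end of list" marking into the loop, so
that the last element is tagged when i == nseg - 1 instead of patching
mptr[-2] afterwards. A minimal user-space sketch of that pattern follows. The
struct seg type, build_sg_list and both SGE_FLAG_* values are hypothetical
placeholders (the real constants are cut short in the diff as archived), so
this only illustrates the loop shape, not the driver's actual message format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder flag values, chosen for illustration only. */
#define SGE_FLAG_ELEMENT 0x10000000u  /* ordinary list element */
#define SGE_FLAG_LAST    0xD0000000u  /* final element: end of list */

/* Hypothetical stand-in for one DMA-mapped scatterlist segment. */
struct seg {
    uint32_t dma_addr;
    uint32_t dma_len;
};

/*
 * Write one (flags | length, address) pair per segment into msg and mark
 * the final pair as the end of the list. Returns the number of 32-bit
 * words written and stores the summed transfer length in *total_len.
 */
static size_t build_sg_list(uint32_t *msg, const struct seg *segs,
                            size_t nseg, uint32_t direction,
                            uint32_t *total_len)
{
    uint32_t *mptr = msg;
    uint32_t len = 0;

    for (size_t i = 0; i < nseg; i++) {
        uint32_t flags = (i == nseg - 1) ? SGE_FLAG_LAST : SGE_FLAG_ELEMENT;

        /* The length must not spill into the flag bits. */
        assert(segs[i].dma_len < SGE_FLAG_ELEMENT);

        *mptr++ = direction | flags | segs[i].dma_len;
        *mptr++ = segs[i].dma_addr;
        len += segs[i].dma_len;
    }

    *total_len = len;
    return (size_t)(mptr - msg);
}
```

Marking the last element inside the loop avoids touching a segment pointer
after iteration has finished, which matters once the list is walked through an
iterator rather than by incrementing a plain array pointer.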

Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory) (fwd)

2007-11-29 Thread Anders Henke
On Nov 29 2007, FUJITA Tomonori wrote:
> On Thu, 29 Nov 2007 14:03:19 +0100
> Jan Kara <[EMAIL PROTECTED]> wrote:
> 
> >   Adding relevant people and lists to CC...
> > 
> > Honza
> > 
> > - Forwarded message from Anders Henke <[EMAIL PROTECTED]> -
> > 
> > Date:   Thu, 29 Nov 2007 13:31:50 +0100
> > From: Anders Henke <[EMAIL PROTECTED]>
> > To: linux-kernel@vger.kernel.org
> > Subject: Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in 
> > directory)
> > User-Agent: Mutt/1.5.13 (2006-08-11)
> > 
> > On November 28 2007, Anders Henke wrote:
> > > As "everything is reported as being zero" is quite odd an Jan took a
> > > guess that it might be block-layer or driver-related, I've assumed
> > > that the driver is responsible for this; just out of the curiousity, 
> > > I've manually replaced the dpt_i2o driver by the 2.6.19 one by copying 
> > > driver/scsi/dpt_i2o.c driver/scsi/dpti.h and driver/scsi/dpt/ into a 
> > > vanilla 2.6.23.1. kernel; using this kernel fixed the issue for me.
> > > 
> > > I haven't yet fine-tested from which kernel release on the dpt_i2o driver 
> > > behaves like this and spews out zeroed blocks when trying to mount
> > > the rootfs. Maybe this is just some timing issue.
> > 
> > I've started the fine-tests and can say so far that dpt_i2o from 
> > 2.6.22 is still fine. Test is simple:
> > 
> > [EMAIL PROTECTED]:/usr/src/linux-2.6.22/drivers/scsi/dpt$ cp -r dpt/ 
> > dpt_i2o.c dpti.h /usr/src/linux-2.6.23.1/drivers/scsi/
> > 
> > ... recompile the kernel, reboot: works.
> > 
> > 2.6.22 and 2.6.23 differ in terms of the dpt_i2o driver by two different
> > patch sets:
> > -one 2 Kb small set of patches from 2.6.22 to 2.6.22-rc1
> > -one 7 Kb set of patches from 2.6.23-rc2 to 2.6.23-rc3
> > -one 162 Kb set of patches from 2.6.23-rc9 to 2.6.23-rc10.
> > 
> > When applying the 2.6.23-rc1-based driver to "my" 2.6.31.1 kernel,
> > the "zero blocks"-symptom show up, so it's the "lucky" situation
> > that the smallest patch actually seams to be the broken one.
> > 
> > According to the 2.6.23-rc1 short-form changelog, there is
> > one major edit on the dpt_i2o driver:
> > 
> > FUJITA Tomonori 
> > 
> >   [SCSI] dpt_i2o: convert to use the data buffer accessors
> > 
> > Stephen Rothwell 
> >   dpt_i2o depends on virt_to_bus
> > 
> > Fujita, would you please take a look at this?
> 
> Sorry about the bug. Can you try this?
> 
> 
> diff --git a/drivers/scsi/dpt_i2o.c b/drivers/scsi/dpt_i2o.c
> index 8258506..1255b26 100644
> --- a/drivers/scsi/dpt_i2o.c
> +++ b/drivers/scsi/dpt_i2o.c
> @@ -3295,7 +3295,7 @@ static struct scsi_host_template adpt_template = {
>   .this_id= 7,
>   .cmd_per_lun= 1,
>   .use_clustering = ENABLE_CLUSTERING,
> - .use_sg_chaining= ENABLE_SG_CHAINING,
> + .use_sg_chaining= DISABLE_SG_CHAINING,
>  };
>  
>  static s32 adpt_scsi_register(adpt_hba* pHba)

The structure to patch looks different here and doesn't include a
"use_sg_chaining" field:

---cut
static struct scsi_host_template adpt_template = {
        .name                     = "dpt_i2o",
        .proc_name                = "dpt_i2o",
        .proc_info                = adpt_proc_info,
        .info                     = adpt_info,
        .queuecommand             = adpt_queue,
        .eh_abort_handler         = adpt_abort,
        .eh_device_reset_handler  = adpt_device_reset,
        .eh_bus_reset_handler     = adpt_bus_reset,
        .eh_host_reset_handler    = adpt_reset,
        .bios_param               = adpt_bios_param,
        .slave_configure          = adpt_slave_configure,
        .can_queue                = MAX_TO_IOP_MESSAGES,
        .this_id                  = 7,
        .cmd_per_lun              = 1,
        .use_clustering           = ENABLE_CLUSTERING,
};

static s32 adpt_scsi_register(adpt_hba* pHba)
---cut



Anders
-- 
1&1 Internet AG  System Design
Brauerstrasse 48 v://49.721.91374.50
D-76135 Karlsruhe f://49.721.91374.225

Amtsgericht Montabaur HRB 6484
Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger,
Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Norbert Lang, Achim Weiss
Aufsichtsratsvorsitzender: Michael Scheeren


Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory) (fwd)

2007-11-30 Thread Anders Henke
On 29.11.2007, Matthew Wilcox wrote:
> On Thu, Nov 29, 2007 at 05:45:57PM +0100, Anders Henke wrote:
> > On Nov 29 2007, FUJITA Tomonori wrote:
> > > @@ -3295,7 +3295,7 @@ static struct scsi_host_template adpt_template = {
> > >   .this_id= 7,
> > >   .cmd_per_lun= 1,
> > >   .use_clustering = ENABLE_CLUSTERING,
> > > - .use_sg_chaining= ENABLE_SG_CHAINING,
> > > + .use_sg_chaining= DISABLE_SG_CHAINING,
> > >  };
> > >  
> > >  static s32 adpt_scsi_register(adpt_hba* pHba)
> > 
> > The structure to patch does look different and doesn't include an
> > tag "use_sg_chaining":
> > 
> > .this_id= 7,
> > .cmd_per_lun= 1,
> > .use_clustering = ENABLE_CLUSTERING,
> 
> Just add the line
>   .use_sg_chaining= DISABLE_SG_CHAINING,
> 
> > };


Just out of curiosity, I've tried 2.6.24-rc3 and patched the kernel
accordingly (DISABLE_SG_CHAINING): it doesn't boot successfully and shows
the same error as usual:

EXT2-fs error (device sda1): ext2_check_page: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Warning: unable to open an initial console.
Kernel panic - not syncing: No init found.  Try passing init= option to kernel.

As sent in a parallel mail, I've found out that 2.6.23-rc2 works and
2.6.23-rc3 shows the same problems, so the cause has to be looked for
in the dpt_i2o changes made for 2.6.23-rc3.


Anders
-- 
1&1 Internet AG  System Design
Brauerstrasse 48 v://49.721.91374.50
D-76135 Karlsruhe f://49.721.91374.225

Amtsgericht Montabaur HRB 6484
Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger,
Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Norbert Lang, Achim Weiss
Aufsichtsratsvorsitzender: Michael Scheeren


Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory) (fwd)

2007-11-30 Thread Anders Henke
On 30.11.2007, FUJITA Tomonori wrote:
> > > > According to the 2.6.23-rc1 short-form changelog, there is
> > > > one major edit on the dpt_i2o driver:
> > > > 
> > > > FUJITA Tomonori 
> > > > 
> > > >   [SCSI] dpt_i2o: convert to use the data buffer accessors
> > > > 
> > > > Stephen Rothwell 
> > > >   dpt_i2o depends on virt_to_bus
> > > > 
> > > > Fujita, would you please take a look at this?
> > > 
> > > Sorry about the bug. Can you try this?
> > > 
> > > 
> > > diff --git a/drivers/scsi/dpt_i2o.c b/drivers/scsi/dpt_i2o.c
> > > index 8258506..1255b26 100644
> > > --- a/drivers/scsi/dpt_i2o.c
> > > +++ b/drivers/scsi/dpt_i2o.c
> > > @@ -3295,7 +3295,7 @@ static struct scsi_host_template adpt_template = {
> > >   .this_id= 7,
> > >   .cmd_per_lun= 1,
> > >   .use_clustering = ENABLE_CLUSTERING,
> > > - .use_sg_chaining= ENABLE_SG_CHAINING,
> > > + .use_sg_chaining= DISABLE_SG_CHAINING,
> > >  };
> > >  
> > >  static s32 adpt_scsi_register(adpt_hba* pHba)
> > 
> > The structure to patch does look different and doesn't include an
> > tag "use_sg_chaining":
> 
> Sorry, I misread your bug report. If you use 2.6.23, the sg chaining
> is unrelated.
> 
> What architecture do you use?

"Mainstream" 32-bit-x86, the affected boxes are running Intel Xeons (P4)
at 2.66 or 2.8 GHz. 

In the meantime, I've ruled out the static assignment as the
source of the problem. And due to a manually run "make clean" which
didn't clean enough, I've also pointed at the wrong patch - sorry,
Fujita. The patch which definitely breaks my boxes is the dpt_i2o patch
from 2.6.23-rc2 to 2.6.23-rc3 (7 KB in size) from Matthew Wilcox.

commit 55d9fcf57ba5ec427544fca7abc335cf3da78160
Author: Matthew Wilcox <[EMAIL PROTECTED]>
Date:   Mon Jul 30 15:19:18 2007 -0600

[SCSI] dpt_i2o: convert to SCSI hotplug model

 - Delete refereces to HOSTS_C
 - Switch to module_init/module_exit instead of detect/release
 - Don't pass around the host template and rename it to
   adpt_template
 - Switch from scsi_register/scsi_unregister to scsi_host_alloc,
   scsi_add_host, scsi_scan_host and scsi_host_put.

Signed-off-by: Matthew Wilcox <[EMAIL PROTECTED]>
Acked-by: "Salyzyn, Mark" <[EMAIL PROTECTED]>
Signed-off-by: James Bottomley <[EMAIL PROTECTED]>

"Definitely" as in
-Applied the diff onto 2.6.23.1 with a dpt_i2o from 2.6.23-rc1
 to verify that booting fails.
-Recompiled a cleanly unpacked 2.6.23-rc2 to verify that the driver
 from -rc2 still works.
-Recompiled a cleanly unpacked 2.6.23-rc3 to verify that the driver
 from -rc3 breaks booting on my boxes.
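
The narrowing-down done by hand above is a search for the first failing
release in an ordered list. As a rough sketch, assuming a single
good-to-bad transition, it could be written as a bisection. The names
first_bad, boots_fn and demo_boots are hypothetical; demo_boots simply encodes
the outcomes reported in this thread for the list 2.6.22, 2.6.23-rc1,
2.6.23-rc2, 2.6.23-rc3, 2.6.23 (with -rc1 assumed good, since -rc2 works).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical test callback: nonzero if the release at 'index' boots. */
typedef int (*boots_fn)(size_t index);

/*
 * Return the index of the first failing release in an ordered list,
 * assuming index 0 is known good, index count-1 is known bad and there
 * is exactly one transition from good to bad in between.
 */
static size_t first_bad(size_t count, boots_fn boots)
{
    assert(count >= 2);

    size_t good = 0;
    size_t bad = count - 1;

    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (boots(mid))
            good = mid;   /* transition lies after mid */
        else
            bad = mid;    /* transition lies at or before mid */
    }
    return bad;
}

/* Outcomes as reported in the thread: indexes 0..2 boot, 3..4 do not. */
static int demo_boots(size_t index)
{
    return index <= 2;
}
```

For the five releases listed, first_bad(5, demo_boots) lands on index 3, which
corresponds to 2.6.23-rc3.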

So sorry for wrongly pointing the bug at you; it's an issue for Matthew.


Anders
-- 
1&1 Internet AG  System Design
Brauerstrasse 48 v://49.721.91374.50
D-76135 Karlsruhe f://49.721.91374.225

Amtsgericht Montabaur HRB 6484
Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger,
Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Norbert Lang, Achim Weiss
Aufsichtsratsvorsitzender: Michael Scheeren


Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory)

2007-12-05 Thread Anders Henke
On Tue, 4 Dec 2007 Andrew Morton wrote:
> On Wed, 05 Dec 2007 10:30:54 +0900 FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> 
> > On Tue, 4 Dec 2007 17:11:55 -0800
> > Andrew Morton <[EMAIL PROTECTED]> wrote:
> > 
> > > On Wed, 05 Dec 2007 10:04:03 +0900
> > > FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> > > 
> > > > On Tue, 4 Dec 2007 16:57:38 -0800
> > > > Andrew Morton <[EMAIL PROTECTED]> wrote:
> > > > 
> > > > > On Thu, 29 Nov 2007 13:31:50 +0100
> > > > > Anders Henke <[EMAIL PROTECTED]> wrote:
> > > > > 
> > > > > > On November 28 2007, Anders Henke wrote:
> > > > > > > As "everything is reported as being zero" is quite odd an Jan 
> > > > > > > took a
> > > > > > > guess that it might be block-layer or driver-related, I've assumed
> > > > > > > that the driver is responsible for this; just out of the 
> > > > > > > curiousity, 
> > > > > > > I've manually replaced the dpt_i2o driver by the 2.6.19 one by 
> > > > > > > copying 
> > > > > > > driver/scsi/dpt_i2o.c driver/scsi/dpti.h and driver/scsi/dpt/ 
> > > > > > > into a 
> > > > > > > vanilla 2.6.23.1. kernel; using this kernel fixed the issue for 
> > > > > > > me.
> > > > > > > 
> > > > > > > I haven't yet fine-tested from which kernel release on the 
> > > > > > > dpt_i2o driver 
> > > > > > > behaves like this and spews out zeroed blocks when trying to mount
> > > > > > > the rootfs. Maybe this is just some timing issue.
> > > > > > 
> > > > > > I've started the fine-tests and can say so far that dpt_i2o from 
> > > > > > 2.6.22 is still fine. Test is simple:
> > > > > > 
> > > > > > [EMAIL PROTECTED]:/usr/src/linux-2.6.22/drivers/scsi/dpt$ cp -r 
> > > > > > dpt/ dpt_i2o.c dpti.h /usr/src/linux-2.6.23.1/drivers/scsi/
> > > > > > 
> > > > > > ... recompile the kernel, reboot: works.
> > > > > > 
> > > > > > 2.6.22 and 2.6.23 differ in terms of the dpt_i2o driver by two 
> > > > > > different
> > > > > > patch sets:
> > > > > > -one 2 Kb small set of patches from 2.6.22 to 2.6.22-rc1
> > > > > > -one 7 Kb set of patches from 2.6.23-rc2 to 2.6.23-rc3
> > > > > > -one 162 Kb set of patches from 2.6.23-rc9 to 2.6.23-rc10.
> > > > > > 
> > > > > > When applying the 2.6.23-rc1-based driver to "my" 2.6.31.1 kernel,
> > > > > > the "zero blocks"-symptom show up, so it's the "lucky" situation
> > > > > > that the smallest patch actually seams to be the broken one.
> > > > > > 
> > > > > > According to the 2.6.23-rc1 short-form changelog, there is
> > > > > > one major edit on the dpt_i2o driver:
> > > > > > 
> > > > > > FUJITA Tomonori 
> > > > > > 
> > > > > >   [SCSI] dpt_i2o: convert to use the data buffer accessors
> > > > > > 
> > > > > > Stephen Rothwell 
> > > > > >   dpt_i2o depends on virt_to_bus
> > > > > > 
> > > > > > Fujita, would you please take a look at this?
> > > > > 
> > > > > He won't have seen this.  cc's added.
> > > > > 
> > > > > > I think that something's broken in there, leading to the dpt_i2o 
> > > > > > sending out blocks of zeroes right after initialization, at least on
> > > > > > some specific controllers (in this case, Adaptec 2010S on Intel
> > > > > > SE7501WV2S-based boxes).
> > > > > > 
> > > > > > I don't have insight kernel driver development knowledge, so I'm
> > > > > > quite out of help right now. Nevertheless, I'll add the diff
> > > > > > from 2.6.22 to 2.6.23-rc1 in terms of dpt_i2o:
> > > > > > 
> > > > > 
> > > > > Can you please confirm that this revert (against 2.6.24-rc4) fixes 
> > > > > the data
> > > > > corruption problems?
> > > > 
> > > > Anders said