Re: AIC7xxx on 2.6.18

2007-02-02 Thread Sean Bruno
On Fri, 2007-02-02 at 23:12 -0600, Mark Rustad wrote: > On Feb 2, 2007, at 6:42 PM, Wakko Warner wrote: > > > Andrew Morton wrote: > >> Yes, getting the oops traces will help, thanks. And confirmation > >> on a more recent kernel would be good. > > > > Here's what I get. I used netconsole so w

Re: AIC7xxx on 2.6.18

2007-02-02 Thread Mark Rustad
On Feb 2, 2007, at 6:42 PM, Wakko Warner wrote: Andrew Morton wrote: Yes, getting the oops traces will help, thanks. And confirmation on a more recent kernel would be good. Here's what I get. I used netconsole so whatever was logged prior to it starting was lost. The PC is a Supermicro

Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135

2007-02-02 Thread Andrew Vasquez
On Fri, 02 Feb 2007, Randy Dunlap wrote: > On Fri, 2 Feb 2007 16:25:41 -0800 Andrew Morton wrote: > > > On Fri, 2 Feb 2007 12:56:30 -0800 > > Andrew Vasquez <[EMAIL PROTECTED]> wrote: > > > > > > > dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats > > > > > limit=2m passes=1

Re: [PATCH 2.6.19.2] SCSI sd: udev accessing an uninitialized scsi_disk results in a crash

2007-02-02 Thread James Bottomley
On Fri, 2007-02-02 at 17:56 -0800, Greg KH wrote: > > Thanks - I'll queue this up for 2.6.20 also. > > No objection from me, as long as James says this is ok. > > I wonder why we haven't noticed this in the past? Because the race is so small ... I'll queue it in the rc-fixes tree .. I have thre

Re: [PATCH 2.6.19.2] SCSI sd: udev accessing an uninitialized scsi_disk results in a crash

2007-02-02 Thread Greg KH
On Fri, Feb 02, 2007 at 05:19:24PM -0800, Andrew Morton wrote: > On Fri, 2 Feb 2007 17:34:56 +0530 > Nagendra Singh Tomar <[EMAIL PROTECTED]> wrote: > > > Hi, > > sd_probe() calls class_device_add() even before initializing the > > sdkp->device variable. class_device_add() eventually results

Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135

2007-02-02 Thread Randy Dunlap
On Fri, 2 Feb 2007 16:25:41 -0800 Andrew Morton wrote: > On Fri, 2 Feb 2007 12:56:30 -0800 > Andrew Vasquez <[EMAIL PROTECTED]> wrote: > > > > > dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats > > > > limit=2m passes=100 pattern=iot dlimit=2048 > > What is this mysteri

Re: [PATCH 2.6.19.2] SCSI sd: udev accessing an uninitialized scsi_disk results in a crash

2007-02-02 Thread Andrew Morton
On Fri, 2 Feb 2007 17:34:56 +0530 Nagendra Singh Tomar <[EMAIL PROTECTED]> wrote: > Hi, > sd_probe() calls class_device_add() even before initializing the > sdkp->device variable. class_device_add() eventually results in the user mode > udev program to be called. udev program can read the

RE: [PATCH 0/2] : definion, code, and use of new SCSI ML hoststatus DID_COND_REQUEUE

2007-02-02 Thread Qi, Yanling
I agree with Eric. RDAC/MPP will survive with the straight SAM BUSY status. --yanling > -----Original Message----- > From: Moore, Eric > Sent: Friday, February 02, 2007 5:59 PM > To: Edward Goggin; James Bottomley; Qi, Yanling > Cc: linux-scsi@vger.kernel.org > Subject: RE: [PATCH 0/2] : definion,

RE: [PATCH 0/2] : definion, code, and use of new SCSI ML hoststatus DID_COND_REQUEUE

2007-02-02 Thread James Bottomley
On Fri, 2007-02-02 at 17:31 -0700, Qi, Yanling wrote: > [Qi, Yanling] The following code in the scsi_lib.c will be enough for > RDAC/MPP. BTW, why do we do "wait_for = (cmd->allowed + 1) * > cmd->timeout_per_command". With a sd request, the wait_for will be 180 > seconds. (SD_MAX_RETRIES=5 and SD_
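The 180-second figure in the message above follows directly from the quoted expression. A minimal userspace sketch of that arithmetic, assuming a 30-second per-command timeout (the value implied by 180 / (5 + 1); the snippet is cut off before stating it). The helper name `wait_for_seconds` is hypothetical and is not kernel code:

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper, not kernel code. It mirrors the quoted expression
 *   wait_for = (cmd->allowed + 1) * cmd->timeout_per_command
 * with the timeout given in seconds instead of jiffies.
 */
unsigned int wait_for_seconds(unsigned int allowed,
                              unsigned int timeout_per_command_s)
{
    /* Keep the illustration inside the range of unsigned int. */
    assert(allowed < UINT_MAX);
    assert(timeout_per_command_s == 0 ||
           allowed + 1 <= UINT_MAX / timeout_per_command_s);

    /* One initial attempt plus `allowed` retries, each with a full timeout. */
    return (allowed + 1) * timeout_per_command_s;
}
```

With `allowed` set to 5 (SD_MAX_RETRIES in the message) and the assumed 30-second timeout, `wait_for_seconds(5, 30)` evaluates to (5 + 1) * 30 = 180.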

RE: [PATCH 0/2] : definion, code, and use of new SCSI ML hoststatus DID_COND_REQUEUE

2007-02-02 Thread Qi, Yanling
> -----Original Message----- > From: [EMAIL PROTECTED] [mailto:linux-scsi- > [EMAIL PROTECTED] On Behalf Of Edward Goggin > Sent: Friday, February 02, 2007 5:34 PM > To: James Bottomley > Cc: linux-scsi@vger.kernel.org; Moore, Eric > I think I see your argument ... retries for BUSY and all other

Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135

2007-02-02 Thread Andrew Morton
On Fri, 2 Feb 2007 12:56:30 -0800 Andrew Vasquez <[EMAIL PROTECTED]> wrote: > > > dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats > > > limit=2m passes=100 pattern=iot dlimit=2048 What is this mysterious dt command, btw?

RE: [PATCH 0/2] : definion, code, and use of new SCSI ML hoststatus DID_COND_REQUEUE

2007-02-02 Thread Moore, Eric
On Friday, February 02, 2007 4:34 PM, Edward Goggin wrote: > On Fri, 2007-02-02 at 17:18 -0600, James Bottomley wrote: > > On Fri, 2007-02-02 at 18:11 -0500, Edward Goggin wrote: > > > That solution doesn't work for the RDAC/MPP driver as the > BUSY status > > > handler retries indefinitely. We n

Re: [PATCH 0/2] : definion, code, and use of new SCSI ML host status DID_COND_REQUEUE

2007-02-02 Thread Edward Goggin
On Fri, 2007-02-02 at 17:18 -0600, James Bottomley wrote: > On Fri, 2007-02-02 at 18:11 -0500, Edward Goggin wrote: > > That solution doesn't work for the RDAC/MPP driver as the BUSY status > > handler retries indefinitely. We need a solution which works for both a > > bare metal host running RDAC

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Matt Mackall
On Fri, Feb 02, 2007 at 05:58:04PM -0500, Mark Lord wrote: > Matt Mackall wrote: > >.. > >Also worth considering is that spending minutes trying to reread > >damaged sectors is likely to accelerate your death spiral. More data > >may be recoverable if you give up quickly in a first pass, then go >

Re: [PATCH 0/2] : definion, code, and use of new SCSI ML host status DID_COND_REQUEUE

2007-02-02 Thread James Bottomley
On Fri, 2007-02-02 at 18:11 -0500, Edward Goggin wrote: > That solution doesn't work for the RDAC/MPP driver as the BUSY status > handler retries indefinitely. We need a solution which works for both a > bare metal host running RDAC/MPP which for this use case, wants to get > control over the fail

Re: [PATCH 0/2] : definion, code, and use of new SCSI ML host status DID_COND_REQUEUE

2007-02-02 Thread Edward Goggin
On Fri, 2007-02-02 at 16:54 -0600, James Bottomley wrote: > On Fri, 2007-02-02 at 17:04 -0500, Edward Goggin wrote: > > Patch Set Summary: > > > > 1 Define new SCSI ML host status DID_COND_REQUEUE and > > add its handling code to scsi_decide_disposition. > > Scsi_decide_disposition retur

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Mark Lord
Matt Mackall wrote: .. Also worth considering is that spending minutes trying to reread damaged sectors is likely to accelerate your death spiral. More data may be recoverable if you give up quickly in a first pass, then go back and manually retry damaged bits with smaller I/Os. All good input.

Re: [PATCH 0/2] : definion, code, and use of new SCSI ML host status DID_COND_REQUEUE

2007-02-02 Thread James Bottomley
On Fri, 2007-02-02 at 17:04 -0500, Edward Goggin wrote: > Patch Set Summary: > > 1 Define new SCSI ML host status DID_COND_REQUEUE and > add its handling code to scsi_decide_disposition. > Scsi_decide_disposition returns ADD_TO_MLQUEUE IFF > not REQ_FAILFAST. > > 2 Retur

Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135

2007-02-02 Thread Andrew Morton
On Fri, 2 Feb 2007 12:56:30 -0800 Andrew Vasquez <[EMAIL PROTECTED]> wrote: > On Thu, 01 Feb 2007, Andrew Morton wrote: > > > On Mon, 22 Jan 2007 10:35:10 -0800 Andrew Vasquez <[EMAIL PROTECTED]> wrote: > > > Basically what is happening from the FC side is the initiator executes > > > a simple dt

[PATCH 2/2] fusion: return DID_COND_REQUEUE if SCSI status is MPI_SCSI_STATUS_BUSY

2007-02-02 Thread Edward Goggin
From: Ed Goggin <[EMAIL PROTECTED]>

Return DID_COND_REQUEUE instead of DID_BUS_BUSY when the IOC status is SUCCESS and scsi_status is MPI_SCSI_STATUS_BUSY. The command will be retried via ADD_TO_MLQUEUE IFF not REQ_FAILFAST.

Signed-off-by: Ed Goggin <[EMAIL PROTECTED]>

diff --git a/drivers/message/fusio

[PATCH 1/2] scsi: add DID_COND_REQUEUE SCSI ML host status

2007-02-02 Thread Edward Goggin
From: Ed Goggin <[EMAIL PROTECTED]>

Add new SCSI ML host status DID_COND_REQUEUE for ADD_TO_MLQUEUE IFF not REQ_FAILFAST.

Signed-off-by: Ed Goggin <[EMAIL PROTECTED]>

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 2dce06a..d8e884b 100644
--- a/drivers/scsi/scsi_error.c

[PATCH 0/2] : definion, code, and use of new SCSI ML host status DID_COND_REQUEUE

2007-02-02 Thread Edward Goggin
Patch Set Summary:

1 Define new SCSI ML host status DID_COND_REQUEUE and add its handling code to scsi_decide_disposition. scsi_decide_disposition returns ADD_TO_MLQUEUE IFF not REQ_FAILFAST.

2 Return DID_COND_REQUEUE instead of DID_BUS_BUSY host status
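The rule in item 1 of the summary above can be sketched as a small standalone function. This is only an illustration under stated assumptions, not the posted patch: the enum values are arbitrary stand-ins rather than the kernel's definitions, and `HAND_BACK_TO_CALLER` and `OTHER_HANDLING` are hypothetical placeholders, since the summary does not say what is returned in the fail-fast case.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in values; they do not match the kernel headers. */
enum host_status { DID_OK, DID_BUS_BUSY, DID_COND_REQUEUE };

enum disposition {
    ADD_TO_MLQUEUE,       /* requeue the command in the mid layer */
    HAND_BACK_TO_CALLER,  /* hypothetical: do not requeue, let the caller decide */
    OTHER_HANDLING        /* hypothetical: host statuses this sketch ignores */
};

/*
 * Sketch of the summarised rule: DID_COND_REQUEUE leads to ADD_TO_MLQUEUE
 * if and only if the request is not marked fail-fast.
 */
enum disposition decide_disposition_sketch(enum host_status host, bool failfast)
{
    assert(host == DID_OK || host == DID_BUS_BUSY || host == DID_COND_REQUEUE);

    if (host == DID_COND_REQUEUE)
        return failfast ? HAND_BACK_TO_CALLER : ADD_TO_MLQUEUE;

    return OTHER_HANDLING;
}
```

The point of the conditional is that a multipath layer issuing fail-fast requests gets the command back promptly and can choose another path, while ordinary requests are simply requeued.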

Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135

2007-02-02 Thread Andrew Vasquez
On Thu, 01 Feb 2007, Andrew Morton wrote: > On Mon, 22 Jan 2007 10:35:10 -0800 Andrew Vasquez <[EMAIL PROTECTED]> wrote: > > Basically what is happening from the FC side is the initiator executes > > a simple dt test: > > > > dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Douglas Gilbert
Alan wrote: >> The interesting point of this question is about the typical pattern of >> IO errors. On a read, it is safe to assume that you will have issues >> with some bounded number of adjacent sectors. > > Which in theory you can get by asking the drive for the real sector size > from th

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Matt Mackall
On Fri, Feb 02, 2007 at 11:06:19AM -0500, Mark Lord wrote: > Alan wrote: > > > >If this is the right strategy for disk recovery for a given type of > >device then this ought to be an automatic strategy. Most end users will > >not have the knowledge to frob about in sysfs, and if the bad sector hits

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Ric Wheeler
James Bottomley wrote: On Fri, 2007-02-02 at 14:42 +, Alan wrote: The interesting point of this question is about the typical pattern of IO errors. On a read, it is safe to assume that you will have issues with some bounded number of adjacent sectors. Which in theory you can

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Mark Lord
Alan wrote: If this is the right strategy for disk recovery for a given type of device then this ought to be an automatic strategy. Most end users will not have the knowledge to frob about in sysfs, and if the bad sector hits at the wrong moment a sensible automatic recovery strategy is going to

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread James Bottomley
On Fri, 2007-02-02 at 14:42 +, Alan wrote: > > The interesting point of this question is about the typical pattern of > > IO errors. On a read, it is safe to assume that you will have issues > > with some bounded number of adjacent sectors. > > Which in theory you can get by asking the dr

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Alan
> your system requirements are, what the system is trying to do (i.e., > when trying to recover a failing but not dead yet disk, IO errors should > be as quick as possible and we should choose an IO scheduler that does > not combine IO's). If this is the right strategy for disk recovery for a g

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Alan
> The interesting point of this question is about the typical pattern of > IO errors. On a read, it is safe to assume that you will have issues > with some bounded number of adjacent sectors. Which in theory you can get by asking the drive for the real sector size from the ATA7 info. (We ough

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Ric Wheeler
James Bottomley wrote: On Thu, 2007-02-01 at 15:02 -0500, Mark Lord wrote: I believe you made the first change in response to my prodding at the time, when libata was not returning valid sense data (no LBA) for media errors. The SCSI EH handling of that was rather poor at the time, and so

[PATCH 2.6.19.2] SCSI sd: udev accessing an uninitialized scsi_disk results in a crash

2007-02-02 Thread Nagendra Singh Tomar
Hi, sd_probe() calls class_device_add() even before initializing the sdkp->device variable. class_device_add() eventually results in the user-mode udev program being called. The udev program can read the allow_restart attribute of the newly created scsi device. This is resulting in a cra
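The ordering problem described in this report can be modelled in a few lines of userspace C. This is a sketch under assumptions, not the driver code or the posted fix: all type and function names below are hypothetical, the "udev read" is simulated by a direct call at the point where the object becomes visible, and the reader returns -1 where the real code would dereference a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device and the disk object that points to it. */
struct device_sketch { int allow_restart; };
struct disk_sketch   { struct device_sketch *device; };

/* Models the attribute read; -1 marks the case that would crash for real. */
int read_allow_restart(const struct disk_sketch *disk)
{
    assert(disk != NULL);
    if (disk->device == NULL)
        return -1;
    return disk->device->allow_restart;
}

/* Problematic order: the object is visible to readers before ->device is set. */
int probe_publish_first(struct disk_sketch *disk, struct device_sketch *dev)
{
    disk->device = NULL;
    int seen = read_allow_restart(disk); /* a reader arrives in the window */
    disk->device = dev;                  /* initialised too late */
    return seen;
}

/* Safer order: initialise ->device first, then let readers in. */
int probe_init_first(struct disk_sketch *disk, struct device_sketch *dev)
{
    disk->device = dev;
    return read_allow_restart(disk);
}
```

Initialising before publishing is one way to close the window; whether the posted patch does exactly this is not visible in the truncated message.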

Re: [PATCH] RESEND: SCSI, libata: add support for ATA_16 commands to libata ATAPI devices

2007-02-02 Thread Christoph Hellwig
On Thu, Feb 01, 2007 at 03:21:25PM -0500, Douglas Gilbert wrote: > My point is that the linux block layer and scsi mid > level should get out of the business of putting hard > limits in place. Why? Both of them never have been in the business of putting hard limits in place. We currently have a hard