Re: [patch 1/7] libata: check for AN support

2007-04-24 Thread Tejun Heo
Hello,

Kristen Carlson Accardi wrote:
>  static unsigned int ata_print_id = 1;
> @@ -1744,6 +1745,23 @@ int ata_dev_configure(struct ata_device 
>   }
>   dev->cdb_len = (unsigned int) rc;
>  
> + /*
> +  * check to see if this ATAPI device supports
> +  * Asynchronous Notification
> +  */
> + if ((ap->flags & ATA_FLAG_AN) && ata_id_has_AN(id))
> + {
> + /* issue SET feature command to turn this on */
> + rc = ata_dev_set_AN(dev);

Please don't store err_mask into int rc.  Please store it to a separate
err_mask variable and report it when printing error message.

> + if (rc) {
> + ata_dev_printk(dev, KERN_ERR,
> + "unable to set AN\n");
> + rc = -EINVAL;

Wouldn't -EIO be more appropriate?
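
The two review remarks above can be illustrated with a small user-space sketch. It is not libata code: `fake_set_an()`, `configure_an()` and `FAKE_AC_ERR_DEV` are hypothetical stand-ins, assuming a helper that returns an error bit mask (0 on success). The mask is kept in its own variable, reported in the message, and only then translated into a negative errno.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for one of libata's error-mask bits. */
#define FAKE_AC_ERR_DEV (1u << 0)

/* Pretend SET FEATURES helper: returns an error mask, 0 on success. */
static unsigned int fake_set_an(int device_rejects)
{
	return device_rejects ? FAKE_AC_ERR_DEV : 0;
}

/*
 * Returns 0 or a negative errno. The error mask is never stored in the
 * int return code; it is printed and handed back separately.
 */
static int configure_an(int device_rejects, unsigned int *mask_out)
{
	unsigned int err_mask = fake_set_an(device_rejects);

	*mask_out = err_mask;
	if (err_mask) {
		fprintf(stderr, "unable to set AN (err_mask=0x%x)\n", err_mask);
		return -EIO;
	}
	return 0;
}
```

Keeping the two values apart means the log shows why the command failed, while callers still see an ordinary errno.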

> + goto err_out_nosup;
> + }
> + dev->flags |= ATA_DFLAG_AN;
> + }
> +

Not NACKing.  Just notes for future improvements.  We need to be more
careful here.  The ATA/ATAPI world is filled with braindamaged devices and I
bet there are devices which advertise that they can do AN but choke when AN
is enabled.

This should be handled similarly to ACPI failure.  Currently ACPI does
the following.

1. Try once; if it fails, record that ACPI failed and return an error to
trigger a retry.
2. Try again; if it fails again, ignore the error if possible (!FROZEN) and
turn off ACPI.

This fallback mechanism for optional features can probably be
generalized and used for both ACPI and AN.
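
One possible shape for such a generalized fallback, as a user-space sketch only. The type and function names (`opt_feature`, `opt_feature_result`, `feat_action`) are invented for illustration and do not exist in libata; the `frozen` flag stands in for the "!FROZEN" condition mentioned above.

```c
#include <stdbool.h>

/* What the caller should do after trying to enable an optional feature. */
enum feat_action {
	FEAT_OK,	/* feature enabled */
	FEAT_RETRY,	/* first failure: report an error so probing retries */
	FEAT_DISABLE,	/* second failure: turn the feature off and carry on */
	FEAT_FAIL	/* second failure, but the error cannot be ignored */
};

/* Per-feature bookkeeping (one instance for ACPI, one for AN, ...). */
struct opt_feature {
	bool failed_once;
	bool disabled;
};

static enum feat_action opt_feature_result(struct opt_feature *f,
					   bool ok, bool frozen)
{
	if (ok)
		return FEAT_OK;

	if (!f->failed_once) {
		/* Step 1: remember the failure and ask for a retry. */
		f->failed_once = true;
		return FEAT_RETRY;
	}

	/* Step 2: failed again; only ignore it if the port is usable. */
	if (frozen)
		return FEAT_FAIL;

	f->disabled = true;
	return FEAT_DISABLE;
}
```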

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch 1/7] libata: check for AN support

2007-04-24 Thread Alan Cox
> + /*
> +  * check to see if this ATAPI device supports
> +  * Asynchronous Notification
> +  */
> + if ((ap->flags & ATA_FLAG_AN) && ata_id_has_AN(id))
> + {

Bracketing police ^^^

> + /* issue SET feature command to turn this on */
> + rc = ata_dev_set_AN(dev);
> + if (rc) {
> + ata_dev_printk(dev, KERN_ERR,
> + "unable to set AN\n");
> + rc = -EINVAL;
> + goto err_out_nosup;

How fatal is this - do we need to ignore the device at this point or
should we just pretend (possibly correctly) that the device itself does
not support notification?

> @@ -299,6 +305,8 @@ struct ata_taskfile {
>  #define ata_id_queue_depth(id)   (((id)[75] & 0x1f) + 1)
>  #define ata_id_removeable(id)((id)[0] & (1 << 7))
>  #define ata_id_has_dword_io(id)  ((id)[50] & (1 << 0))
> +#define ata_id_has_AN(id)\
> + ((id[76] && (~id[76])) & ((id)[78] & (1 << 5)))

Might be nice to check the ATA version as well to be paranoid, but this all
looks OK as it's been a reserved field since way back when.



Re: [patch 2/7] genhd: expose AN to user space

2007-04-24 Thread Tejun Heo
Kristen Carlson Accardi wrote:
> +static struct disk_attribute disk_attr_capability = {
> + .attr = {.name = "capability_flags", .mode = S_IRUGO },
> + .show   = disk_capability_read
> +};

How about just "capability"?  I think that would be more consistent with
other attributes.

-- 
tejun


Re: [patch 7/7] libata: send event when AN received

2007-04-24 Thread Alan Cox
> + /* check the 'N' bit in word 0 of the FIS */
> + if (f[0] & (1 << 15)) {
> + int port_addr =  ((f[0] & 0x0f00) >> 8);
> + struct ata_device *adev = &ap->device[port_addr];

You can't be sure that the port_addr returned will be in range if a
device is malfunctioning...
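
A minimal sketch of the range check being asked for, written as stand-alone C. The `fake_port` and `fake_device` types and the `FAKE_MAX_DEVICES` value are assumptions for illustration, not the libata definitions; the bit positions are the ones used in the quoted hunk.

```c
#include <stddef.h>

#define FAKE_MAX_DEVICES 2	/* assumed size of the per-port device array */

struct fake_device { int id; };
struct fake_port   { struct fake_device device[FAKE_MAX_DEVICES]; };

/*
 * Map word 0 of a notification FIS to a device, or NULL if the 'N' bit
 * is clear or the 4-bit port address points outside the device array.
 */
static struct fake_device *an_fis_to_device(struct fake_port *ap,
					    unsigned int fis_word0)
{
	unsigned int port_addr;

	if (!(fis_word0 & (1u << 15)))
		return NULL;

	port_addr = (fis_word0 & 0x0f00) >> 8;	/* can be 0..15 */
	if (port_addr >= FAKE_MAX_DEVICES)
		return NULL;			/* bogus value from the device */

	return &ap->device[port_addr];
}
```

The mask alone allows values up to 15, so without the comparison a misbehaving device could index past the end of the array.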



Re: [patch 5/7] genhd: send async notification on media change

2007-04-24 Thread Tejun Heo
Kristen Carlson Accardi wrote:
> Send an uevent to user space to indicate that a media change event has 
> occurred.
> 
> Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>
> 
> Index: 2.6-git/block/genhd.c
> ===
> --- 2.6-git.orig/block/genhd.c
> +++ 2.6-git/block/genhd.c
> @@ -643,6 +643,25 @@ struct seq_operations diskstats_op = {
>   .show   = diskstats_show
>  };
>  
> +static void media_change_notify_thread(struct work_struct *work)
> +{
> + struct gendisk *gd = container_of(work, struct gendisk, async_notify);
> + char event[] = "MEDIA_CHANGE=1";
> + char *envp[] = { event, NULL };
> +
> + /*
> +  * set enviroment vars to indicate which event this is for
> +  * so that user space will know to go check the media status.
> +  */
> + kobject_uevent_env(&gd->kobj, KOBJ_CHANGE, envp);
> +}
> +
> +void genhd_media_change_notify(struct gendisk *disk)
> +{
> + schedule_work(&disk->async_notify);
> +}
> +EXPORT_SYMBOL_GPL(genhd_media_change_notify);

The gendisk might go away while the async_notify work is in flight.  You'll
need to either grab a reference or wait for the work to finish in the
release routine.
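
The "grab a reference" option can be pictured with a toy reference count in plain C. Everything here (`fake_disk`, `fake_get`, `fake_put`, `notify_schedule`, `notify_work`) is a hypothetical model, not the block layer API: the producer pins the object before queueing the work, and the work handler drops the pin when it is done.

```c
/* Toy object with a reference count; 'released' marks the final put. */
struct fake_disk {
	int refcount;
	int events_sent;
	int released;
};

static void fake_get(struct fake_disk *d)
{
	d->refcount++;
}

static void fake_put(struct fake_disk *d)
{
	if (--d->refcount == 0)
		d->released = 1;	/* real code would free the object here */
}

/* Producer side: pin the disk, then hand it to the work queue. */
static void notify_schedule(struct fake_disk *d)
{
	fake_get(d);
	/* the real code would queue the work item at this point */
}

/* Work handler: the disk is guaranteed alive until the matching put. */
static void notify_work(struct fake_disk *d)
{
	d->events_sent++;	/* stands in for sending the uevent */
	fake_put(d);
}
```

In a real driver the pin would also have to be dropped when the work item turns out to be queued already, otherwise the count leaks; the toy model leaves that case out.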

-- 
tejun


Re: [PATCH] aacraid: fails to initialize after a kexec operation

2007-04-24 Thread Vivek Goyal
On Mon, Apr 23, 2007 at 01:20:32PM -0400, Salyzyn, Mark wrote:
> That is a failure to route the interrupts and is possibly an issue with
> the kernel and the hardware, and not the driver directly (since there is
> an expectation that request_irq will connect the interrupt to the
> interrupt service routine). Judith reported success in the past with
> this patch on her hardware, perhaps the motherboard on your system has
> some odd BIOS setup of the hardware that is giving acpi or the apic some
> headaches? Can you check out success or failure on other motherboards?
> Please try the suggestions from the driver (safe flags)?
> 
> Sincerely -- Mark Salyzyn
> 

Hi Mark,

We don't even go through BIOS in kexec and kdump. So BIOS should not be an
issue.

It looks like you send a message to the controller and then wait for an
interrupt from the controller as an indication that the command has
completed. In this case you never seem to get an interrupt, hence the timeout.

To bypass this problem, I am now booting my second kernel with the "irqpoll"
command line option. This makes sure that the aacraid interrupt handler
gets invoked even if there is an interrupt routing issue.

This option does help things progress, but it ends up corrupting
something or other on the disk. In three attempts I got three types of
errors.

In the first attempt I got a continuous stream of the following messages
once the root file system had been mounted.

=
sda1: rw=0, want=9261304112, limit=41945652
attempt to access beyond end of device
sda1: rw=0, want=9261304112, limit=41945652
attempt to access beyond end of device
sda1: rw=0, want=9261304112, limit=41945652
attempt to access beyond end of device
sda1: rw=0, want=9261304112, limit=41945652
attempt to access beyond end of device
sda1: rw=0, want=9261304112, limit=41945652
attempt to access beyond end of device


In the second attempt, it mounted the file system but found some issue
with the "resize" inode and asked me to run fsck manually, which in turn
deleted a whole lot of inodes.

In the third attempt it panicked later when it found ext3 to be corrupted.

=
Creating block device nodes.
Trying to resume from LABEL=SWAP-sda3
No suspend signature on swap, not resuming.
Creating root device.
Mounting root filesystem.
EXT3-fs: Magic mismatch, very weird !
mount: error mouKernel panic - not syncing: Attempted to kill init!
nting /dev/root
=== 

Following are the relevant aacraid initialization messages on the serial console.

===
Adaptec aacraid driver (1.1-5[2437]-mh4)
ACPI: PCI Interrupt :01:02.0[A] -> GSI 25 (level, low) -> IRQ 25
AAC0: kernel 5.2-0[11835] Jan  9 2007
AAC0: monitor 5.2-0[11835]
AAC0: bios 5.2-0[11835]
AAC0: serial 1625d1
AAC0: 64bit support enabled.
AAC0: 64 Bit DAC enabled
scsi0 : ServeRAID
scsi 0:0:0:0: Direct-Access IBM  x366 V1.0 PQ: 0 ANSI: 2
scsi 0:1:0:0: Direct-Access IBM-ESXS ST973401SS   B519 PQ: 0 ANSI: 5
scsi 0:1:1:0: Direct-Access IBM-ESXS ST973401SS   B519 PQ: 0 ANSI: 5
scsi 0:1:2:0: Direct-Access IBM-ESXS ST973401SS   B519 PQ: 0 ANSI: 5
scsi 0:3:0:0: Enclosure IBM  SAS SES-2 DEVICE 0.09 PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 429459456 512-byte hardware sectors (219883 MB)
sd 0:0:0:0: [sda] Assuming Write Enabled
sd 0:0:0:0: [sda] Assuming drive cache: write through
sd 0:0:0:0: [sda] 429459456 512-byte hardware sectors (219883 MB)
sd 0:0:0:0: [sda] Assuming Write Enabled
sd 0:0:0:0: [sda] Assuming drive cache: write through
 sda: sda1 sda2 sda3 sda4 < sda5 >
sd 0:0:0:0: [sda] Attached SCSI removable disk
sd 0:0:0:0: Attached scsi generic sg0 type 0
scsi 0:1:0:0: Attached scsi generic sg1 type 0
scsi 0:1:1:0: Attached scsi generic sg2 type 0
scsi 0:1:2:0: Attached scsi generic sg3 type 0
scsi 0:3:0:0: Attached scsi generic sg4 type 13


I am not sure why this reset leaves the file system in a corrupted state.
Is there a better way to handle this, like syncing the existing commands
before restarting it?

Should one keep a dedicated partition on the disk and not mount it in the
first kernel, mounting it only in the second kernel to save the dump? I shall
have to test such a configuration.

Thanks
Vivek


Kernel crash with AIC94xx

2007-04-24 Thread Constantin Teodorescu
Hello, I hope I can get a little help from you regarding this kind of
crash!


Hardware:
- server, TYAN Tempest i5000VS S5372 BIOS v1.0.4
- 8 SATA drives Seagate 136 Gb attached to an AIC-9410 controller
- one IDE (boot disk and system)
- 8 Gb RAM

Software:
- OpenSUSE 10.2 x86_64 (also tried SLES 10 but didn't succeed in
compiling the adp94xx driver from Adaptec)


Kernels: I tried each of these: linux-2.6.20.1, linux-2.6.20.4,
linux-2.6.20.7, linux-2.6.21.rc7
The last one has the 1.0.3 version of the aic94xx driver but the results are
the same :-(


Description:
- the server is running a very heavily loaded PostgreSQL database with
tables spread across those SAS drives, with a lot of writes and reads
- at least 4 or 5 times a day I get some warnings in /var/log/messages
(sas: Enter sas_scsi_recover_host , trying to find task XXX --->
aic94xx: came back from clear nexus) but the system keeps working
- more rarely (once per day) I get the following bug in
/var/log/messages and the system crashes: the SAS drivers stop working,
the shutdown command waits forever, and I need to hardware reset the
system



Apr 24 07:22:20 bnd kernel: sas: command 0x8101c9f5e2c0, task 
0x81005bfcb080, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x810047f9dd00, task 
0x81007df80cc0, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x810164d31180, task 
0x8101247ad500, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x81021b8af380, task 
0x81012e550ac0, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x8101698c3940, task 
0x8101a3b69b80, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x81011e865680, task 
0x8101a3b69380, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x81000ce37340, task 
0x8101a3b69580, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x810164d31a40, task 
0x810058a93dc0, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x8100bc25b940, task 
0x81005bfcbc80, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x81000ce37880, task 
0x81015856bd00, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x81022fa2f940, task 
0x8101d2cf87c0, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x8100bc25b080, task 
0x81005bfcb880, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x81000ce37dc0, task 
0x8101d186a940, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x81009c620640, task 
0x81010d46a940, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x8100531ae1c0, task 
0x81012e9bf4c0, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x8100531ae380, task 
0x8101d186a740, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x81011e8654c0, task 
0x8101247ad100, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x81009c620480, task 
0x81012e5502c0, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x81000ce37180, task 
0x8101d2cf89c0, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x81017d5268c0, task 
0x8101d186a540, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x8101c9f5e800, task 
0x81015856b900, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x81014f8db600, task 
0x81007df808c0, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x81011e865bc0, task 
0x81012e550cc0, timed out: EH_NOT_HANDLED
Apr 24 07:22:20 bnd kernel: sas: command 0x81009c620100, task 
0x8101a3b69980, timed out: EH_NOT_HANDLED

Apr 24 07:22:20 bnd kernel: sas: Enter sas_scsi_recover_host
Apr 24 07:22:20 bnd kernel: sas: trying to find task 0x81005bfcb080
Apr 24 07:22:20 bnd kernel: sas: sas_scsi_find_task: aborting task 
0x81005bfcb080

Apr 24 07:22:25 bnd kernel: aic94xx: tmf timed out
Apr 24 07:22:25 bnd kernel: aic94xx: tmf came back
Apr 24 07:22:25 bnd kernel: aic94xx: task not done, clearing nexus
Apr 24 07:22:25 bnd kernel: aic94xx: asd_clear_nexus_index: PRE
Apr 24 07:22:25 bnd kernel: aic94xx: asd_clear_nexus_index: POST
Apr 24 07:22:25 bnd kernel: aic94xx: asd_clear_nexus_index: clear nexus 
posted, waiting...

Apr 24 07:22:30 bnd kernel: aic94xx: asd_clear_nexus_timedout: here
Apr 24 07:22:35 bnd kernel: aic94xx: came back from clear nexus
Apr 24 07:22:35 bnd kernel: aic94xx: task not done, clearing nexus
Apr 24 07:22:35 bnd kernel: aic94xx: asd_clear_nexus_index: PRE
Apr 24 07:22:35 bnd kernel: aic94xx: asd_clear_nexus_index: POST
Apr 24 07:22:35 bnd kernel: aic94xx: asd_clear_nexus_index: clear nexus 
posted, waiting...

Apr 24 07:22:35 bnd kernel: aic94xx: asd_clear_nexus_tasklet_complete: here
Apr 24 07:22:35 bnd kernel: aic94xx: asd_clea

Re: [PATCH] aacraid: fails to initialize after a kexec operation

2007-04-24 Thread Vivek Goyal
On Tue, Apr 24, 2007 at 02:14:44PM +0530, Vivek Goyal wrote:
> 
> In second attempt, it mounted the file system but it found some issue
> with "resize" inode and asked me to run fsck manually. Which in turn 
> deleted whole lot of inodes.
> 
> In third attemt it panics later when it finds ext3 to be corrupted.
> 
> =
> Creating block device nodes.
> Trying to resume from LABEL=SWAP-sda3
> No suspend signature on swap, not resuming.
> Creating root device.
> Mounting root filesystem.
> EXT3-fs: Magic mismatch, very weird !
> mount: error mouKernel panic - not syncing: Attempted to kill init!
> nting /dev/root
> === 
> 

Hi Mark,

Interesting observation. After the above message I rebooted my system
expecting ext3 to be corrupted and that I would have to try to recover it
using fsck. Nothing of that sort happened. The system just booted fine.

This leaves me wondering why ext3 thinks that the magic number is a
mismatch while booting using kexec. Is AACRAID returning the right bytes
from the disk after a reset?

Thanks
Vivek


Re: [patch 1/7] libata: check for AN support

2007-04-24 Thread Olivier Galibert
Sorry for replying to Alan's reply, I missed the original mail.

> > +#define ata_id_has_AN(id)  \
> > +   ((id[76] && (~id[76])) & ((id)[78] & (1 << 5)))

(a && ~a) & (b & 32)

I don't think that does what you think it does, because at that point
it's a funny way to write 0 ((0 or 1) binary-and (0 or 32)).

I'm not even sure what it is you want.  If for the first part you
wanted (id[76] != 0x00 && id[76] != 0xff), please write just that,
thanks :-)

  OG.
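
A small stand-alone check of the point made above. The first macro is the expression as posted (renamed with a `_posted` suffix); the second is one possible rewrite along the suggested lines, with the assumption that the "all ones" value is 0xffff because IDENTIFY words are 16 bits wide. Both macro names are illustrative only.

```c
typedef unsigned short u16;

/* As posted: logical AND yields 0 or 1, which is then bitwise-ANDed
 * with 0 or 32, so the result is always 0. */
#define ata_id_has_AN_posted(id) \
	((id[76] && (~id[76])) & ((id)[78] & (1 << 5)))

/* Sketch of a rewrite: word 76 must be neither all zeros nor all ones,
 * and bit 5 of word 78 must be set. */
#define ata_id_has_AN_fixed(id) \
	((id)[76] != 0x0000 && (id)[76] != 0xffff && ((id)[78] & (1 << 5)))
```

Note that in the posted form `~id[76]` is evaluated after integer promotion, so it is non-zero for every 16-bit value and does not detect 0xffff either.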


Re: [RFC PATCH]: Rewritten ESP driver, porters needed!

2007-04-24 Thread Christoph Hellwig
Overall the driver looks really nice, thanks a lot!

some comments:

> +#define esp_log_intr(f, a...) \
> +do { if (esp_debug & ESP_DEBUG_INTR) \
> + printk(f, ## a); \
> +} while (0)

would be nice to have dev_printk here, but sbus still seems to
lack driver model integration.

> +static void esp_map_dma(struct esp *esp, struct scsi_cmnd *cmd)
>  {
> + struct esp_cmd_priv *spriv = ESP_CMD_PRIV(cmd);
> + int dir = cmd->sc_data_direction;
>  
> + if (dir == DMA_NONE)
> + return;
>  
> + if (cmd->use_sg == 0) {
> + spriv->u.dma_addr = esp->ops->map_single(esp,
> +  cmd->request_buffer,
> +  cmd->request_bufflen,
> +  dir);
> + spriv->mapping_type = MAPPING_TYPE_SINGLE;
> + spriv->tot_residue = spriv->cur_residue = cmd->request_bufflen;
> + spriv->cur_sg = NULL;

The non-use-sg case is dead, you can put in BUG_ON()s here and in
the unmap path.

> +static void esp_build_sync_msg(struct esp *esp, u8 period, u8 offset)
>  {
> + esp->msg_out[0] = EXTENDED_MESSAGE;
> + esp->msg_out[1] = 3;
> + esp->msg_out[2] = EXTENDED_SDTR;
> + esp->msg_out[3] = period;
> + esp->msg_out[4] = offset;
> + esp->msg_out_len = 5;
> +}
>  
> +static void esp_build_wide_msg(struct esp *esp, int wide)
> +{
> + esp->msg_out[0] = EXTENDED_MESSAGE;
> + esp->msg_out[1] = 2;
> + esp->msg_out[2] = EXTENDED_WDTR;
> + esp->msg_out[3] = (wide ? 1 : 0);
> + esp->msg_out_len = 4;
>  }

These might actually be worth putting into the spi transport
class, taking a u8 * as first argument.  After all, all
SPI drivers without smart firmware will need them.
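
For illustration, driver-independent versions of the two quoted builders could look roughly like this. The function names `build_sync_msg` and `build_wide_msg` are made up for this sketch; the message codes are the SCSI-2 extended message values, defined locally so the example compiles on its own. Each function fills a caller-supplied buffer and returns the message length.

```c
typedef unsigned char u8;

/* SCSI-2 extended message codes, defined locally for the sketch. */
#define EXTENDED_MESSAGE 0x01
#define EXTENDED_SDTR    0x01
#define EXTENDED_WDTR    0x03

/* Synchronous data transfer request: 5 bytes in total. */
static int build_sync_msg(u8 *msg, u8 period, u8 offset)
{
	msg[0] = EXTENDED_MESSAGE;
	msg[1] = 3;			/* bytes following the length byte */
	msg[2] = EXTENDED_SDTR;
	msg[3] = period;
	msg[4] = offset;
	return 5;
}

/* Wide data transfer request: 4 bytes in total. */
static int build_wide_msg(u8 *msg, int wide)
{
	msg[0] = EXTENDED_MESSAGE;
	msg[1] = 2;
	msg[2] = EXTENDED_WDTR;
	msg[3] = wide ? 1 : 0;		/* 0 = 8-bit, 1 = 16-bit transfers */
	return 4;
}
```

A driver would then call something like `len = build_sync_msg(esp->msg_out, period, offset)` and store the returned length itself.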

> +/* If we get a non-tagged command, we let all the current
> + * tagged commands finish out before issuing the non-tagged
> + * one.  I found that if I did not do this, devices would
> + * completely hang.  In fact the SCSI-2 standard is pretty
> + * explicit about this in section 7.8, "Queued I/O processes":
> + *
> + *   ... An initiator may not mix the use of tagged and
> + *   untagged queuing for I/O processes to a logical unit,
> + *   except during a contingent allegiance or extended
> + *   contingent allegiance condition when only untagged
> + *   initial connections are allowed.
> + *
> + * The language at the end means that we have to issue REQUEST_SENSE
> + * commands even if tagged ones are pending, because those tagged
> + * commands will be suspended until the REQUEST_SENSE executes.
> + *
> + * When a request comes in here to issue a non-tagged command which is
> + * not a REQUEST_SENSE, we set the per-lun 'hold' state.  The next
> + * time we try to issue for this lun when 'hold' is true but
> + * 'num_tagged' has dropped to zero, we clear 'hold' and issue the
> + * non-tagged command.
> + *
> + * The 'hold' state exists to make sure we do not keep dispatching
> + * more tagged commands, thus starving out the non-tagged one.
> + */

The comment doesn't match the code.  You don't do the REQUEST_SENSE
special casing anymore since implementing autosense :)
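
The 'hold' scheme described in the quoted comment, minus the REQUEST_SENSE special case, can be modelled in a few lines of plain C. This is a reading of the comment, not the driver's code: `lun_state`, `lun_may_issue` and `lun_complete` are hypothetical names, and the `untagged_active` flag is an assumption added to track the one untagged command in flight.

```c
#include <stdbool.h>

/* Per-LUN bookkeeping for mixing tagged and untagged commands. */
struct lun_state {
	int  num_tagged;	/* tagged commands currently outstanding */
	bool hold;		/* an untagged command is waiting its turn */
	bool untagged_active;	/* an untagged command is outstanding */
};

/* Returns true if the command may be issued now, false if it must wait. */
static bool lun_may_issue(struct lun_state *lp, bool tagged)
{
	if (tagged) {
		/* No new tagged work while an untagged command waits or runs. */
		if (lp->hold || lp->untagged_active)
			return false;
		lp->num_tagged++;
		return true;
	}

	if (lp->untagged_active)
		return false;

	if (lp->num_tagged > 0) {
		/* Let outstanding tagged commands drain first. */
		lp->hold = true;
		return false;
	}

	lp->hold = false;
	lp->untagged_active = true;
	return true;
}

/* Called when a previously issued command finishes. */
static void lun_complete(struct lun_state *lp, bool tagged)
{
	if (tagged)
		lp->num_tagged--;
	else
		lp->untagged_active = false;
}
```

Setting 'hold' on the first refused untagged command is what keeps later tagged commands from starving it.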

> +static int esp_alloc_lun_tag(struct esp_cmd_entry *ent,
> +  struct esp_lun_data *lp)
> +{
> + if (!lp) {
> + /* When we don't have lun-data yet, we disallow
> +  * disconnects, so we do not have to see if this
> +  * untagged command matches a disconnected one and
> +  * thus return -EBUSY.
>*/
> + return 0;
> + }

Given that you allocate the lun-data in esp_slave_alloc this
can never happen.  Some more comments on the handling of
per-lun data here:

  - normally you allocate per-lun data in slave_alloc and free it
in slave_destroy.  This is guaranteed to be safe because
slave_alloc is called before the first I/O and slave_destroy
after the last I/O has finished.  No need to check whether
it has already been allocated in esp_slave_alloc.
  - there is no need to keep track of per-lun data on your own.
the midlayer gives you sdev->hostdata for it, and you can
easily get at it in every place you use the lun data currently
 
Doing things properly also avoids the !lp checks in various places.
 
(the lundata management looks inspired by sym53c8xx, but it's probably
 not the best driver to be inspired by :))

> +/* When a contingent allegiance conditon is created, we force feed a
> + * REQUEST_SENSE command to the device to fetch the sense data.  I
> + * tried many other schemes, relying on the scsi error handling layer
> + * to send out the REQUEST_SENSE automatically, but this was difficult
> + * to get right especially in the presence of applications like smartd
> + * which use SG_IO to send out their own REQUEST_SENSE commands.
> + */
> +static void esp_autosense(struct esp *esp, struct esp_cmd_entry *ent)
>  {
> - stat

Re: [RFC PATCH]: Rewritten ESP driver, porters needed!

2007-04-24 Thread Christoph Hellwig
Oh, btw - there is a problem with the generic code being esp.ko -
we already have drivers/char/esp.c which builds into esp.ko for
ISA platforms, which have a bit of overlap with ESP-using platforms.
Maybe the driver should become esp_scsi.c/.ko or ncr_esp or
ncr53x9x?


Re: Kernel crash with AIC94xx

2007-04-24 Thread Brian King
Copying linux-scsi...

-Brian

Constantin Teodorescu wrote:
> Hello, I hope I can get a little help from you regarding this kind of 
> crash !
> 
> Hardware:
>  - server, TYAN Tempest i5000VS S5372 BIOS v1.0.4
>  - 8 SATA drives Seagate 136 Gb attached on a AIC-9410 controller
>  - one IDE (boot disk and system)
>  - 8 Gb RAM
> 
> Software:
>  - OpenSUSE 10.2 x86_64 (tried also with SLES 10 but didn't succed in 
> compiling adp94xx driver from Adaptec)
> 
> Kernels: i tried with any  of them : linux-2.6.20.1 ,  linux-2.6.20.4 ,  
> linux-2.6.20.7 , linux-2.6.21.rc7
> The last one has the 1.0.3 version of aic94xx driver but the results are 
> the same :-(
> 
> Description:
> - the server is running a very heavy loaded PostgreSQL database with 
> tables spread on those SAS drives, a lot of writes and reads
> - at least 4, 5 times a day I got some warnings in /var/log/messages 
> (sas: Enter sas_scsi_recover_host , trying to find task XXX ---> 
> aic94xx: came back from clear nexus) but the system is still working
> - more rarely (once per day) I got the following bug in 
> /var/log/messages and the system is crashed, SAS drivers are not working 
> anymore, shutdown command is waiting forever, need to hardware reset the 
> system
> 
> 
> Apr 24 07:22:20 bnd kernel: sas: command 0x8101c9f5e2c0, task 
> 0x81005bfcb080, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x810047f9dd00, task 
> 0x81007df80cc0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x810164d31180, task 
> 0x8101247ad500, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81021b8af380, task 
> 0x81012e550ac0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x8101698c3940, task 
> 0x8101a3b69b80, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81011e865680, task 
> 0x8101a3b69380, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81000ce37340, task 
> 0x8101a3b69580, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x810164d31a40, task 
> 0x810058a93dc0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x8100bc25b940, task 
> 0x81005bfcbc80, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81000ce37880, task 
> 0x81015856bd00, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81022fa2f940, task 
> 0x8101d2cf87c0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x8100bc25b080, task 
> 0x81005bfcb880, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81000ce37dc0, task 
> 0x8101d186a940, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81009c620640, task 
> 0x81010d46a940, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x8100531ae1c0, task 
> 0x81012e9bf4c0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x8100531ae380, task 
> 0x8101d186a740, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81011e8654c0, task 
> 0x8101247ad100, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81009c620480, task 
> 0x81012e5502c0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81000ce37180, task 
> 0x8101d2cf89c0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81017d5268c0, task 
> 0x8101d186a540, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x8101c9f5e800, task 
> 0x81015856b900, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81014f8db600, task 
> 0x81007df808c0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81011e865bc0, task 
> 0x81012e550cc0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81009c620100, task 
> 0x8101a3b69980, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: Enter sas_scsi_recover_host
> Apr 24 07:22:20 bnd kernel: sas: trying to find task 0x81005bfcb080
> Apr 24 07:22:20 bnd kernel: sas: sas_scsi_find_task: aborting task 
> 0x81005bfcb080
> Apr 24 07:22:25 bnd kernel: aic94xx: tmf timed out
> Apr 24 07:22:25 bnd kernel: aic94xx: tmf came back
> Apr 24 07:22:25 bnd kernel: aic94xx: task not done, clearing nexus
> Apr 24 07:22:25 bnd kernel: aic94xx: asd_clear_nexus_index: PRE
> Apr 24 07:22:25 bnd kernel: aic94xx: asd_clear_nexus_index: POST
> Apr 24 07:22:25 bnd kernel: aic94xx: asd_clear_nexus_index: clear nexus 
> posted, waiting...
> Apr 24 07:22:30 bnd kernel: aic94xx: asd_clear_nexus_timedout: here
> Apr 24 07:22:35 bnd kernel: aic94xx: came back from clear nexus
> Apr 24 07:22:35 bnd kernel: aic94xx: task not done, clearing nexus
> Apr 24 07:22:35 bnd kernel: aic94xx: asd_clear_nexus_index: PRE
> Apr 24 07:22:35 bnd kernel: aic94

RE: [PATCH] aacraid: fails to initialize after a kexec operation

2007-04-24 Thread Salyzyn, Mark
The system BIOS sets up the card's PCI configuration and there is code
in the kernel that is capable of picking up some of the BIOS'
information from the BIOS Data Space (not sure if it is actively
collected in your configuration, you need a kernel flag to pick this
up). On kexec this BIOS Data Space information is missing (?) and if
there was any reconfiguration of the PCI space going on (I think only
the Linux BIOS project does this), kexec will inherit it. This issue
strikes me as a corrupted PCI configuration inherited in the kexec case,
such corrupted PCI configurations could be a motherboard specific issue
and can be related to the BIOS' initial setup for the initial kernel. At
least that is my thought process in questioning the motherboard BIOS or
hardware.

Another possibility is that after you have patched over the interrupt
routing issues (a PCI configuration problem), the card has a foreign
array, and the reset and reconfiguration is taking arrays offline. Add
'aacraid.commit=1' to force the foreign arrays to be accepted by the
card.

Could you please check if this issue is specific to your motherboard
model? Could you please check if there is an updated motherboard BIOS
available for it? Could you please check if this issue is specific to
the GB product release cycle? Given the information you have collected,
I would still try the safe flags since there is an interrupt routing
issue.

Another possibility is the reset did not hit your card, the card is not
working correctly or the reset is not working correctly. This feature
was added to the Firmware at the end of 2004, so B11835 certainly would
have it, but that Firmware appears to be an interim test release of the
GB product, and the latest Firmware release to IBM should be B11847 (I
could be mistaken).

Sincerely -- Mark Salyzyn

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Vivek Goyal
> Sent: Tuesday, April 24, 2007 4:45 AM
> To: Salyzyn, Mark
> Cc: James Bottomley; Kexec Mailing List; Judith Lebzelter; 
> linux-scsi@vger.kernel.org
> Subject: Re: [PATCH] aacraid: fails to initialize after a 
> kexec operation
> 
> 
> On Mon, Apr 23, 2007 at 01:20:32PM -0400, Salyzyn, Mark wrote:
> > That is a failure to route the interrupts and is possibly 
> an issue with
> > the kernel and the hardware, and not the driver directly 
> (since there is
> > an expectation that request_irq will connect the interrupt to the
> > interrupt service routine). Judith reported success in the past with
> > this patch on her hardware, perhaps the motherboard on your 
> system has
> > some odd BIOS setup of the hardware that is giving acpi or 
> the apic some
> > headaches? Can you check out success or failure on other 
> motherboards?
> > Please try the suggestions from the driver (safe flags)?
> > 
> > Sincerely -- Mark Salyzyn
> > 
> 
> Hi Mark,
> 
> We don't even go through BIOS in kexec and kdump. So BIOS 
> should not be an
> issue.
> 
> Looks like you sent some message to controller and then waiting for an
> interrupt from the controller as an indication of completion 
> of command. In
> this case you never seem to get an interrupt hence timeout.
> 
> To bypass this problem, I am now booting my second kernel 
> with "irqpoll"
> command line option. This will make sure that aacraid 
> interrupt handler
> gets invoked even if there is an interrupt routing issue.
> 
> This option does help in progressing the things but it ends 
> up corrupting
> something or other on the disk. In three attempts I get three types of
> errors.
> 
> In first attempt I get continuous stream of following messages once
> root file system has been mounted.
> 
> =
> sda1: rw=0, want=9261304112, limit=41945652
> attempt to access beyond end of device
> sda1: rw=0, want=9261304112, limit=41945652
> attempt to access beyond end of device
> sda1: rw=0, want=9261304112, limit=41945652
> attempt to access beyond end of device
> sda1: rw=0, want=9261304112, limit=41945652
> attempt to access beyond end of device
> sda1: rw=0, want=9261304112, limit=41945652
> attempt to access beyond end of device
> 
> 
> In second attempt, it mounted the file system but it found some issue
> with "resize" inode and asked me to run fsck manually. Which in turn 
> deleted whole lot of inodes.
> 
> In third attemt it panics later when it finds ext3 to be corrupted.
> 
> =
> Creating block device nodes.
> Trying to resume from LABEL=SWAP-sda3
> No suspend signature on swap, not resuming.
> Creating root device.
> Mounting root filesystem.
> EXT3-fs: Magic mismatch, very weird !
> mount: error mouKernel panic - not syncing: Attempted to kill init!
> nting /dev/root
> === 
> 
> Following are relevant aacraid initiliazation messages on 
> serial console.
> 
> =

Re: [RFC PATCH]: Rewritten ESP driver, porters needed!

2007-04-24 Thread Matthew Wilcox
On Tue, Apr 24, 2007 at 01:22:35PM +0100, Christoph Hellwig wrote:
> > +static void esp_build_sync_msg(struct esp *esp, u8 period, u8 offset)
> >  {
> > +   esp->msg_out[0] = EXTENDED_MESSAGE;
> > +   esp->msg_out[1] = 3;
> > +   esp->msg_out[2] = EXTENDED_SDTR;
> > +   esp->msg_out[3] = period;
> > +   esp->msg_out[4] = offset;
> > +   esp->msg_out_len = 5;
> > +}
> >  
> > +static void esp_build_wide_msg(struct esp *esp, int wide)
> > +{
> > +   esp->msg_out[0] = EXTENDED_MESSAGE;
> > +   esp->msg_out[1] = 2;
> > +   esp->msg_out[2] = EXTENDED_WDTR;
> > +   esp->msg_out[3] = (wide ? 1 : 0);
> > +   esp->msg_out_len = 4;
> >  }
> 
> These might actually be worth putting into the spi transport
> class, taking an u8 * as first argument.  After all all
> SPI drivers without smart firmware will need them.

Already done -- spi_populate_sync_msg, spi_populate_width_msg and
spi_populate_ppr_msg.

> (the lundata management looks inspired by sym53c8xx, but it's probably
>  not the best driver to be inspired by :))

*cough*.

> > +static int esp_queue(struct scsi_cmnd *cmd, void (*done)(struct scsi_cmnd *))
> 
> It would be nice to call this esp_queuecommand to match the name of
> the method.
> 
> > +{
> > +   struct scsi_device *dev = cmd->device;
> > +   struct esp *esp = host_to_esp(dev->host);
> > +   struct esp_cmd_priv *spriv;
> > +   struct esp_cmd_entry *ent;
> >  
> > +   cmd->scsi_done = done;
> >  
> > +   if (dev->id == esp->scsi_id) {
> > +   cmd->result = DID_NO_CONNECT << 16;
> > +   cmd->scsi_done(cmd);
> > +   return 0;
> > +   }
> 
> This can't happen, no need to check for it.  (And yes, I know some
> drivers like sym53x8xx still have the checks despite me submitting
> patches to get rid of it)

I think the last time you sent me a patch to get rid of that it was part
of a larger patchset and the whole thing failed to work.



Re: [RFC PATCH]: Rewritten ESP driver, porters needed!

2007-04-24 Thread James Bottomley
On Tue, 2007-04-24 at 13:22 +0100, Christoph Hellwig wrote:
> > +static void esp_build_sync_msg(struct esp *esp, u8 period, u8 offset)
> >  {
> > + esp->msg_out[0] = EXTENDED_MESSAGE;
> > + esp->msg_out[1] = 3;
> > + esp->msg_out[2] = EXTENDED_SDTR;
> > + esp->msg_out[3] = period;
> > + esp->msg_out[4] = offset;
> > + esp->msg_out_len = 5;
> > +}
> >  
> > +static void esp_build_wide_msg(struct esp *esp, int wide)
> > +{
> > + esp->msg_out[0] = EXTENDED_MESSAGE;
> > + esp->msg_out[1] = 2;
> > + esp->msg_out[2] = EXTENDED_WDTR;
> > + esp->msg_out[3] = (wide ? 1 : 0);
> > + esp->msg_out_len = 4;
> >  }
> 
> These might actually be worth putting into the spi transport
> class, taking an u8 * as first argument.  After all all
> SPI drivers without smart firmware will need them.

This was going to be my comment ... they're already there:

int spi_populate_width_msg(unsigned char *msg, int width);
int spi_populate_sync_msg(unsigned char *msg, int period, int offset);
int spi_populate_ppr_msg(unsigned char *msg, int period, int offset, int width,
int options);
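
For readers following along, here is a minimal userspace sketch of the
byte layout these helpers produce for the SDTR and WDTR extended
messages, modeled on the esp_build_*_msg() functions quoted above. The
sketch_* names are hypothetical and are not the kernel functions; the
message code values are assumptions taken from the usual SCSI
definitions, and returning the message length is likewise an assumption
about how a caller would fill in msg_out_len.

```c
/* Message codes, assumed to match the usual SCSI definitions. */
#define EXTENDED_MESSAGE 0x01
#define EXTENDED_SDTR    0x01
#define EXTENDED_WDTR    0x03

/*
 * Hypothetical stand-in for a "populate sync message" helper:
 * fills msg with a 5-byte SDTR message and returns its length.
 */
int sketch_populate_sync_msg(unsigned char *msg, int period, int offset)
{
	msg[0] = EXTENDED_MESSAGE;
	msg[1] = 3;			/* number of bytes that follow */
	msg[2] = EXTENDED_SDTR;
	msg[3] = (unsigned char)period;
	msg[4] = (unsigned char)offset;
	return 5;
}

/*
 * Hypothetical stand-in for a "populate width message" helper:
 * fills msg with a 4-byte WDTR message and returns its length.
 */
int sketch_populate_width_msg(unsigned char *msg, int width)
{
	msg[0] = EXTENDED_MESSAGE;
	msg[1] = 2;			/* number of bytes that follow */
	msg[2] = EXTENDED_WDTR;
	msg[3] = (unsigned char)width;
	return 4;
}
```

Taking the buffer as the first argument, rather than a driver-private
struct, is what lets one helper serve every SPI driver.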

James




Re: Question regarding firmware

2007-04-24 Thread Andrew Vasquez
On Tue, 24 Apr 2007, renuka apte wrote:

> I am using a QLA2460 card. I was experiencing a problem with port
> entries not appearing under /sys/class/fc_hosts/. Through the dmesgs
> I realised it was happening as the firmware image could not be
> found, and I could get around the problem by disabling the
> ql2xfwloadbin module parameter. I would like to know whether having
> the firmware image is mandatory, or whether I should continue using
> this workaround.

Take a look at the driver's Kconfig file:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=drivers/scsi/qla2xxx/Kconfig;h=8c865b9e02b579f2d5bfc64a706fec3ea4063433;hb=HEAD

for details on where to retrieve and place the firmware images on your
local systems.

Regards,
Andrew Vasquez


Re: [patch 1/7] libata: check for AN support

2007-04-24 Thread Kristen Carlson Accardi
On Tue, 24 Apr 2007 12:23:04 +0200
Olivier Galibert <[EMAIL PROTECTED]> wrote:

> Sorry for replying to Alan's reply, I missed the original mail.
> 
> > > +#define ata_id_has_AN(id)\
> > > + ((id[76] && (~id[76])) & ((id)[78] & (1 << 5)))
> 
> (a && ~a) & (b & 32)
> 
> I don't think that does what you think it does, because at that point
> it's a funny way to write 0 ((0 or 1) binary-and (0 or 32)).
> 
> I'm not even sure what it is you want.  If for the first part you
> wanted (id[76] != 0x00 && id[76] != 0xff), please write just that,
> thanks :-)
> 
>   OG.
> 

From the Serial ATA spec, we have:

13.2.1.18 Word 78: Serial ATA features supported
If Word 76 is not 0000h or FFFFh, Word 78 reports the optional features 
supported by the device.  Support for this word is optional and if not 
supported the word shall be zero indicating the device has no support for new 
Serial ATA capabilities.

so, basically yes, I'm really testing to make sure that word 76 isn't 0 or all
ones, then combining that with the bit in word 78 using & to determine AN
support - if you think this is really obfuscated, I've got no problem changing 
it - there are obviously many ways to mess around with bits.


Re: [patch 1/7] libata: check for AN support

2007-04-24 Thread Kristen Carlson Accardi
On Tue, 24 Apr 2007 17:03:29 +0900
Tejun Heo <[EMAIL PROTECTED]> wrote:

> Hello,
> 
> Kristen Carlson Accardi wrote:
> >  static unsigned int ata_print_id = 1;
> > @@ -1744,6 +1745,23 @@ int ata_dev_configure(struct ata_device 
> > }
> > dev->cdb_len = (unsigned int) rc;
> >  
> > +   /*
> > +* check to see if this ATAPI device supports
> > +* Asynchronous Notification
> > +*/
> > +   if ((ap->flags & ATA_FLAG_AN) && ata_id_has_AN(id))
> > +   {
> > +   /* issue SET feature command to turn this on */
> > +   rc = ata_dev_set_AN(dev);
> 
> Please don't store err_mask into int rc.  Please store it to a separate
> err_mask variable and report it when printing error message.
> 
> > +   if (rc) {
> > +   ata_dev_printk(dev, KERN_ERR,
> > +   "unable to set AN\n");
> > +   rc = -EINVAL;
> 
> Wouldn't -EIO be more appropriate?

I think Alan is right - and being unable to turn on AN should not be fatal.
I'll just change all this code to just print the err and keep going.

> 
> > +   goto err_out_nosup;
> > +   }
> > +   dev->flags |= ATA_DFLAG_AN;
> > +   }
> > +
> 
> Not NACKing.  Just notes for future improvements.  We need to be more
> careful here.  ATA/ATAPI world is filled with braindamaged devices and I
> bet there are devices which advertises it can do AN but chokes when AN
> is enabled.
> 
> This should be handled similarly to ACPI failure.  Currently ACPI does
> the following.
> 
> 1. try once, if fail, record that ACPI failed.  return error to trigger
> retry.
> 2. try again, if fail again, ignore error if possible (!FROZEN) and turn
> off ACPI.
> 
> This fallback mechanism for optional features can probably be
> generalized and used for both ACPI and AN.

Ok - meanwhile I think it's appropriate here to just do try-once-fail-give-up.
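
For reference, the two-step fallback described in the quoted text could
be sketched roughly as below. This is only an illustration of the
policy (fail once: record and retry; fail twice: disable unless the
port is frozen). The struct, enum and function names are all
hypothetical and do not correspond to existing libata code.

```c
#include <stdbool.h>

/* Hypothetical per-device state for one optional feature. */
struct opt_feature {
	bool failed_once;	/* first attempt failed, retry requested */
	bool disabled;		/* gave up, feature stays off */
};

enum opt_result { OPT_OK, OPT_RETRY, OPT_DISABLED, OPT_FATAL };

/*
 * enable() returns 0 on success and nonzero on failure.
 * frozen means the failure cannot be ignored.
 */
enum opt_result opt_feature_try(struct opt_feature *f,
				int (*enable)(void *ctx), void *ctx,
				bool frozen)
{
	if (f->disabled)
		return OPT_DISABLED;

	if (enable(ctx) == 0)
		return OPT_OK;

	if (!f->failed_once) {
		/* step 1: record the failure and ask the caller to retry */
		f->failed_once = true;
		return OPT_RETRY;
	}

	/* step 2: second failure, ignore it if possible */
	if (frozen)
		return OPT_FATAL;

	f->disabled = true;
	return OPT_DISABLED;
}
```

The try-once-fail-give-up behaviour is the same policy with the retry
step removed.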


[PATCH resend] bsg: return SAM device status code

2007-04-24 Thread Pete Wyckoff
Use the status codes from the spec, not the shifted-by-one codes
that are marked deprecated in scsi.h.  This makes bsg v4 status
report the same value as sg v3 status too.

Signed-off-by: Pete Wyckoff <[EMAIL PROTECTED]>
Acked-by: Douglas Gilbert <[EMAIL PROTECTED]>
---
 block/bsg.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/block/bsg.c b/block/bsg.c
index f0753c0..92be6fa 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -440,7 +440,7 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr,
/*
 * fill in all the output members
 */
-   hdr->device_status = status_byte(rq->errors);
+   hdr->device_status = rq->errors & 0xff;
hdr->transport_status = host_byte(rq->errors);
hdr->driver_status = driver_byte(rq->errors);
hdr->info = 0;
-- 
1.5.0.6
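
To illustrate the difference the one-line change makes, here is a small
standalone sketch. It assumes the deprecated status_byte() macro shifts
the low byte right by one bit, as the commit message describes; the
0x7f mask and the helper names are assumptions for illustration, not
copied from the tree.

```c
/* Assumed shape of the deprecated macro: status shifted right by one. */
#define status_byte(result) (((result) >> 1) & 0x7f)

/* SAM status codes as they appear on the wire. */
#define SAM_GOOD            0x00
#define SAM_CHECK_CONDITION 0x02
#define SAM_BUSY            0x08

/* Old behaviour: reports the shifted-by-one code. */
unsigned int old_device_status(int errors)
{
	return status_byte(errors);
}

/* New behaviour: reports the SAM status code from the low byte. */
unsigned int sam_device_status(int errors)
{
	return errors & 0xff;
}
```

With a CHECK CONDITION result (0x02) the old code reports 0x01, while
the new code reports 0x02, which is what an application comparing
against the SAM values expects.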



[PATCH] bsg: port to bidi

2007-04-24 Thread Pete Wyckoff
Changes required to make bsg patches work on top of bidi patches.  Adds
capability to bsg to handle bidirectional commands and extended CDBs.

Signed-off-by: Pete Wyckoff <[EMAIL PROTECTED]>
---
 block/bsg.c|  106 ++--
 block/ll_rw_blk.c  |   30 +-
 include/linux/blkdev.h |1 +
 3 files changed, 87 insertions(+), 50 deletions(-)

diff --git a/block/bsg.c b/block/bsg.c
index 92be6fa..9d09505 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -95,6 +95,7 @@ struct bsg_command {
struct list_head list;
struct request *rq;
struct bio *bio;
+   struct bio *bidi_read_bio;
int err;
struct sg_io_v4 hdr;
struct sg_io_v4 __user *uhdr;
@@ -225,18 +226,31 @@ static struct bsg_command *bsg_get_command(struct bsg_device *bd)
 static int blk_fill_sgv4_hdr_rq(request_queue_t *q, struct request *rq,
struct sg_io_v4 *hdr, int has_write_perm)
 {
+   int len = hdr->request_len;
+   int cmd_len = min(len, BLK_MAX_CDB);
+
memset(rq->cmd, 0, BLK_MAX_CDB); /* ATAPI hates garbage after CDB */
 
if (copy_from_user(rq->cmd, (void *)(unsigned long)hdr->request,
-  hdr->request_len))
+  cmd_len))
return -EFAULT;
+   if (len > BLK_MAX_CDB) {
+   rq->varlen_cdb_len = len;
+   rq->varlen_cdb = kmalloc(len, GFP_KERNEL);
+   if (rq->varlen_cdb == NULL)
+   return -ENOMEM;
+   if (copy_from_user(rq->varlen_cdb,
+  (void *)(unsigned long)hdr->request, len))
+   return -EFAULT;
+   }
+
if (blk_verify_command(rq->cmd, has_write_perm))
return -EPERM;
 
/*
 * fill in request structure
 */
-   rq->cmd_len = hdr->request_len;
+   rq->cmd_len = cmd_len;
rq->cmd_type = REQ_TYPE_BLOCK_PC;
 
rq->timeout = (hdr->timeout * HZ) / 1000;
@@ -252,12 +266,11 @@ static int blk_fill_sgv4_hdr_rq(request_queue_t *q, struct request *rq,
  * Check if sg_io_v4 from user is allowed and valid
  */
 static int
-bsg_validate_sgv4_hdr(request_queue_t *q, struct sg_io_v4 *hdr, int *rw)
+bsg_validate_sgv4_hdr(request_queue_t *q, struct sg_io_v4 *hdr,
+  enum dma_data_direction *dir)
 {
if (hdr->guard != 'Q')
return -EINVAL;
-   if (hdr->request_len > BLK_MAX_CDB)
-   return -EINVAL;
if (hdr->dout_xfer_len > (q->max_sectors << 9) ||
hdr->din_xfer_len > (q->max_sectors << 9))
return -EIO;
@@ -266,17 +279,15 @@ bsg_validate_sgv4_hdr(request_queue_t *q, struct sg_io_v4 *hdr, int *rw)
if (hdr->protocol || hdr->subprotocol)
return -EINVAL;
 
-   /*
-* looks sane, if no data then it should be fine from our POV
-*/
-   if (!hdr->dout_xfer_len && !hdr->din_xfer_len)
-   return 0;
-
-   /* not supported currently */
-   if (hdr->dout_xfer_len && hdr->din_xfer_len)
-   return -EINVAL;
-
-   *rw = hdr->dout_xfer_len ? WRITE : READ;
+   if (hdr->dout_xfer_len) {
+   if (hdr->din_xfer_len)
+   *dir = DMA_BIDIRECTIONAL;
+   else
+   *dir = DMA_TO_DEVICE;
+   } else if (hdr->din_xfer_len)
+   *dir = DMA_FROM_DEVICE;
+   else
+   *dir = DMA_NONE;
 
return 0;
 }
@@ -289,7 +300,8 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr)
 {
request_queue_t *q = bd->queue;
struct request *rq;
-   int ret, rw = 0; /* shut up gcc */
+   enum dma_data_direction dir;
+   int ret;
unsigned int dxfer_len;
void *dxferp = NULL;
 
@@ -297,39 +309,45 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr)
hdr->dout_xfer_len, (unsigned long long) hdr->din_xferp,
hdr->din_xfer_len);
 
-   ret = bsg_validate_sgv4_hdr(q, hdr, &rw);
+   ret = bsg_validate_sgv4_hdr(q, hdr, &dir);
if (ret)
return ERR_PTR(ret);
 
/*
 * map scatter-gather elements seperately and string them to request
 */
-   rq = blk_get_request(q, rw, GFP_KERNEL);
+   rq = blk_get_request(q, dir, GFP_KERNEL);
ret = blk_fill_sgv4_hdr_rq(q, rq, hdr, test_bit(BSG_F_WRITE_PERM,
   &bd->flags));
-   if (ret) {
-   blk_put_request(rq);
-   return ERR_PTR(ret);
-   }
+   if (ret)
+   goto errout;
 
if (hdr->dout_xfer_len) {
dxfer_len = hdr->dout_xfer_len;
dxferp = (void*)(unsigned long)hdr->dout_xferp;
-   } else if (hdr->din_xfer_len) {
+   ret = blk_rq_map_user_bidi(q, rq, dxferp, dxfer_len, dir);
+   if (ret)
+  

[PATCH 1/2] bsg: compile with bidi

2007-04-24 Thread Pete Wyckoff
Fixes to common functions added with bsg so that they compile in
a kernel that has the bidi patches.

Signed-off-by: Pete Wyckoff <[EMAIL PROTECTED]>
---
 block/bsg.c|6 +++---
 block/scsi_ioctl.c |6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/bsg.c b/block/bsg.c
index a333c93..f0753c0 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -368,7 +368,7 @@ static void bsg_add_command(struct bsg_device *bd, request_queue_t *q,
 * add bc command to busy queue and submit rq for io
 */
bc->rq = rq;
-   bc->bio = rq->bio;
+   bc->bio = rq_uni(rq)->bio;
bc->hdr.duration = jiffies;
spin_lock_irq(&bd->lock);
list_add_tail(&bc->list, &bd->busy_list);
@@ -446,7 +446,7 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr,
hdr->info = 0;
if (hdr->device_status || hdr->transport_status || hdr->driver_status)
hdr->info |= SG_INFO_CHECK;
-   hdr->din_resid = rq->data_len;
+   hdr->din_resid = rq_uni(rq)->data_len;
hdr->response_len = 0;
 
if (rq->sense_len && hdr->response) {
@@ -915,7 +915,7 @@ bsg_ioctl(struct inode *inode, struct file *file, unsigned int cmd,
if (IS_ERR(rq))
return PTR_ERR(rq);
 
-   bio = rq->bio;
+   bio = rq_uni(rq)->bio;
blk_execute_rq(bd->queue, NULL, rq, 0);
blk_complete_sgv4_hdr_rq(rq, &hdr, bio);
 
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index cb29ea1..c1cfae9 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -244,7 +244,7 @@ EXPORT_SYMBOL_GPL(blk_fill_sghdr_rq);
  */
 int blk_unmap_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr)
 {
-   blk_rq_unmap_user(rq->bio);
+   blk_rq_unmap_user(rq_uni(rq)->bio);
blk_put_request(rq);
return 0;
 }
@@ -266,7 +266,7 @@ int blk_complete_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr,
hdr->info = 0;
if (hdr->masked_status || hdr->host_status || hdr->driver_status)
hdr->info |= SG_INFO_CHECK;
-   hdr->resid = rq->data_len;
+   hdr->resid = rq_uni(rq)->data_len;
hdr->sb_len_wr = 0;
 
if (rq->sense_len && hdr->sbp) {
@@ -278,7 +278,7 @@ int blk_complete_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr,
ret = -EFAULT;
}
 
-   rq->bio = bio;
+   rq_uni(rq)->bio = bio;
r = blk_unmap_sghdr_rq(rq, hdr);
if (ret)
r = ret;
-- 
1.5.0.6



Re: [PATCH] bsg: port to bidi

2007-04-24 Thread Pete Wyckoff
[EMAIL PROTECTED] wrote on Tue, 24 Apr 2007 13:27 -0400:
> Changes required to make bsg patches work on top of bidi patches.  Adds
> capability to bsg to handle bidirectional commands and extended CDBs.

Oops.  This is 2/2.  Apply the "compile with bidi" one first.

-- Pete


Re: [patch 1/7] libata: check for AN support

2007-04-24 Thread Olivier Galibert
On Tue, Apr 24, 2007 at 08:49:04AM -0700, Kristen Carlson Accardi wrote:
> On Tue, 24 Apr 2007 12:23:04 +0200
> Olivier Galibert <[EMAIL PROTECTED]> wrote:
> 
> > Sorry for replying to Alan's reply, I missed the original mail.
> > 
> > > > +#define ata_id_has_AN(id)  \
> > > > +   ((id[76] && (~id[76])) & ((id)[78] & (1 << 5)))
> > 
> > (a && ~a) & (b & 32)
> > 
> > I don't think that does what you think it does, because at that point
> > it's a funny way to write 0 ((0 or 1) binary-and (0 or 32)).
> > 
> > I'm not even sure what it is you want.  If for the first part you
> > wanted (id[76] != 0x00 && id[76] != 0xff), please write just that,
> > thanks :-)
> > 
> >   OG.
> > 
> 
> From the Serial ATA spec, we have:
> 
> 13.2.1.18 Word 78: Serial ATA features supported
> If Word 76 is not 0000h or FFFFh, Word 78 reports the optional features 
> supported by the device.  Support for this word is optional and if not 
> supported the word shall be zero indicating the device has no support for new 
> Serial ATA capabilities.
> 
> so, basically yes, I'm really testing to make sure that word 76 isn't 0 or all
> ones, then combining that with the bit in word 78 using & to determine AN
> support - if you think this is really obfuscated, I've got no problem 
> changing it - there are obviously many ways to mess around with bits.

& is not &&, so right now it's really incorrect.  1 & 32 is 0.

((id)[76] != 0x0000 && (id)[76] != 0xffff && ((id)[78] & (1 << 5)))

The implicit typing of id looks dangerous to me, but you're not the
one who has started it.
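
A small standalone sketch of the point, for anyone who wants to try it
outside the kernel. The two helper functions are hypothetical wrappers
around the macro as posted and around a version using explicit
comparisons; the IDENTIFY data is modeled here as a plain array of
16-bit words, which is an assumption for the sake of the example.

```c
#include <stdint.h>

/*
 * The macro as posted: (a && ~a) is 0 or 1, (b & 32) is 0 or 32,
 * and the bitwise AND of those two values is always 0.
 */
#define ata_id_has_AN_posted(id) \
	(((id)[76] && (~(id)[76])) & ((id)[78] & (1 << 5)))

/* The same test written with explicit comparisons and logical AND. */
#define ata_id_has_AN_explicit(id) \
	((id)[76] != 0x0000 && (id)[76] != 0xffff && ((id)[78] & (1 << 5)))

/* Hypothetical wrappers so the macros can be called like functions. */
int an_posted(const uint16_t *id)
{
	return ata_id_has_AN_posted(id);
}

int an_explicit(const uint16_t *id)
{
	return ata_id_has_AN_explicit(id);
}
```

Note also that with 16-bit words, ~(id)[76] is computed after integer
promotion, so it is nonzero even when the word is all ones; the
explicit comparison avoids that trap as well.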

  OG.


Re: [patch 1/7] libata: check for AN support

2007-04-24 Thread Kristen Carlson Accardi
On Tue, 24 Apr 2007 20:05:52 +0200
Olivier Galibert <[EMAIL PROTECTED]> wrote:

> On Tue, Apr 24, 2007 at 08:49:04AM -0700, Kristen Carlson Accardi wrote:
> > On Tue, 24 Apr 2007 12:23:04 +0200
> > Olivier Galibert <[EMAIL PROTECTED]> wrote:
> > 
> > > Sorry for replying to Alan's reply, I missed the original mail.
> > > 
> > > > > +#define ata_id_has_AN(id)\
> > > > > + ((id[76] && (~id[76])) & ((id)[78] & (1 << 5)))
> > > 
> > > (a && ~a) & (b & 32)
> > > 
> > > I don't think that does what you think it does, because at that point
> > > it's a funny way to write 0 ((0 or 1) binary-and (0 or 32)).
> > > 
> > > I'm not even sure what it is you want.  If for the first part you
> > > wanted (id[76] != 0x00 && id[76] != 0xff), please write just that,
> > > thanks :-)
> > > 
> > >   OG.
> > > 
> > 
> > From the Serial ATA spec, we have:
> > 
> > 13.2.1.18 Word 78: Serial ATA features supported
> > If Word 76 is not 0000h or FFFFh, Word 78 reports the optional features 
> > supported by the device.  Support for this word is optional and if not 
> > supported the word shall be zero indicating the device has no support for 
> > new Serial ATA capabilities.
> > 
> > so, basically yes, I'm really testing to make sure that word 76 isn't 0 or 
> > all ones, then combining that with the bit in word 78 using & to determine
> > AN support - if you think this is really obfuscated, I've got no problem 
> > changing it - there are obviously many ways to mess around with bits.
> 
> & is not &&, so right now it's really incorrect.  1 & 32 is 0.

ah - ok, gotcha, thanks.

> 
> ((id)[76] != 0x0000 && (id)[76] != 0xffff && ((id)[78] & (1 << 5)))
> 
> The implicit typing of id looks dangerous to me, but you're not the
> one who has started it.
> 
>   OG.
> 


Re: Kernel crash with AIC94xx

2007-04-24 Thread James Bottomley
On Tue, 2007-04-24 at 11:52 +0300, Constantin Teodorescu wrote:
> Hello, I hope I can get a little help from you regarding this kind of 
> crash !
> 
> Hardware:
> - server, TYAN Tempest i5000VS S5372 BIOS v1.0.4
> - 8 SATA drives Seagate 136 Gb attached on a AIC-9410 controller
> - one IDE (boot disk and system)

This configuration doesn't work on the vanilla linux kernel ... you need
the scsi-aic94xxx-sas-2.6 tree as well for this; is that what you're
running with?

> - 8 Gb RAM
> 
> Software:
> - OpenSUSE 10.2 x86_64 (I also tried SLES 10 but didn't succeed in 
> compiling the adp94xx driver from Adaptec)
> 
> Kernels: I tried each of these: linux-2.6.20.1, linux-2.6.20.4,
> linux-2.6.20.7, linux-2.6.21.rc7
> The last one has the 1.0.3 version of the aic94xx driver but the results are 
> the same :-(
> 
> Description:
> - the server is running a very heavily loaded PostgreSQL database with 
> tables spread on those SAS drives, a lot of writes and reads

Are these SAS or SATA drives?

> - at least 4, 5 times a day I got some warnings in /var/log/messages 
> (sas: Enter sas_scsi_recover_host , trying to find task XXX ---> 
> aic94xx: came back from clear nexus) but the system is still working
> - more rarely (once per day) I get the following bug in 
> /var/log/messages and the system crashes: the SAS drivers stop working, 
> the shutdown command waits forever, and I need to hardware reset the 
> system
> 
> 
> Apr 24 07:22:20 bnd kernel: sas: command 0x8101c9f5e2c0, task 
> 0x81005bfcb080, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x810047f9dd00, task 
> 0x81007df80cc0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x810164d31180, task 
> 0x8101247ad500, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81021b8af380, task 
> 0x81012e550ac0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x8101698c3940, task 
> 0x8101a3b69b80, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81011e865680, task 
> 0x8101a3b69380, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81000ce37340, task 
> 0x8101a3b69580, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x810164d31a40, task 
> 0x810058a93dc0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x8100bc25b940, task 
> 0x81005bfcbc80, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81000ce37880, task 
> 0x81015856bd00, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81022fa2f940, task 
> 0x8101d2cf87c0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x8100bc25b080, task 
> 0x81005bfcb880, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81000ce37dc0, task 
> 0x8101d186a940, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81009c620640, task 
> 0x81010d46a940, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x8100531ae1c0, task 
> 0x81012e9bf4c0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x8100531ae380, task 
> 0x8101d186a740, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81011e8654c0, task 
> 0x8101247ad100, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81009c620480, task 
> 0x81012e5502c0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81000ce37180, task 
> 0x8101d2cf89c0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81017d5268c0, task 
> 0x8101d186a540, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x8101c9f5e800, task 
> 0x81015856b900, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81014f8db600, task 
> 0x81007df808c0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81011e865bc0, task 
> 0x81012e550cc0, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: command 0x81009c620100, task 
> 0x8101a3b69980, timed out: EH_NOT_HANDLED
> Apr 24 07:22:20 bnd kernel: sas: Enter sas_scsi_recover_host
> Apr 24 07:22:20 bnd kernel: sas: trying to find task 0x81005bfcb080
> Apr 24 07:22:20 bnd kernel: sas: sas_scsi_find_task: aborting task 
> 0x81005bfcb080
> Apr 24 07:22:25 bnd kernel: aic94xx: tmf timed out
> Apr 24 07:22:25 bnd kernel: aic94xx: tmf came back
> Apr 24 07:22:25 bnd kernel: aic94xx: task not done, clearing nexus
> Apr 24 07:22:25 bnd kernel: aic94xx: asd_clear_nexus_index: PRE
> Apr 24 07:22:25 bnd kernel: aic94xx: asd_clear_nexus_index: POST
> Apr 24 07:22:25 bnd kernel: aic94xx: asd_clear_nexus_index: clear nexus 
> posted, waiting...
> Apr 24 07:22:30 bnd kernel: aic94xx: asd_clear_nexus_timedout: here
> Apr 24 07:22:35 bnd kernel: aic94xx: came bac

Re: [PATCH] megaraid: update version reported by MEGAIOC_QDRVRVER

2007-04-24 Thread Christoph Hellwig
On Thu, Apr 19, 2007 at 12:10:24PM -0500, David Milburn wrote:
> Update the driver version reported by MEGAIOC_QDRVRVER to
> match LSI_COMMON_MOD_VERSION.

Why does this matter?



Re: Kernel crash with AIC94xx (one step forward, hope it's lucky)

2007-04-24 Thread James Bottomley
Please don't cut linux-scsi from the cc list

On Tue, 2007-04-24 at 22:14 +0300, Constantin Teodorescu wrote:
> James Bottomley wrote:
> > This configuration doesn't work on the vanilla linux kernel ... you need
> > the scsi-aic94xxx-sas-2.6 tree as well for this; is that what you're
> > running with?
> >   
> Yes, I am experimenting now on a 2.6.21-RC7 kernel with 1.0.3 version of 
> aic94xx driver.
> 
> > Are these SAS or SATA drives?
> >   
> 8 SAS drives
> 
> I have already received some information from Luben Tuikov and Alexis 
> Bruemmer (he told me that there is a new firmware seq file at Adaptec) 
> and I am sending you the message:
> 
> Luben Tuikov wrote:
> > Constantin,
> >
> > adp94xx is not supported by anyone.
> >
> > The in-kernel aic94xx is supported by linux-scsi mailing list
> > and your OS vendor.
> >
> > Luben
> >   
> I was afraid of ... :-(
> 
> I already got the news from Andy Warner, who told me about that!
> 
> Alexis Bruemmer also sent me a message saying:
> > Yep we have seen this issue before.  However the fix involves both a
> > driver update and a sequencer f/w update found at:
> >
> > http://www.adaptec.com/en-US/downloads/linux_source/linux_source_code?productId=SAS-48300&dn=Adaptec+Serial+Attached+SCSI+48300
> >  
> >
> >
> > If you still get this crash with that version of the sequencer let me
> > know
> 
> Digging, I discovered that the firmware is different (md5sum differs) and 
> newer (released on 2 Mar 2007) than the firmware that I installed in 
> January 2007.
> 
> bnd:~/# md5sum /lib/firmware/aic94xx-seq.fw old-aic94xx-seq.fw
> fb393f52fde81eb53afa1e204a606c37  /lib/firmware/aic94xx-seq.fw
> 589f442b43ea0cc42fec275d7a612c2e  old-aic94xx-seq.fw
> 
> So I downloaded the new firmware and I will test it intensively.
> I will keep you informed about the results.
> 
> Thank you again for all your valuable help,
> Best regards from Romania,
> Teo
> 
> 
> 
> 



Re: [RFC PATCH]: Rewritten ESP driver, porters needed!

2007-04-24 Thread David Miller
From: Christoph Hellwig <[EMAIL PROTECTED]>
Date: Tue, 24 Apr 2007 13:22:35 +0100

> Overall the driver looks really nice, thanks a lot!

Thanks.

> would be nice to have dev_printk here, but sbus still seems to
> lack driver model integration.

There is only partial integration at the moment, but filling
that gap is certainly planned.

> The non-use-sg case is dead, you can put in BUG_ON()s here and in
> the unmap path.

Thanks I've done that.

> > +static void esp_build_sync_msg(struct esp *esp, u8 period, u8 offset)
> >  {
> > +   esp->msg_out[0] = EXTENDED_MESSAGE;
> > +   esp->msg_out[1] = 3;
> > +   esp->msg_out[2] = EXTENDED_SDTR;
> > +   esp->msg_out[3] = period;
> > +   esp->msg_out[4] = offset;
> > +   esp->msg_out_len = 5;
> > +}
> >  
> > +static void esp_build_wide_msg(struct esp *esp, int wide)
> > +{
> > +   esp->msg_out[0] = EXTENDED_MESSAGE;
> > +   esp->msg_out[1] = 2;
> > +   esp->msg_out[2] = EXTENDED_WDTR;
> > +   esp->msg_out[3] = (wide ? 1 : 0);
> > +   esp->msg_out_len = 4;
> >  }
> 
> These might actually be worth putting into the spi transport
> class, taking an u8 * as first argument.  After all all
> SPI drivers without smart firmware will need them.

As others noted the SPI layer does have this already.  I converted
esp.c to use them, thanks.

> > +/* If we get a non-tagged command, we let all the current
...
> 
> The comment doesn't match the code.  You don't do the REQUEST_SENSE
> special casing anymore since implementing autosense :)

Amusing, comment deleted :-)

> > +static int esp_alloc_lun_tag(struct esp_cmd_entry *ent,
> > +struct esp_lun_data *lp)
> > +{
> > +   if (!lp) {
> > +   /* When we don't have lun-data yet, we disallow
> > +* disconnects, so we do not have to see if this
> > +* untagged command matches a disconnected one and
> > +* thus return -EBUSY.
> >  */
> > +   return 0;
> > +   }
> 
> Given that you allocate the lun-data in esp_slave_alloc this
> can never happen.  Some more comments on the handling of
> per-lun data here:
> 
>   - normally you allocate per-lun data in slave_alloc and free it
> in slave_destroy.  This is guaranteed to be safe because
> slave_alloc is called before the first I/O and slave_destroy
> after the last I/O has finished.  No need for checking
> of it already being allocated in esp_slave_alloc.
>   - there is no need to keep track of per-lun data on your own.
> the midlayer gives you sdev->hostdata for it, and you can
> easily get at it in every place you use the lun data currently
>  
> Doing things properly also avoids the !lp checks in various places.

I did all of this, and it's fine, but there is one site which is much
less pleasant, device reconnect.

With the esp_target_data->lun[] mapping the lookup during device
reconnect was O(1), now I have to use __scsi_device_lookup_by_target()
which is O(num_active_luns).

In fact the efficiency of that lookup was why I did the data
structures the way I did in the first place.

But anyways this is cleaner for now and I doubt it matters for
the setups people have with this chip.

> > +   if (dev->id == esp->scsi_id) {
> > +   cmd->result = DID_NO_CONNECT << 16;
> > +   cmd->scsi_done(cmd);
> > +   return 0;
> > +   }
> 
> This can't happen, no need to check for it.  (And yes, I know some
> drivers like sym53x8xx still have the checks despite me submitting
> patches to get rid of it)

Ok I'll remove that, thanks.

> > +   spriv->u.dma_addr = ~(dma_addr_t)0x0;
> > +   spriv->mapping_type = MAPPING_TYPE_NONE;
> >  
> > +   ent = esp_get_ent(esp);
> > +   if (!ent) {
> > +   cmd->result = (DID_OK << 16) | (QUEUE_FULL << 1);
> > +   cmd->scsi_done(cmd);
> > +   return 0;
> 
> This should not set ->result and call ->scsi_done but rather return
> SCSI_MLQUEUE_HOST_BUSY.

Done.
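
As a rough illustration of the change (not the actual esp.c code), the
out-of-entries path now looks something like the sketch below. The
mock_* types and function are hypothetical stand-ins so the snippet
builds in userspace, and the numeric value given for
SCSI_MLQUEUE_HOST_BUSY is an assumption for the example.

```c
/* Assumed value, defined here only so the sketch is self-contained. */
#define SCSI_MLQUEUE_HOST_BUSY 0x1055

/* Hypothetical stand-ins for the command and host structures. */
struct mock_cmd {
	int result;
	int completed;	/* set when the done callback would have run */
};

struct mock_host {
	int free_ents;	/* command entries still available */
};

/*
 * Queue a command. When no entry is available, report "host busy"
 * and leave the command untouched, so the midlayer can requeue it
 * instead of seeing a completed command with a QUEUE_FULL status.
 */
int mock_queuecommand(struct mock_host *host, struct mock_cmd *cmd)
{
	if (host->free_ents == 0)
		return SCSI_MLQUEUE_HOST_BUSY;

	host->free_ents--;
	cmd->result = 0;
	/* ... hand the command to the hardware here ... */
	return 0;
}
```

The point of the design is that the driver never completes a command it
could not accept; the decision to retry stays with the midlayer.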

> > +   }
> > +   ent->cmd = cmd;
> >  
> > +   if (cmd->cmnd[0] == REQUEST_SENSE)
> > +   list_add(&ent->list, &esp->queued_cmds);
> > +   else
> > +   list_add_tail(&ent->list, &esp->queued_cmds);
> 
> I don't think there's a need to handle REQUEST_SENSE special anymore.

Agreed, and done.

> >  
> > +   esp_maybe_execute_command(esp);
> 
> You still have internal queueing in the driver, and I think this
> is avoidable.  Instead you should just try to directly issue
> the command and return SCSI_MLQUEUE_DEVICE_BUSY/SCSI_MLQUEUE_EH_RETRY
> if you can't do it at this point.  The midlayer keeps proper per-lun
> and per-host busy counters to call into ->queuecommand once the
> next command returned and the lun/host is not busy anymore.

Hmmm, OK.  But which of those two error codes should I use?

Actually I don't think it's avoidable.  I set cmd_per_lun to "2"
and in this way when a command completes I'll always be able to
immediately issue another command for the same device even if
disconnect is disabled.

Otherwise I have to wait for the softi

Re: [PATCH] megaraid: update version reported by MEGAIOC_QDRVRVER

2007-04-24 Thread David Milburn

Christoph Hellwig wrote:
> On Thu, Apr 19, 2007 at 12:10:24PM -0500, David Milburn wrote:
> > Update the driver version reported by MEGAIOC_QDRVRVER to
> > match LSI_COMMON_MOD_VERSION.
> 
> Why does this matter?

It is needed so that the correct driver version is reported
by user-space tools that use the MEGAIOC_QDRVRVER ioctl
to query the driver version.

David



Re: [patch 1/7] libata: check for AN support

2007-04-24 Thread Kristen Carlson Accardi
Check to see if an ATAPI device supports Asynchronous Notification.
If so, enable it.

changes from last version: 
* fix typo in ata_id_has_AN and make word 76 test more clear
* If we fail to set the AN feature, just print a warning and continue
 
Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>

Index: 2.6-git/drivers/ata/libata-core.c
===
--- 2.6-git.orig/drivers/ata/libata-core.c
+++ 2.6-git/drivers/ata/libata-core.c
@@ -70,6 +70,7 @@ const unsigned long sata_deb_timing_long
 static unsigned int ata_dev_init_params(struct ata_device *dev,
u16 heads, u16 sectors);
 static unsigned int ata_dev_set_xfermode(struct ata_device *dev);
+static unsigned int ata_dev_set_AN(struct ata_device *dev);
 static void ata_dev_xfermask(struct ata_device *dev);
 
 static unsigned int ata_print_id = 1;
@@ -1744,6 +1745,22 @@ int ata_dev_configure(struct ata_device 
}
dev->cdb_len = (unsigned int) rc;
 
+   /*
+* check to see if this ATAPI device supports
+* Asynchronous Notification
+*/
+   if ((ap->flags & ATA_FLAG_AN) && ata_id_has_AN(id)) {
+   int err;
+   /* issue SET feature command to turn this on */
+   err = ata_dev_set_AN(dev);
+   if (err)
+   ata_dev_printk(dev, KERN_ERR,
+   "unable to set AN, err %x\n",
+   err);
+   else
+   dev->flags |= ATA_DFLAG_AN;
+   }
+
if (ata_id_cdb_intr(dev->id)) {
dev->flags |= ATA_DFLAG_CDB_INTR;
cdb_intr_string = ", CDB intr";
@@ -3525,6 +3542,42 @@ static unsigned int ata_dev_set_xfermode
 }
 
 /**
+ * ata_dev_set_AN - Issue SET FEATURES - SATA FEATURES
+ *   with sector count set to indicate
+ *   Asynchronous Notification feature
+ * @dev: Device to which command will be sent
+ *
+ * Issue SET FEATURES - SATA FEATURES command to device @dev
+ * on port @ap.
+ *
+ * LOCKING:
+ * PCI/etc. bus probe sem.
+ *
+ * RETURNS:
+ * 0 on success, AC_ERR_* mask otherwise.
+ */
+static unsigned int ata_dev_set_AN(struct ata_device *dev)
+{
+   struct ata_taskfile tf;
+   unsigned int err_mask;
+
+   /* set up set-features taskfile */
+   DPRINTK("set features - SATA features\n");
+
+   ata_tf_init(dev, &tf);
+   tf.command = ATA_CMD_SET_FEATURES;
+   tf.feature = SETFEATURES_SATA_ENABLE;
+   tf.flags |= ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE;
+   tf.protocol = ATA_PROT_NODATA;
+   tf.nsect = SATA_AN;
+
+   err_mask = ata_exec_internal(dev, &tf, NULL, DMA_NONE, NULL, 0);
+
+   DPRINTK("EXIT, err_mask=%x\n", err_mask);
+   return err_mask;
+}
+
+/**
  * ata_dev_init_params - Issue INIT DEV PARAMS command
  * @dev: Device to which command will be sent
  * @heads: Number of heads (taskfile parameter)
Index: 2.6-git/include/linux/ata.h
===
--- 2.6-git.orig/include/linux/ata.h
+++ 2.6-git/include/linux/ata.h
@@ -194,6 +194,12 @@ enum {
SETFEATURES_WC_ON   = 0x02, /* Enable write cache */
SETFEATURES_WC_OFF  = 0x82, /* Disable write cache */
 
+   SETFEATURES_SATA_ENABLE = 0x10, /* Enable use of SATA feature */
+   SETFEATURES_SATA_DISABLE = 0x90, /* Disable use of SATA feature */
+
+   /* SETFEATURE Sector counts for SATA features */
+   SATA_AN = 0x05,  /* Asynchronous Notification */
+
/* ATAPI stuff */
ATAPI_PKT_DMA   = (1 << 0),
ATAPI_DMADIR= (1 << 2), /* ATAPI data dir:
@@ -299,6 +305,8 @@ struct ata_taskfile {
 #define ata_id_queue_depth(id) (((id)[75] & 0x1f) + 1)
 #define ata_id_removeable(id)  ((id)[0] & (1 << 7))
 #define ata_id_has_dword_io(id)((id)[50] & (1 << 0))
+#define ata_id_has_AN(id)  \
+   (((id[76] != 0x) && (id[76] != 0x)) && ((id)[78] & (1 << 5)))
 #define ata_id_iordy_disable(id) ((id)[49] & (1 << 10))
 #define ata_id_has_iordy(id) ((id)[49] & (1 << 9))
 #define ata_id_u32(id,n)   \
Index: 2.6-git/include/linux/libata.h
===
--- 2.6-git.orig/include/linux/libata.h
+++ 2.6-git/include/linux/libata.h
@@ -136,6 +136,7 @@ enum {
ATA_DFLAG_CDB_INTR  = (1 << 2), /* device asserts INTRQ when ready 
for CDB */
ATA_DFLAG_NCQ   = (1 << 3), /* device supports NCQ */
ATA_DFLAG_FLUSH_EXT = (1 << 4), /* do FLUSH_EXT instead of FLUSH */
+   ATA_DFLAG_AN= (1 << 5), /* device supports Async 
notification */
ATA_D
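
The "warn and continue" behaviour described in the changelog can be sketched in user space as follows. Everything prefixed `mock_` or `MOCK_` is hypothetical and exists only for this illustration; the real code path is the ata_dev_configure() hunk in the patch, and the flag values here are not the kernel's.

```c
#include <stdio.h>

/* Hypothetical flag values for this sketch only. */
#define MOCK_PORT_FLAG_AN (1u << 0)
#define MOCK_DEV_FLAG_AN  (1u << 5)

struct mock_dev {
	unsigned int port_flags;  /* what the controller can do */
	int id_has_an;            /* what IDENTIFY data advertises */
	unsigned int set_an_err;  /* error mask the mock SET FEATURES returns */
	unsigned int flags;       /* resulting device flags */
};

/* Stand-in for issuing SET FEATURES; returns 0 or an error mask. */
static unsigned int mock_set_an(const struct mock_dev *dev)
{
	return dev->set_an_err;
}

/*
 * Enable AN only when both the port and the device support it.  A failure
 * is reported but is not fatal: configuration continues without the flag.
 * Returns 0 in every case, since AN is optional.
 */
static int mock_configure_an(struct mock_dev *dev)
{
	unsigned int err_mask;

	if (!(dev->port_flags & MOCK_PORT_FLAG_AN) || !dev->id_has_an)
		return 0;

	err_mask = mock_set_an(dev);
	if (err_mask)
		fprintf(stderr, "unable to set AN, err_mask 0x%x\n", err_mask);
	else
		dev->flags |= MOCK_DEV_FLAG_AN;
	return 0;
}
```

The sketch keeps the error mask in an unsigned variable of its own, so the value that gets printed is the mask itself and never a converted errno.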

Re: [RFC PATCH]: Rewritten ESP driver, porters needed!

2007-04-24 Thread David Miller
From: Christoph Hellwig <[EMAIL PROTECTED]>
Date: Tue, 24 Apr 2007 13:45:27 +0100

> Oh, btw - there is a problem with the generic code being esp.ko -
> we already have drivers/char/esp.c which builds into esp.ko for
> ISA platforms, which have a bit of overlap with ESP-using platforms.
> Maybe the driver should become esp_scsi.c/.ko or ncr_esp or
> ncr53x9x?

What really pisses me off about that one is that I created
esp.c years before that character driver got into the tree
but somehow they won out.

Anyways, what's done is done and I'll rename it to esp_scsi.

Thanks.


Re: [patch 2/7] genhd: expose AN to user space

2007-04-24 Thread Kristen Carlson Accardi
Allow user space to determine if a disk supports Asynchronous Notification
of media changes.  This is done by adding a new sysfs file "capability",
which is documented in Documentation/block/capability.txt.  This sysfs file
exports all disk capability flags to user space.  We also define a new flag
to indicate the media change notification capability.

Changed from last version:
* changed sysfs filename to "capability" from "capability_flags"

Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>

Index: 2.6-git/block/genhd.c
===
--- 2.6-git.orig/block/genhd.c
+++ 2.6-git/block/genhd.c
@@ -370,7 +370,10 @@ static ssize_t disk_size_read(struct gen
 {
return sprintf(page, "%llu\n", (unsigned long long)get_capacity(disk));
 }
-
+static ssize_t disk_capability_read(struct gendisk *disk, char *page)
+{
+   return sprintf(page, "%x\n", disk->flags);
+}
 static ssize_t disk_stats_read(struct gendisk * disk, char *page)
 {
preempt_disable();
@@ -413,6 +416,10 @@ static struct disk_attribute disk_attr_s
.attr = {.name = "size", .mode = S_IRUGO },
.show   = disk_size_read
 };
+static struct disk_attribute disk_attr_capability = {
+   .attr = {.name = "capability", .mode = S_IRUGO },
+   .show   = disk_capability_read
+};
 static struct disk_attribute disk_attr_stat = {
.attr = {.name = "stat", .mode = S_IRUGO },
.show   = disk_stats_read
@@ -453,6 +460,7 @@ static struct attribute * default_attrs[
&disk_attr_removable.attr,
&disk_attr_size.attr,
&disk_attr_stat.attr,
+   &disk_attr_capability.attr,
 #ifdef CONFIG_FAIL_MAKE_REQUEST
&disk_attr_fail.attr,
 #endif
Index: 2.6-git/include/linux/genhd.h
===
--- 2.6-git.orig/include/linux/genhd.h
+++ 2.6-git/include/linux/genhd.h
@@ -94,6 +94,7 @@ struct hd_struct {
 
 #define GENHD_FL_REMOVABLE 1
 #define GENHD_FL_DRIVERFS  2
+#define GENHD_FL_MEDIA_CHANGE_NOTIFY   4
 #define GENHD_FL_CD 8
 #define GENHD_FL_UP 16
 #define GENHD_FL_SUPPRESS_PARTITION_INFO   32
Index: 2.6-git/Documentation/block/capability.txt
===
--- /dev/null
+++ 2.6-git/Documentation/block/capability.txt
@@ -0,0 +1,15 @@
+Generic Block Device Capability
+===
+This file documents the sysfs file block//capability
+
+capability is a hex word indicating which capabilities a specific disk
+supports.  For more information on bits not listed here, see
+include/linux/genhd.h
+
+Capability Value
+---
+GENHD_FL_MEDIA_CHANGE_NOTIFY   4
+   When this bit is set, the disk supports Asynchronous Notification
+   of media change events.  These events will be broadcast to user
+   space via kernel uevent.
+
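
A user-space consumer of the new file might look like the sketch below. It assumes only what the patch states: the file holds `disk->flags` printed with "%x\n", and bit value 4 means media change notification. The helper names `capability_has_an` and `disk_supports_an` are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* From the genhd.h hunk above. */
#define GENHD_FL_MEDIA_CHANGE_NOTIFY 4

/*
 * Parse the hex word printed by the sysfs file ("%x\n") and report
 * whether the media change notification bit is set.
 * Returns 1 if set, 0 if clear, -1 if the text is not a hex number.
 */
static int capability_has_an(const char *text)
{
	char *end;
	unsigned long flags = strtoul(text, &end, 16);

	if (end == text)
		return -1;
	return (flags & GENHD_FL_MEDIA_CHANGE_NOTIFY) ? 1 : 0;
}

/* Read a capability file; the path is supplied by the caller. */
static int disk_supports_an(const char *path)
{
	char buf[32];
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = capability_has_an(buf);
	fclose(f);
	return ret;
}
```

A caller would pass the capability path of the disk it cares about, for example the file under the sysfs block directory of a CD-ROM drive; the disk name is system-specific and is not assumed here.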


Re: [patch 5/7] genhd: send async notification on media change

2007-04-24 Thread Kristen Carlson Accardi
Send an uevent to user space to indicate that a media change event has occurred.

Changes from last version:
* use get/put_device to increment reference count on the device struct

Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>

Index: 2.6-git/block/genhd.c
===
--- 2.6-git.orig/block/genhd.c
+++ 2.6-git/block/genhd.c
@@ -643,6 +643,27 @@ struct seq_operations diskstats_op = {
.show   = diskstats_show
 };
 
+static void media_change_notify_thread(struct work_struct *work)
+{
+   struct gendisk *gd = container_of(work, struct gendisk, async_notify);
+   char event[] = "MEDIA_CHANGE=1";
+   char *envp[] = { event, NULL };
+
+   /*
+* set environment vars to indicate which event this is for
+* so that user space will know to go check the media status.
+*/
+   kobject_uevent_env(&gd->kobj, KOBJ_CHANGE, envp);
+   put_device(gd->driverfs_dev);
+}
+
+void genhd_media_change_notify(struct gendisk *disk)
+{
+   get_device(disk->driverfs_dev);
+   schedule_work(&disk->async_notify);
+}
+EXPORT_SYMBOL_GPL(genhd_media_change_notify);
+
 struct gendisk *alloc_disk(int minors)
 {
return alloc_disk_node(minors, -1);
@@ -672,6 +693,8 @@ struct gendisk *alloc_disk_node(int mino
kobj_set_kset_s(disk,block_subsys);
kobject_init(&disk->kobj);
rand_initialize_disk(disk);
+   INIT_WORK(&disk->async_notify,
+   media_change_notify_thread);
}
return disk;
 }
Index: 2.6-git/include/linux/genhd.h
===
--- 2.6-git.orig/include/linux/genhd.h
+++ 2.6-git/include/linux/genhd.h
@@ -66,6 +66,7 @@ struct partition {
 #include 
 #include 
 #include 
+#include 
 
 struct partition {
unsigned char boot_ind; /* 0x80 - active */
@@ -139,6 +140,7 @@ struct gendisk {
 #else
struct disk_stats dkstats;
 #endif
+   struct work_struct async_notify;
 };
 
 /* Structure for sysfs attributes on block devices */
@@ -419,7 +421,7 @@ extern struct gendisk *alloc_disk_node(i
 extern struct gendisk *alloc_disk(int minors);
 extern struct kobject *get_disk(struct gendisk *disk);
 extern void put_disk(struct gendisk *disk);
-
+extern void genhd_media_change_notify(struct gendisk *disk);
 extern void blk_register_region(dev_t dev, unsigned long range,
struct module *module,
struct kobject *(*probe)(dev_t, int *, void *),
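
The get/put pairing around the deferred work can be modelled in user space as below. All `mock_` names are hypothetical. One detail goes beyond the patch as posted and is offered only as a suggestion: the kernel's schedule_work() reports whether the work was newly queued, so if a second notification arrives while the first is still pending, the extra reference has no handler run to drop it. The sketch releases it immediately in that case.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins for struct device and work_struct. */
struct mock_disk {
	int refcount;   /* models the device reference count */
	bool pending;   /* models a work item that is already queued */
	int events;     /* number of notifications delivered */
};

static void mock_get(struct mock_disk *d) { d->refcount++; }
static void mock_put(struct mock_disk *d) { d->refcount--; }

/* Returns true if newly queued, false if the work was already pending. */
static bool mock_schedule_work(struct mock_disk *d)
{
	if (d->pending)
		return false;
	d->pending = true;
	return true;
}

/* Called from the "interrupt" side. */
static void mock_media_change_notify(struct mock_disk *d)
{
	mock_get(d);
	/* If nothing new was queued, no handler run will drop this ref. */
	if (!mock_schedule_work(d))
		mock_put(d);
}

/* The deferred handler: deliver the event, then drop the reference. */
static void mock_run_work(struct mock_disk *d)
{
	if (!d->pending)
		return;
	d->pending = false;
	d->events++;
	mock_put(d);
}
```

Two back-to-back notifications followed by one handler run leave the count where it started, which is the invariant the get/put pair is meant to preserve.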


Re: [patch 7/7] libata: send event when AN received

2007-04-24 Thread Kristen Carlson Accardi
When we get an SDB FIS with the 'N' bit set, we should send
an event to user space to indicate that there has been a
media change.  This will be done via the block device. 

changed from last version:
* Make sure that port_addr is within ATA_MAX_DEVICES

Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>
Index: 2.6-git/drivers/ata/ahci.c
===
--- 2.6-git.orig/drivers/ata/ahci.c
+++ 2.6-git/drivers/ata/ahci.c
@@ -1147,6 +1147,28 @@ static void ahci_host_intr(struct ata_po
return;
}
 
+   if (status & PORT_IRQ_SDB_FIS) {
+   /*
+* if this is an ATAPI device with AN turned on,
+* then we should interrogate the device to
+* determine the cause of the interrupt
+*
+* for AN, we should check the SDB FIS
+* and find the I and N bits set
+*/
+   const u32 *f = pp->rx_fis + RX_FIS_SDB;
+
+   /* check the 'N' bit in word 0 of the FIS */
+   if (f[0] & (1 << 15)) {
+   int port_addr =  ((f[0] & 0x0f00) >> 8);
+   struct ata_device *adev;
+   if (port_addr < ATA_MAX_DEVICES) {
+   adev = &ap->device[port_addr];
+   if (adev->flags & ATA_DFLAG_AN)
+   ata_scsi_media_change_notify(adev);
+   }
+   }
+   }
if (ap->sactive)
qc_active = readl(port_mmio + PORT_SCR_ACT);
else
Index: 2.6-git/include/linux/libata.h
===
--- 2.6-git.orig/include/linux/libata.h
+++ 2.6-git/include/linux/libata.h
@@ -737,6 +737,7 @@ extern void ata_host_init(struct ata_hos
 extern int ata_scsi_detect(struct scsi_host_template *sht);
 extern int ata_scsi_ioctl(struct scsi_device *dev, int cmd, void __user *arg);
 extern int ata_scsi_queuecmd(struct scsi_cmnd *cmd, void (*done)(struct 
scsi_cmnd *));
+extern void ata_scsi_media_change_notify(struct ata_device *atadev);
 extern void ata_sas_port_destroy(struct ata_port *);
 extern struct ata_port *ata_sas_port_alloc(struct ata_host *,
   struct ata_port_info *, struct 
Scsi_Host *);
Index: 2.6-git/drivers/ata/libata-scsi.c
===
--- 2.6-git.orig/drivers/ata/libata-scsi.c
+++ 2.6-git/drivers/ata/libata-scsi.c
@@ -3057,6 +3057,22 @@ static void ata_scsi_remove_dev(struct a
 }
 
 /**
+ * ata_scsi_media_change_notify - send media change event
+ * @atadev: Pointer to the disk device with media change event
+ *
+ * Tell the block layer to send a media change notification
+ * event.
+ *
+ * LOCKING:
+ * interrupt context, may not sleep.
+ */
+void ata_scsi_media_change_notify(struct ata_device *atadev)
+{
+   genhd_media_change_notify(atadev->sdev->disk);
+}
+EXPORT_SYMBOL_GPL(ata_scsi_media_change_notify);
+
+/**
  * ata_scsi_hotplug - SCSI part of hotplug
  * @work: Pointer to ATA port to perform SCSI hotplug on
  *
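
The bit tests in the ahci_host_intr() hunk can be isolated as the following sketch. The masks are taken from the patch (bit 15 for 'N', bits 8-11 for the port address); the function names are hypothetical, and `MOCK_MAX_DEVICES` is an assumed stand-in for ATA_MAX_DEVICES.

```c
#include <stdint.h>
#include <stdbool.h>

#define MOCK_MAX_DEVICES 2   /* assumption standing in for ATA_MAX_DEVICES */

/* True when the 'N' (notification) bit, bit 15 of FIS word 0, is set. */
static bool sdb_fis_notify(uint32_t word0)
{
	return (word0 & (1u << 15)) != 0;
}

/* Port address carried in bits 8-11 of FIS word 0. */
static unsigned int sdb_fis_port(uint32_t word0)
{
	return (word0 & 0x0f00u) >> 8;
}

/*
 * Returns the device index to notify, or -1 when the FIS carries no
 * notification or the port address is out of range.
 */
static int sdb_fis_an_target(uint32_t word0)
{
	unsigned int port = sdb_fis_port(word0);

	if (!sdb_fis_notify(word0) || port >= MOCK_MAX_DEVICES)
		return -1;
	return (int)port;
}
```

Keeping the range check next to the decode makes it harder to index the device array with an unvalidated port address, which is the fix listed in this version's changelog.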


old ISA DMA bug in 2.6.12?

2007-04-24 Thread Bob Tracy
I was enjoying yet another session of beating my head against the wall
trying to do useful things with old hardware :-), and managed to cause a
kernel panic by simply trying to mount a cdrom in the context of a DSL-N
installation.

The SCSI host adapter is an Adaptec AHA-1542B, and when I try to mount a
cdrom, I manage to run afoul of the BAD_DMA() check in aha1542.c: the
buffer returned is not in the lower 16 MB of memory.

The same 2.6.12 kernel + hardware combination works fine as long as I
confine my I/O to the hard disk that's also attached to the AHA-1542B.
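
The constraint behind the failing check can be illustrated with a small sketch. This is not the code in aha1542.c; `isa_dma_reachable` and `ISA_DMA_LIMIT` are hypothetical names, and the only assumption is the 16 MB limit mentioned above.

```c
#include <stdint.h>
#include <stdbool.h>

/* ISA bus masters drive 24 address lines, so they reach only 16 MiB. */
#define ISA_DMA_LIMIT 0x1000000ull

/* True when [addr, addr + len) lies entirely below the 16 MiB limit. */
static bool isa_dma_reachable(uint64_t addr, uint64_t len)
{
	if (addr >= ISA_DMA_LIMIT)
		return false;
	return len <= ISA_DMA_LIMIT - addr;
}
```

A buffer that starts below the limit but ends above it fails the test just as one allocated entirely in high memory does.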

-- 
---
Bob Tracy   WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---


Re: [patch 1/7] libata: check for AN support

2007-04-24 Thread Olivier Galibert
On Tue, Apr 24, 2007 at 01:53:27PM -0700, Kristen Carlson Accardi wrote:
> Check to see if an ATAPI device supports Asynchronous Notification.
> If so, enable it.
> 
> changes from last version: 
> * fix typo in ata_id_has_AN and make word 76 test more clear
> * If we fail to set the AN feature, just print a warning and continue
>  
> Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>
> 
> @@ -299,6 +305,8 @@ struct ata_taskfile {
>  #define ata_id_queue_depth(id)   (((id)[75] & 0x1f) + 1)
>  #define ata_id_removeable(id)((id)[0] & (1 << 7))
>  #define ata_id_has_dword_io(id)  ((id)[50] & (1 << 0))
> +#define ata_id_has_AN(id)\
> + (((id[76] != 0x) && (id[76] != 0x)) && ((id)[78] & (1 << 5)))

(id)[76] I guess ?  Sorry for being a pain :/

  OG.
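
The parenthesisation point can be shown with a sketch of the macro in fully bracketed form. The 0x0000 and 0xffff constants are an assumption, since the archived patch shows them only as "0x"; `has_an_indirect` is a hypothetical helper. With an argument such as `*idp`, an unbracketed `id[76]` would expand to `*idp[76]`, which indexes the pointer-to-pointer before dereferencing it.

```c
#include <stdint.h>

/*
 * Fully parenthesised form of the macro.  The 0x0000 and 0xffff constants
 * are an assumption: the archived patch shows them only as "0x".
 */
#define ata_id_has_AN(id) \
	((((id)[76] != 0x0000) && ((id)[76] != 0xffff)) && \
	 ((id)[78] & (1 << 5)))

/* Helper for the sketch: evaluate the macro through a pointer-to-pointer. */
static int has_an_indirect(const uint16_t **idp)
{
	/* Expands to (*idp)[76]; without the parentheses it would be *idp[76]. */
	return ata_id_has_AN(*idp) ? 1 : 0;
}
```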