Re: [PATCH 8/8] scsi tgt: IBM eServer i/pSeries virtual SCSI target driver

2006-11-28 Thread FUJITA Tomonori
Hi ibmvscsi team,

Can I get your ACK on this patch to get it merged into the scsi-misc?


From: FUJITA Tomonori <[EMAIL PROTECTED]>
Subject: [PATCH 8/8] scsi tgt: IBM eServer i/pSeries virtual SCSI target driver
Date: Thu, 16 Nov 2006 19:24:26 +0900

> This is IBM Virtual SCSI target driver for tgt. The driver is based on
> the original ibmvscsis driver:
> 
> http://lkml.org/lkml/2005/10/17/99
> 
> 
> Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
> Signed-off-by: Mike Christie <[EMAIL PROTECTED]>
> Signed-off-by: James Bottomley <[EMAIL PROTECTED]>
> ---
>  drivers/scsi/Kconfig |   14 +
>  drivers/scsi/Makefile|1 +
>  drivers/scsi/ibmvscsi/Makefile   |2 +
>  drivers/scsi/ibmvscsi/ibmvstgt.c |  958 ++
>  4 files changed, 975 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
> index e5dd59e..87c51fd 100644
> --- a/drivers/scsi/Kconfig
> +++ b/drivers/scsi/Kconfig
> @@ -804,6 +804,20 @@ config SCSI_IBMVSCSI
> To compile this driver as a module, choose M here: the
> module will be called ibmvscsic.
>  
> +config SCSI_IBMVSCSIS
> + tristate "IBM Virtual SCSI Server support"
> + depends on PPC_PSERIES && SCSI_TGT && SCSI_SRP
> + help
> +   This is the SRP target driver for IBM pSeries virtual environments.
> +
> +   The userspace component needed to initialize the driver and
> +   documentation can be found:
> +
> +   http://stgt.berlios.de/
> +
> +   To compile this driver as a module, choose M here: the
> +   module will be called ibmvstgt.
> +
>  config SCSI_INITIO
>   tristate "Initio 9100U(W) support"
>   depends on PCI && SCSI
> diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
> index a6281fd..854b98b 100644
> --- a/drivers/scsi/Makefile
> +++ b/drivers/scsi/Makefile
> @@ -128,6 +128,7 @@ obj-$(CONFIG_SCSI_NSP32)  += nsp32.o
>  obj-$(CONFIG_SCSI_IPR)   += ipr.o
>  obj-$(CONFIG_SCSI_SRP)   += libsrp.o
>  obj-$(CONFIG_SCSI_IBMVSCSI)  += ibmvscsi/
> +obj-$(CONFIG_SCSI_IBMVSCSIS) += ibmvscsi/
>  obj-$(CONFIG_SCSI_HPTIOP)+= hptiop.o
>  obj-$(CONFIG_SCSI_STEX)  += stex.o
>  
> diff --git a/drivers/scsi/ibmvscsi/Makefile b/drivers/scsi/ibmvscsi/Makefile
> index 4e247b6..6ac0633 100644
> --- a/drivers/scsi/ibmvscsi/Makefile
> +++ b/drivers/scsi/ibmvscsi/Makefile
> @@ -3,3 +3,5 @@ obj-$(CONFIG_SCSI_IBMVSCSI)   += ibmvscsic
>  ibmvscsic-y  += ibmvscsi.o
>  ibmvscsic-$(CONFIG_PPC_ISERIES)  += iseries_vscsi.o 
>  ibmvscsic-$(CONFIG_PPC_PSERIES)  += rpa_vscsi.o 
> +
> +obj-$(CONFIG_SCSI_IBMVSCSIS) += ibmvstgt.o
> diff --git a/drivers/scsi/ibmvscsi/ibmvstgt.c b/drivers/scsi/ibmvscsi/ibmvstgt.c
> new file mode 100644
> index 000..73fcfca
> --- /dev/null
> +++ b/drivers/scsi/ibmvscsi/ibmvstgt.c
> @@ -0,0 +1,958 @@
> +/*
> + * IBM eServer i/pSeries Virtual SCSI Target Driver
> + * Copyright (C) 2003-2005 Dave Boutcher ([EMAIL PROTECTED]) IBM Corp.
> + *  Santiago Leon ([EMAIL PROTECTED]) IBM Corp.
> + *  Linda Xie ([EMAIL PROTECTED]) IBM Corp.
> + *
> + * Copyright (C) 2005-2006 FUJITA Tomonori <[EMAIL PROTECTED]>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
> + * USA
> + */
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "ibmvscsi.h"
> +
> +#define  INITIAL_SRP_LIMIT   16
> +#define  DEFAULT_MAX_SECTORS 512
> +
> +#define  TGT_NAME"ibmvstgt"
> +
> +/*
> + * Hypervisor calls.
> + */
> +#define h_copy_rdma(l, sa, sb, da, db) \
> + plpar_hcall_norets(H_COPY_RDMA, l, sa, sb, da, db)
> +#define h_send_crq(ua, l, h) \
> + plpar_hcall_norets(H_SEND_CRQ, ua, l, h)
> +#define h_reg_crq(ua, tok, sz)\
> + plpar_hcall_norets(H_REG_CRQ, ua, tok, sz);
> +#define h_free_crq(ua) \
> + plpar_hcall_norets(H_FREE_CRQ, ua);
> +
> +/* tmp - will replace with SCSI logging stuff */
> +#define eprintk(fmt, args...)\
> +do { \
> + printk("%s(%d) " fmt, __FUNC

Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2006-11-28 Thread Boaz Harrosh

Mike Christie wrote:

Boaz Harrosh wrote:

Playing with some tests which I admit are not 100% orthodox I have
stumbled upon a bug that raises a serious question:

In the call to scsi_execute_async() in the use_sg case, must the
scatterlist* (pointed to by buffer) map a buffer that's contiguous in
virtual memory or is it allowed to map disjoint segments of memory?


I thought they were contiguous. I think James has said before that they
can be disjoint. When we converted sg it did not look like sg or st
supported disjoint. The main non dio path used a buffer from
get_free_pages so I thought that would always be contiguous. The dio
path then always set the first sg offset, but the rest it set to zero.

How did you hit this problem? Is it with sg or st, or with some other
code? Is it the mmap path maybe?


OK I admit, guilty as charged: I was using it from a kernel driver, the OSD-Initiator from
IBM. The code is unorthodox in mapping user-space iovecs into scatterlist*. I will have
to work around it then. Such a pity, because it saves me a copy on a high-bandwidth
channel. With iSCSI the fix works well, but I guess if the working assumption was
"contiguous", then allowing it here will expose problems in drivers that don't
expect it.

In any case the bio.c patch should go in: 1. A zero-vecs bio cannot be freed with the
current code. 2. It lets the kernel exit gracefully with an error instead of a crash.

Should we at least apply the patch below so people know what happened post-mortem?

diff -Npu /tmp/tmp.5864.0 /home/bharrosh/p4.local/local/scsi-misc-2.6-dev/linux/drivers/scsi/scsi_lib.c -L a/scsi_lib.c -L b/scsi_lib.c
--- a/scsi_lib.c
+++ b/scsi_lib.c
@@ -321,6 +321,9 @@ static int scsi_req_map_sg(struct reques
nr_vecs = min_t(int, BIO_MAX_PAGES, nr_pages);
nr_pages -= nr_vecs;

+/* most probably not a contiguous memory mapping */
+BUG_ON(!nr_vecs);
+

bio = bio_alloc(gfp, nr_vecs);
if (!bio) {
err = -ENOMEM;
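
The zero nr_vecs case can be illustrated with a small userspace sketch. It assumes, as the contiguity discussion in this thread implies, that the mapping code sizes its bio from the total length plus the first entry's offset. The `struct seg` type and both helper names are hypothetical stand-ins, not kernel code.

```c
#include <assert.h>

#define PAGE_SIZE  4096u
#define PAGE_SHIFT 12

/* Hypothetical stand-in for one scatterlist entry. */
struct seg {
    unsigned int offset;   /* byte offset into the page, < PAGE_SIZE */
    unsigned int length;   /* length in bytes */
};

/* Page count if the whole list is treated as one contiguous buffer. */
static unsigned int pages_if_contiguous(const struct seg *s, int n)
{
    unsigned int total = 0;

    assert(n > 0);
    for (int i = 0; i < n; i++)
        total += s[i].length;
    return (total + s[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT;
}

/* Page fragments needed when every entry is walked on its own. */
static unsigned int pages_per_segment(const struct seg *s, int n)
{
    unsigned int count = 0;

    for (int i = 0; i < n; i++) {
        assert(s[i].offset < PAGE_SIZE);
        if (s[i].length == 0)
            continue;
        count += (s[i].offset + s[i].length + PAGE_SIZE - 1) >> PAGE_SHIFT;
    }
    return count;
}
```

For three disjoint 100-byte entries that each start at offset 4000, the contiguous estimate is 2 pages while walking the entries needs 6 fragments, so a budget computed the first way runs out before the list does.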


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2006-11-28 Thread Mike Christie
Boaz Harrosh wrote:
> Mike Christie wrote:
>> Boaz Harrosh wrote:
>>> Playing with some tests which I admit are not 100% orthodox I have
>>> stumbled upon a bug that raises a serious question:
>>>
>>> In the call to scsi_execute_async() in the use_sg case, must the
>>> scatterlist* (pointed to by buffer) map a buffer that's contiguous in
>>> virtual memory or is it allowed to map disjoint segments of memory?
>>
>> I thought they were contiguous. I think James has said before that they
>> can be disjoint. When we converted sg it did not look like sg or st
>> supported disjoint. The main non dio path used a buffer from
>> get_free_pages so I thought that would always be contiguous. The dio
>> path then always set the first sg offset, but the rest it set to zero.
>>
>> How did you hit this problem? Is it with sg or st, or with some other
>> code? Is it the mmap path maybe?
> 
> OK I admit, guilty as charged, I was using it from a kernel driver,
> OSD-Initiator from IBM. The code is unorthodox in mapping user space
> iovecs into scatterlist*. I will have to work around it then.

Well, you do not have to work around it :)

I want to kill scsi_execute_async and just allow the ULDs to allocate a
request, call blk_rq_map_* (and add any new map helpers we need), then
call blk_execute_rq_nowait. This gives the ULDs some flexibility and
kills my ugly function. This is what I originally did here
http://marc.theaimsgroup.com/?l=linux-scsi&m=112356952007369&w=2

For some reason, I flip flopped and went with scsi_execute_async and the
scatterlist argument hack. I think I did this because I thought there would
be fewer problems if we converted the ULDs in stages. First stage was to
remove scsi_request usage and clean/fix up scsi-ml and LLDs, next would
be to convert to block layer functions directly, but looking back it
might have been better to just go through one big headache.

I think Christoph Hellwig has patches to remove scsi_execute_async as
part of his bidi work. He needs help testing and reviewing them, so you
should help him out instead of working around it :)


Re: aic94xx panic on module load

2006-11-28 Thread Darrick J. Wong
Mark Haverkamp wrote:
> I got this panic when loading the aic94xx module.  The adapter is
> connected to an HP MSA50 SAS enclosure with 3 72GB SAS disks.
> 
> Kernel 2.6.19-rc6-scsi-misc on an x86_64

> sas: task finished with resp:0x0, stat:0x89
> sas: sas_discover_sata() for device 500508b300a27a2c at 500508b300a27a2f:0xc 
> returned 0xff06
> kobject_add failed for port-2:0:12 with -EEXIST, don't try to register things 
> with the same name in the same directory.

Your expander is reporting your SAS disks to aic94xx as SATA disks,
which is why the sas_discover_sata fails.  I don't know why it would do
that... flaky hardware?  I'm not really sure what to do when we're given
bad information.

> Kernel BUG at drivers/scsi/libsas/sas_expander.c:603

I believe this BUG is fixed by a few patches in aic94xx-sas.  For sure
you'll want the patch named "libsas: better error handling in
sas_ex_discover_end_dev()" patch; see commit
82f6bc0849b6fce9a965dde11dd6f685adc7285e.

There are some dependencies:
e384a0bdd9d3abb5ba2f6eac9ac4d0ac61e1c6a1 ->
1f8787b198c4ba058a0bfc06c2ca7f301168a5dd ->
82f6bc0849b6fce9a965dde11dd6f685adc7285e.

--D


remove_on_dev_loss module parameter in scsi_transport_fc

2006-11-28 Thread Qi, Yanling
Hi All,

I see that the SLES10 2.6.16.21-0.8-smp kernel supports a new module parameter,
"remove_on_dev_loss", in scsi_transport_fc. The parameter description is:

remove_on_dev_loss:Boolean.  When the device loss timer fires, this
variable controls whether the scsi infrastructure for the target device
is removed.  Values: zero means do not remove, non-zero means remove.
Default is zero. (int)

This module parameter does not seem to be available in the upstream kernel. Is
it a SUSE-private module parameter, or will the upstream kernel support
it soon?

Thanks,

Yanling

Yanling Qi
Engenio Storage Group - LSI Logic
512-794-3713 (Office)
512-794-3702 (Fax)
[EMAIL PROTECTED]



Re: aic94xx panic on module load

2006-11-28 Thread Douglas Gilbert
Mark Haverkamp wrote:
> I got this panic when loading the aic94xx module.  The adapter is
> connected to an HP MSA50 SAS enclosure with 3 72GB SAS disks.
> 
> Kernel 2.6.19-rc6-scsi-misc on an x86_64
> 
> ---
> 
> 
> aic94xx: Adaptec aic94xx SAS/SATA driver version 1.0.2 loaded
> aic94xx: found Adaptec AIC-9410W SAS/SATA Host Adapter, device :08:01.0
> aic94xx: BIOS present (1,2), 1673
> aic94xx: ue num:3, ue size:88
> aic94xx: manuf sect SAS_ADDR 5d100045af00


> sas: phy1 matched wide port0
> sas: phy1 added to port0, phy_mask:0x3
> sas: phy2 matched wide port0
> sas: phy2 added to port0, phy_mask:0x7
> sas: phy3 matched wide port0
> sas: phy3 added to port0, phy_mask:0xf
> sas: DOING DISCOVERY on port 0, pid:3524
> sas: ex 500508b300a27a2f phy00:T attached: 500508b300a27a3f
> sas: ex 500508b300a27a2f phy01:T attached: 500508b300a27a3f
> sas: ex 500508b300a27a2f phy02:T attached: 
> sas: ex 500508b300a27a2f phy03:T attached: 
> sas: ex 500508b300a27a2f phy04:S attached: 5d100045af00
> sas: ex 500508b300a27a2f phy05:S attached: 5d100045af00
> sas: ex 500508b300a27a2f phy06:S attached: 5d100045af00
> sas: ex 500508b300a27a2f phy07:S attached: 5d100045af00
> sas: ex 500508b300a27a2f phy08:T attached: 
> sas: ex 500508b300a27a2f phy09:T attached: 
> sas: ex 500508b300a27a2f phy10:T attached: 
> sas: ex 500508b300a27a2f phy11:T attached: 
> sas: ex 500508b300a27a2f phy12:D attached: 500508b300a27a2c
> sas: ex 500508b300a27a3f phy00:D attached: 5000c595f8b5
> sas: ex 500508b300a27a3f phy01:D attached: 
> sas: ex 500508b300a27a3f phy02:D attached: 5000c595d3b5
> sas: ex 500508b300a27a3f phy03:D attached: 
> sas: ex 500508b300a27a3f phy04:D attached: 5000c595c0b9
> sas: ex 500508b300a27a3f phy05:D attached: 
> sas: ex 500508b300a27a3f phy06:D attached: 
> sas: ex 500508b300a27a3f phy07:D attached: 
> sas: ex 500508b300a27a3f phy08:D attached: 
> sas: ex 500508b300a27a3f phy09:D attached: 
> sas: ex 500508b300a27a3f phy10:S attached: 500508b300a27a2f
> sas: ex 500508b300a27a3f phy11:S attached: 500508b300a27a2f
> sas: task finished with resp:0x0, stat:0x89
> sas: sas_discover_sata() for device 500508b300a27a2c at 500508b300a27a2f:0xc 
> returned 0xff06
> kobject_add failed for port-2:0:12 with -EEXIST, don't try to register things 
> with the same name in the same directory.

So this is an interesting expander setup within the enclosure.
There are two expanders (500508b300a27a2f + 500508b300a27a3f)
interconnected via a two wide link (0,1 <-> 10,11 (T-S)) with
a four wide link back to the 94xx HBA (4,5,6,7 <-> 0,1,2,3).
My guess is that 500508b300a27a2f:12 is virtual and contains a
SES target. That leaves SAS disks on 500508b300a27a3f:0,
500508b300a27a3f,2 and 500508b300a27a3f,4

The pain starts immediately after the sas transport layer
tries to process those expander SMP DISCOVER responses.
The trace seems to suggest the device at 500508b300a27a2f:12
is SATA: extremely unlikely.

Mark, do you have a LSI MPT Fusion SAS HBA handy? If
so you might connect the enclosure to it, get smp_utils
and do something like:
 # modprobe mptctl
 # smp_discover -p 12 -s 0x500508b300a27a2f /dev/mptctl

and post the output.


BTW Darrick, SATA disks connected to an expander usually
get SAS addresses like  where
"n" is small. The device attached to 500508b300a27a2f:12
is in that region: 500508b300a27a2c
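
One way to make "in that region" concrete is to look at the numeric distance between the expander's SAS address and the attached device's address. The sketch below is only an illustration: both function names and the `max_delta` threshold are hypothetical, and it does not claim to reproduce the exact address pattern Doug refers to.

```c
#include <stdint.h>

/* Absolute numeric distance between two 64-bit SAS addresses. */
static uint64_t sas_addr_distance(uint64_t a, uint64_t b)
{
    return a > b ? a - b : b - a;
}

/*
 * Hypothetical heuristic: treat a device address as "near" the expander
 * address when the two differ by at most max_delta.
 */
static int sas_addr_near(uint64_t expander, uint64_t device, uint64_t max_delta)
{
    return sas_addr_distance(expander, device) <= max_delta;
}
```

With the addresses from the log, 0x500508b300a27a2f and 0x500508b300a27a2c differ by 3, while the second expander at 0x500508b300a27a3f is 16 away from the first.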

Doug Gilbert




Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2006-11-28 Thread Jens Axboe
On Mon, Nov 27 2006, Mike Christie wrote:
> Mike Christie wrote:
> > Boaz Harrosh wrote:
> >> Playing with some tests which I admit are not 100% orthodox I have
> >> stumbled upon a bug that raises a serious question:
> >>
> >> In the call to scsi_execute_async() in the use_sg case, must the
> >> scatterlist* (pointed to by buffer) map a buffer that's contiguous in
> >> virtual memory or is it allowed to map disjoint segments of memory?
> > 
> > I thought they were contiguous. I think James has said before that they
> > can be disjoint. When we converted sg it did not look like sg or st
> > supported disjoint. The main non dio path used a buffer from
> > get_free_pages so I thought that would always be contiguous. The dio
> > path then always set the first sg offset, but the rest it set to zero.
> 
> And the len is set to page size for the middle entries too.
> 
> But for the non DIO st path we can end up with some middle sg entries
> that are not a full page so that code in scsi_execute_async is broken
> for that.

If something doesn't work with non-contig sg entries, that would be a
bug. If the question is regarding holes in the sg list, that is probably
uncharted territory and I would not regard that as supported.

-- 
Jens Axboe



Re: aic94xx panic on module load

2006-11-28 Thread Mark Haverkamp
On Tue, 2006-11-28 at 13:44 -0500, Douglas Gilbert wrote:
> Mark Haverkamp wrote:
> > I got this panic when loading the aic94xx module.  The adapter is
> > connected to an HP MSA50 SAS enclosure with 3 72GB SAS disks.
> > 
> > Kernel 2.6.19-rc6-scsi-misc on an x86_64

[ ... ]

> So this is an interesting expander setup within the enclosure.
> There are two expanders (500508b300a27a2f + 500508b300a27a3f)
> interconnected via a two wide link (0,1 <-> 10,11 (T-S)) with
> a four wide link back to the 94xx HBA (4,5,6,7 <-> 0,1,2,3).
> My guess is that 500508b300a27a2f:12 is virtual and contains a
> SES target. That leaves SAS disks on 500508b300a27a3f:0,
> 500508b300a27a3f,2 and 500508b300a27a3f,4
> 
> The pain starts immediately after the sas transport layer
> tries to process those expander SMP DISCOVER responses.
> The trace seems to suggest the device at 500508b300a27a2f:12
> is SATA: extremely unlikely.
> 
> Mark, do you have a LSI MPT Fusion SAS HBA handy? If
> so you might connect the enclosure to it, get smp_utils
> and do something like:
>  # modprobe mptctl
>  # smp_discover -p 12 -s 0x500508b300a27a2f /dev/mptctl
> 
> and post the output.

I do have one on another machine.  I'll get it installed in this machine
and give it a try.  We do have LSI cards connected to these kinds of
enclosures and disks and they seem to be working OK.

Mark.

> 
> BTW Darrick, SATA disks connected to an expander usually
> get SAS addresses like  where
> "n" is small. The device attached to 500508b300a27a2f:12
> is in that region: 500508b300a27a2c
> 
> Doug Gilbert
> 
> 
-- 
Mark Haverkamp <[EMAIL PROTECTED]>



Re: Disable SCSI-Reservation at the driver level ?

2006-11-28 Thread Douglas Gilbert
James Bottomley wrote:
> On Sun, 2006-11-26 at 17:31 +0100, roland wrote:
>> VMWare ESX refuses to create VMFS Filesystem on SATA disk, attached to a 
>> onBoard SAS controller (lsi1068).
>> When i raid1 two SATA disks, it works, if i use a single SATA disk, the 
>> controller seems to "expose" the disk differently to the operating system 
>> and creation of a VMFS fails due to missing ability to issue SCSI 
>> reservation command.
> 
> There's no SCSI fix for this ... the SAT has no translation for the SCSI
> reservation commands, largely because there's no corresponding ATA
> equivalent and even for SCSI devices they may fail anyway.  The
> application should cope with such a failure, so in this case it's the
> application that needs fixing.

SAT originally did have persistent reservations; the feature
was dropped and is back on the agenda for SAT-2. A SAT
layer (such as the one found in libata) can do more
than just translate commands; it may also emulate SCSI
commands.

And PERSISTENT RESERVE IN and OUT (and maybe the older
RESERVE and RELEASE) would be very good candidates for
emulation. To do this however libata would need to be
a lot more transport aware than it is now. To do such
an emulation a SAT layer needs to know:
  a) whether it has full control over the SATA device
 (i.e. there is no other path to it) and failing
 that, it has some other mechanism such as
 affiliations in SAS with SMP available to control
 them
  b) the identity of the initiator (port) asking for
 the reservation.

If libata could do this it would add a lot of value
over and above simple command translation.
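
To show why point b) matters, here is a minimal userspace sketch of RESERVE/RELEASE-style bookkeeping keyed on the initiator's identity. It is not libata code: every type and function name is hypothetical, and it assumes condition a) already holds, that is, the emulating layer sees every path to the device.

```c
#include <stdint.h>

#define NO_HOLDER 0ULL   /* hypothetical: 0 means no reservation is held */

enum resv_result { RESV_OK, RESV_CONFLICT };

/* Per-device emulation state kept by the (hypothetical) SAT layer. */
struct resv_state {
    uint64_t holder;     /* identity of the initiator port holding it */
};

/* Emulated RESERVE: succeeds if free or already held by this initiator. */
static enum resv_result emu_reserve(struct resv_state *st, uint64_t initiator)
{
    if (st->holder != NO_HOLDER && st->holder != initiator)
        return RESV_CONFLICT;
    st->holder = initiator;
    return RESV_OK;
}

/* Emulated RELEASE: only the holder clears it; others are ignored. */
static void emu_release(struct resv_state *st, uint64_t initiator)
{
    if (st->holder == initiator)
        st->holder = NO_HOLDER;
}

/* Gate for ordinary commands while a reservation is in place. */
static enum resv_result emu_check_io(const struct resv_state *st,
                                     uint64_t initiator)
{
    if (st->holder != NO_HOLDER && st->holder != initiator)
        return RESV_CONFLICT;
    return RESV_OK;
}
```

Without a reliable initiator identity none of these checks can be made, which is why the emulation needs more transport awareness than plain command translation.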

Doug Gilbert




Re: Fw: aic94xx breaks with SATA drives that have medium errors

2006-11-28 Thread Darrick J. Wong
> Everything works okay until I perform a read I/O to the media-error-causing
> location. Immediately I get:
> 
> aic94xx: escb_tasklet_complete: phy2: REQ_TASK_ABORT

Interesting that you get REQ_TASK_ABORT for a media error...

> But the I/O only returns to the SCSI layer after its full designated
> timeout, instead of returning quickly with MEDIUM_ERROR.

Yep.  The abort function doesn't know how to tell libata to abort the
command.  I suppose the "proper" thing to do would be to modify
sas_ata_task_done to check if the SAS_TASK_ABORTED or
SAS_TASK_INITIATOR_ABORTED flags are set and send some sort of ATA error
code back that would cause a retry.  Though, I don't see why the
sequencer sends back REQ_TASK_ABORT--presumably the drive generates some
media error data that could be fed to libata.

> After that particular I/O fails, every I/O to the driver will immediately
> return as aborted. Unloading and loading the driver reverses the problem
> but may crash the kernel not long after printing this:
> 
> Nov 28 02:13:58 pro210 kernel: aic94xx: Uh-oh! Pending is not empty!
> Nov 28 02:13:58 pro210 kernel: aic94xx: freeing from pending

Yep.  Side effect of above.  I'll send you a patch later today when I
get this sorted out.  In any case, thank you for testing out the driver! :)

--D


[PATCH] Fix race condition between scsi scan and sdev block/unblock

2006-11-28 Thread Bino . Sebastian
James,

Our testing has encountered an error between sdev initialization/
scanning and the sdev block/unblock behavior. What we have seen is that 
new target detection will kick off a scan, and that an sdev will be in 
the creation process with the state SDEV_CREATED. At this point a link 
event occurs, which blocks the sdev, changes its state to SDEV_BLOCK, 
and stops its request queue.  However, the creation thread is still 
executing, and decides to transition the sdev state to SDEV_RUNNING. 
Note that the request queue is still blocked. The sdev then gets unblocked, 
attempting to change the state to SDEV_RUNNING, which fails as it is already 
SDEV_RUNNING, which causes the unblock routine to bypass the call to 
blk_start_queue().

This patch modifies the creation path so that it only changes to SDEV_RUNNING
if the state is SDEV_CREATED. This allows the block/unblock to work 
appropriately. It does have a side effect that unblock could early-transition 
the sdev to SDEV_RUNNING.

-- bino 

Signed-off-by: Bino Sebastian <[EMAIL PROTECTED]>

diff -upNr a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
--- a/drivers/scsi/scsi_scan.c  2006-11-20 16:40:21.977901000 -0500
+++ b/drivers/scsi/scsi_scan.c  2006-11-21 09:08:41.23268 -0500
@@ -762,7 +762,8 @@ static int scsi_add_lun(struct scsi_devi
 
/* set the device running here so that slave configure
 * may do I/O */
-   scsi_device_set_state(sdev, SDEV_RUNNING);
+   if (sdev->sdev_state == SDEV_CREATED)
+   scsi_device_set_state(sdev, SDEV_RUNNING);
 
if (*bflags & BLIST_MS_192_BYTES_FOR_3F)
sdev->use_192_bytes_for_3f = 1;
diff -upNr a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
--- a/drivers/scsi/scsi_sysfs.c 2006-11-20 16:40:21.988903000 -0500
+++ b/drivers/scsi/scsi_sysfs.c 2006-11-21 09:08:41.24568 -0500
@@ -676,7 +676,8 @@ int scsi_sysfs_add_sdev(struct scsi_devi
 {
int error, i;
 
-   if ((error = scsi_device_set_state(sdev, SDEV_RUNNING)) != 0)
+   if ((sdev->sdev_state == SDEV_CREATED) &&
+   ((error = scsi_device_set_state(sdev, SDEV_RUNNING)) != 0))
return error;
 
error = device_add(&sdev->sdev_gendev);
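
The race described in this message can be replayed with a small userspace model. The state names come from the patch; everything else (the `struct sdev` layout, the helper names, and the reduced transition rule) is a simplification made up for illustration and models only the transitions mentioned above.

```c
#include <stdbool.h>

enum sdev_state { SDEV_CREATED, SDEV_RUNNING, SDEV_BLOCK };

/* Hypothetical, reduced model of a SCSI device and its request queue. */
struct sdev {
    enum sdev_state state;
    bool queue_stopped;
};

/* Simplified rule: RUNNING -> RUNNING is rejected, all else is allowed. */
static int set_state(struct sdev *d, enum sdev_state next)
{
    if (next == SDEV_RUNNING && d->state == SDEV_RUNNING)
        return -1;
    d->state = next;
    return 0;
}

/* Link event: block the device and stop its queue. */
static void block(struct sdev *d)
{
    if (set_state(d, SDEV_BLOCK) == 0)
        d->queue_stopped = true;
}

/* Unblock: the queue is restarted only if the state change succeeds. */
static void unblock(struct sdev *d)
{
    if (set_state(d, SDEV_RUNNING) == 0)
        d->queue_stopped = false;
}

/* Creation path: the patched variant only acts on SDEV_CREATED. */
static void finish_create(struct sdev *d, bool patched)
{
    if (!patched || d->state == SDEV_CREATED)
        set_state(d, SDEV_RUNNING);
}

/* Replays create -> block -> finish_create -> unblock. */
static bool queue_stuck_after_race(bool patched)
{
    struct sdev d = { SDEV_CREATED, false };

    block(&d);
    finish_create(&d, patched);
    unblock(&d);
    return d.queue_stopped;
}
```

In this model the unpatched creation path leaves the queue stopped after the unblock, while the patched path leaves the device blocked until the unblock restarts the queue.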





Re: aic94xx panic on module load

2006-11-28 Thread Mark Haverkamp
On Tue, 2006-11-28 at 13:44 -0500, Douglas Gilbert wrote:

[ ... ]

> So this is an interesting expander setup within the enclosure.
> There are two expanders (500508b300a27a2f + 500508b300a27a3f)
> interconnected via a two wide link (0,1 <-> 10,11 (T-S)) with
> a four wide link back to the 94xx HBA (4,5,6,7 <-> 0,1,2,3).
> My guess is that 500508b300a27a2f:12 is virtual and contains a
> SES target. That leaves SAS disks on 500508b300a27a3f:0,
> 500508b300a27a3f,2 and 500508b300a27a3f,4
> 
> The pain starts immediately after the sas transport layer
> tries to process those expander SMP DISCOVER responses.
> The trace seems to suggest the device at 500508b300a27a2f:12
> is SATA: extremely unlikely.
> 
> Mark, do you have a LSI MPT Fusion SAS HBA handy? If
> so you might connect the enclosure to it, get smp_utils
> and do something like:
>  # modprobe mptctl
>  # smp_discover -p 12 -s 0x500508b300a27a2f /dev/mptctl
> 
> and post the output.

I'm not sure how to interpret this, but it looks like something didn't
work right.  

After the modprobe I see this:

Nov 28 13:42:28 odt2-003 kernel: Fusion MPT misc device (ioctl) driver 3.04.02
Nov 28 13:42:28 odt2-003 kernel: mptctl: Registered with Fusion MPT base driver
Nov 28 13:42:28 odt2-003 kernel: mptctl: /dev/mptctl @ (major,minor=10,220)


Then running the smp_discover command: 
# ./smp_discover -p 12 -s 0x500508b300a27a2f /dev/mptctl
smp_send_req failed, res=-1

gets this in the log:
Nov 28 13:43:17 odt2-003 kernel: mptbase: ioc0: IOCStatus(0x0001): Invalid Function


> 
-- 
Mark Haverkamp <[EMAIL PROTECTED]>



Integrated aic7902B dual-channel SCSI adapter - high load average. Lots of processes in uninterruptible sleep

2006-11-28 Thread Michael Ulitskiy
Hello,

I have a problem that I believe might be related to the aic79xx driver.
I'm running a 2.4.33.3 kernel with the default aic79xx driver, version 1.3.10, no options.
There are 2 drives in software RAID 1, HostRAID disabled:

[EMAIL PROTECTED]:~# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: COMPAQ   Model: BD3008A4C6   Rev: HPB4
  Type:   Direct-AccessANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: COMPAQ   Model: BD3008A4C6   Rev: HPB4
  Type:   Direct-AccessANSI SCSI revision: 03

[EMAIL PROTECTED]:~# cat /proc/scsi/aic79xx/0
Adaptec AIC79xx driver version: 1.3.10
Adaptec AIC7902 Ultra320 SCSI adapter
aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs
Allocated SCBs: 64, SG List Length: 102

Serial EEPROM:
0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8
0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8
0x09f4 0x0146 0x2807 0x0010 0x 0x 0x 0x
0x 0x 0x 0x 0x 0x 0x0430 0xb3f7

Target 0 Negotiation Settings
User: 320.000MB/s transfers (160.000MHz RDSTRM|DT|IU|RTI|QAS, 16bit)
Goal: 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
Curr: 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
Transmission Errors 0
Channel A Target 0 Lun 0 Settings
Commands Queued 8762259
Commands Active 3
Command Openings 29
Max Tagged Openings 32
Device Queue Frozen Count 0
Target 1 Negotiation Settings
User: 320.000MB/s transfers (160.000MHz RDSTRM|DT|IU|RTI|QAS, 16bit)
Goal: 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
Curr: 320.000MB/s transfers (160.000MHz DT|IU|RTI|QAS, 16bit)
Transmission Errors 0
Channel A Target 1 Lun 0 Settings
Commands Queued 8784600
Commands Active 2
Command Openings 30
Max Tagged Openings 32
Device Queue Frozen Count 0
...

The machine is a mail server that was recently migrated to the new hardware,
and now the load average is constantly between 10 and 20 and sometimes
climbs higher. top and vmstat show lots of processes in uninterruptible
sleep, which, as far as I understand, means that the processes are waiting for
driver code to complete:

[EMAIL PROTECTED]:~# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0 12      0 316004 100400 1026312    0    0   117    55  180  183  4  4 92  0
 1  3      0 337004 100400 1026336    0    0    24  2052  526 1481  4  2 94  0
 0 14      0 334832 100404 1026432    0    0     0  1528  510 1059  4  2 94  0
 0  8      0 336312 100408 1026424    0    0     0  1868  494 1198  6  4 90  0
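
The "b" column above counts tasks in uninterruptible sleep, which Linux also includes in the load average. A minimal sketch for counting them directly, assuming a Linux /proc where each /proc/PID/stat line has the form "pid (comm) state ...": both function names are hypothetical, and only processes (not individual threads) are scanned.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the state character from one /proc/PID/stat line, or '?'.
 * The command name may contain spaces or ')', so search from the end.
 */
static char stat_line_state(const char *line)
{
    const char *p = strrchr(line, ')');

    if (p == NULL || p[1] != ' ' || p[2] == '\0')
        return '?';
    return p[2];
}

/* Count processes currently in state 'D'; returns -1 if /proc is missing. */
static int count_d_state_tasks(void)
{
    DIR *proc = opendir("/proc");
    struct dirent *de;
    int count = 0;

    if (proc == NULL)
        return -1;
    while ((de = readdir(proc)) != NULL) {
        char path[300], line[512];
        FILE *f;

        if (!isdigit((unsigned char)de->d_name[0]))
            continue;                 /* not a PID directory */
        snprintf(path, sizeof(path), "/proc/%s/stat", de->d_name);
        f = fopen(path, "r");
        if (f == NULL)
            continue;                 /* process exited in the meantime */
        if (fgets(line, sizeof(line), f) != NULL &&
            stat_line_state(line) == 'D')
            count++;
        fclose(f);
    }
    closedir(proc);
    return count;
}
```

Sampling this count once per second alongside vmstat would show whether the blocked tasks are the same few processes or a rotating set.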

I wonder if anyone has seen anything like this and/or can advise a cure.
Do you think upgrading the driver to the latest provided by Adaptec (2.0.20, I guess)
should help? Any other buttons to push?
There's another server running alongside with a similar I/O load on 2 SATA drives in software
RAID 1, and its load average never gets above 0.5.
Please let me know if you need more info.
Thanks a lot,

Michael


Re: aic94xx panic on module load

2006-11-28 Thread Mark Haverkamp
On Tue, 2006-11-28 at 13:46 -0800, Mark Haverkamp wrote:
> On Tue, 2006-11-28 at 13:44 -0500, Douglas Gilbert wrote:
> 
> [ ... ]
> 

I don't know if this helps, but I found the verbose option.  Here is a
little debug output.


./smp_discover -v  -p 12 -s 0x500508b300a27a2f /dev/mptctl
Discover request: 40 10 00 02 00 00 00 00 00 0c 00 00 00 00 00 00
send_req_mpt: subvalue=0  SAS address=0x500508b300a27a2f
mptctl two scatter gather list interface
IOCStatus=0x1
IOCStatus=0x1 IOCLogInfo=0xA27A2F SASStatus=0x0
smp_send_req failed, res=-1
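
For readers decoding the "Discover request" bytes above: in an SMP request frame, byte 0 is the frame type (0x40 for a request), byte 1 is the function (0x10 for DISCOVER), and byte 9 carries the phy identifier, here 0x0c for "-p 12". The sketch below rebuilds the same 16 bytes; the function and macro names are hypothetical, and byte 3 is simply copied from the logged output without interpreting it.

```c
#include <stdint.h>
#include <string.h>

#define SMP_FRAME_TYPE_REQUEST 0x40
#define SMP_FUNC_DISCOVER      0x10
#define DISCOVER_REQ_LEN       16    /* number of bytes shown in the log */

/* Fill req with a DISCOVER request for the given phy identifier. */
static void build_discover_req(uint8_t req[DISCOVER_REQ_LEN], uint8_t phy_id)
{
    memset(req, 0, DISCOVER_REQ_LEN);
    req[0] = SMP_FRAME_TYPE_REQUEST;
    req[1] = SMP_FUNC_DISCOVER;
    req[3] = 0x02;      /* value taken from the logged request as-is */
    req[9] = phy_id;    /* phy to query on the expander */
}
```

The request itself looks well formed, which is consistent with the failure being reported by the HBA (IOCStatus=0x1) rather than by the expander.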


> 
> > 
-- 
Mark Haverkamp <[EMAIL PROTECTED]>



[PATCH] Pass struct dev pointer to dma_cache_sync()

2006-11-28 Thread Ralf Baechle
So following the previous patch to pass a struct dev pointer to
dma_is_consistent() there is still dma_cache_sync left which does not
receive a dev pointer.

  Ralf

-

Pass struct dev pointer to dma_cache_sync()

dma_cache_sync() is ill-designed in that it does not have a struct
device pointer argument which makes proper support for systems that consist
of a mix of coherent and non-coherent DMA devices hard.  Change
dma_cache_sync to take a struct device pointer as first argument and fix
all its callers to pass it.
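
A userspace sketch of why the extra argument helps on mixed systems: with a device pointer, the sync routine can decide per device whether any cache maintenance is needed. The `struct device` here is a hypothetical stand-in with a single flag, the direction argument is omitted for brevity, and `sketch_dma_cache_sync` is not the kernel function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, userspace-only stand-in for struct device. */
struct device {
    bool dma_coherent;   /* true if DMA from this device is cache coherent */
};

/* Counts simulated cache maintenance operations. */
static unsigned long cache_ops;

/* Skip the flush for coherent devices; perform it for all others. */
static void sketch_dma_cache_sync(struct device *dev, void *vaddr, size_t size)
{
    (void)vaddr;         /* a real implementation would flush this range */
    if (dev->dma_coherent || size == 0)
        return;
    cache_ops++;
}
```

Without the device pointer the routine has to make one global choice, which is wrong for one class of device whenever coherent and non-coherent devices share a system.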

Signed-off-by: Ralf Baechle <[EMAIL PROTECTED]>

---
 Documentation/DMA-API.txt |2 -
 arch/avr32/mm/dma-coherent.c  |2 -
 arch/mips/mm/dma-coherent.c   |2 -
 arch/mips/mm/dma-ip27.c   |2 -
 arch/mips/mm/dma-ip32.c   |3 +
 arch/mips/mm/dma-noncoherent.c|3 +
 drivers/net/lasi_82596.c  |   94 +++--
 drivers/scsi/53c700.c |   80 +--
 drivers/scsi/53c700.h |   16 +++---
 drivers/serial/mpsc.c |   22 -
 include/asm-alpha/dma-mapping.h   |2 -
 include/asm-avr32/dma-mapping.h   |3 +
 include/asm-cris/dma-mapping.h|2 -
 include/asm-frv/dma-mapping.h |2 -
 include/asm-generic/dma-mapping.h |2 -
 include/asm-i386/dma-mapping.h|2 -
 include/asm-ia64/dma-mapping.h|3 +
 include/asm-m68k/dma-mapping.h|2 -
 include/asm-mips/dma-mapping.h|2 -
 include/asm-parisc/dma-mapping.h  |2 -
 include/asm-powerpc/dma-mapping.h |2 -
 include/asm-sh/dma-mapping.h  |2 -
 include/asm-sh64/dma-mapping.h|2 -
 include/asm-sparc64/dma-mapping.h |2 -
 include/asm-um/dma-mapping.h  |2 -
 include/asm-x86_64/dma-mapping.h  |3 +
 include/asm-xtensa/dma-mapping.h  |2 -
 27 files changed, 137 insertions(+), 126 deletions(-)

diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
index 6e826f4..b3dafd5 100644
--- a/Documentation/DMA-API.txt
+++ b/Documentation/DMA-API.txt
@@ -459,7 +459,7 @@ anything like this.  You must also be ex
 memory you intend to sync partially.
 
 void
-dma_cache_sync(void *vaddr, size_t size,
+dma_cache_sync(struct device *dev, void *vaddr, size_t size,
   enum dma_data_direction direction)
 
 Do a partial sync of memory that was allocated by
diff --git a/arch/avr32/mm/dma-coherent.c b/arch/avr32/mm/dma-coherent.c
index 44ab8a7..b68d669 100644
--- a/arch/avr32/mm/dma-coherent.c
+++ b/arch/avr32/mm/dma-coherent.c
@@ -11,7 +11,7 @@ #include 
 #include 
 #include 
 
-void dma_cache_sync(void *vaddr, size_t size, int direction)
+void dma_cache_sync(struct device *dev, void *vaddr, size_t size, int direction)
 {
/*
 * No need to sync an uncached area
diff --git a/arch/mips/mm/dma-coherent.c b/arch/mips/mm/dma-coherent.c
index 18bc83e..5697c6e 100644
--- a/arch/mips/mm/dma-coherent.c
+++ b/arch/mips/mm/dma-coherent.c
@@ -197,7 +197,7 @@ int dma_is_consistent(struct device *dev
 
 EXPORT_SYMBOL(dma_is_consistent);
 
-void dma_cache_sync(void *vaddr, size_t size,
+void dma_cache_sync(struct device *dev, void *vaddr, size_t size,
   enum dma_data_direction direction)
 {
BUG_ON(direction == DMA_NONE);
diff --git a/arch/mips/mm/dma-ip27.c b/arch/mips/mm/dma-ip27.c
index 8e9a5a8..f088344 100644
--- a/arch/mips/mm/dma-ip27.c
+++ b/arch/mips/mm/dma-ip27.c
@@ -204,7 +204,7 @@ int dma_is_consistent(struct device *dev
 
 EXPORT_SYMBOL(dma_is_consistent);
 
-void dma_cache_sync(void *vaddr, size_t size,
+void dma_cache_sync(struct device *dev, void *vaddr, size_t size,
   enum dma_data_direction direction)
 {
BUG_ON(direction == DMA_NONE);
diff --git a/arch/mips/mm/dma-ip32.c b/arch/mips/mm/dma-ip32.c
index 08720a4..b42b6f7 100644
--- a/arch/mips/mm/dma-ip32.c
+++ b/arch/mips/mm/dma-ip32.c
@@ -370,7 +370,8 @@ int dma_is_consistent(struct device *dev
 
 EXPORT_SYMBOL(dma_is_consistent);
 
-void dma_cache_sync(void *vaddr, size_t size, enum dma_data_direction direction)
+void dma_cache_sync(struct device *dev, void *vaddr, size_t size,
+   enum dma_data_direction direction)
 {
if (direction == DMA_NONE)
return;
diff --git a/arch/mips/mm/dma-noncoherent.c b/arch/mips/mm/dma-noncoherent.c
index 4a3efc6..8cecef0 100644
--- a/arch/mips/mm/dma-noncoherent.c
+++ b/arch/mips/mm/dma-noncoherent.c
@@ -306,7 +306,8 @@ int dma_is_consistent(struct device *dev
 
 EXPORT_SYMBOL(dma_is_consistent);
 
-void dma_cache_sync(void *vaddr, size_t size, enum dma_data_direction direction)
+void dma_cache_sync(struct device *dev, void *vaddr, size_t size,
+   enum dma_data_direction direction)
 {
if (direction == DMA_NONE)
return;
diff --git a/drivers/net/lasi_82596.c b/drivers/net/lasi_82596.c
index f4d815b..ea392f2 100644
--- a/drivers/net/lasi_82596.c
+++ b/drivers/net/lasi_82596.c
@@ -119,14 +119,14 @@ #define DEB_ANY 

Re: aic94xx panic on module load

2006-11-28 Thread Douglas Gilbert
Mark Haverkamp wrote:
> On Tue, 2006-11-28 at 13:46 -0800, Mark Haverkamp wrote:
>> On Tue, 2006-11-28 at 13:44 -0500, Douglas Gilbert wrote:
>>
>> [ ... ]
>>
> 
> I don't know if this helps, but I found the verbose option.  Here is a
> little debug output.
> 
> 
> ./smp_discover -v  -p 12 -s 0x500508b300a27a2f /dev/mptctl
> Discover request: 40 10 00 02 00 00 00 00 00 0c 00 00 00 00 00 00
> send_req_mpt: subvalue=0  SAS address=0x500508b300a27a2f
> mptctl two scatter gather list interface
> IOCStatus=0x1
> IOCStatus=0x1 IOCLogInfo=0xA27A2F SASStatus=0x0
> smp_send_req failed, res=-1

Mark,
The iocnum may be greater than 0 (especially if you have
other MPT Fusion HBAs (any kind) in that computer).
Have a look in the log around where the mptsas driver
is registered and look for the string "ioc". The number
following "ioc" is what you need. If you find "ioc3" then
try:

 ./smp_discover -p 12 -s 0x500508b300a27a2f /dev/mptctl,3
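
The lookup described above could be sketched in C roughly as follows. Both
helper names are hypothetical, and the exact wording of the mptsas log lines
is an assumption; the sketch only relies on the advice that the number
directly following "ioc" is the one to append after the comma.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the number that directly follows the first "ioc<digit>" in a
 * log line, or -1 if there is none. Hypothetical helper.
 */
int sketch_find_iocnum(const char *log_line)
{
	const char *p = log_line;

	assert(log_line != NULL);
	while ((p = strstr(p, "ioc")) != NULL) {
		p += 3;	/* step past "ioc" */
		if (isdigit((unsigned char)*p))
			return (int)strtol(p, NULL, 10);
	}
	return -1;
}

/*
 * Build the "/dev/mptctl,<iocnum>" device argument.
 * Returns 0 on success, -1 on a bad iocnum or a too-small buffer.
 */
int sketch_format_mptctl_arg(int iocnum, char *out, size_t outlen)
{
	int n;

	if (iocnum < 0)
		return -1;
	n = snprintf(out, outlen, "/dev/mptctl,%d", iocnum);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Feeding it a log line containing "ioc3" would yield the "/dev/mptctl,3"
argument used in the command above.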

To verify that expander SAS address, try this:
  find /sys -name "sas_device:expander*"
cd to any directory found and try "cat sas_address".


BTW there is a smp_utils version 0.92 beta at
http://www.torque.net/sg
in which the error messages are somewhat clearer.


Doug Gilbert



Re: [PATCH 2.6.15.4 rel.2 1/1] libata: add hotswap to sata_svw

2006-11-28 Thread David Woodhouse
On Thu, 2006-02-16 at 16:09 +0100, Martin Devera wrote:
> From: Martin Devera <[EMAIL PROTECTED]>
> 
> Add hotswap capability to Serverworks/BroadCom SATA controllers. The
> controller has a SIM register that selects which bits in the SATA_ERROR
> register fire an interrupt.
> The solution hooks on COMWAKE (plug), PHYRDY change and 10B8B decode 
> error (unplug) and calls into Lukasz's hotswap framework.
> The code got one day testing on dual core Athlon64 H8SSL Supermicro 
> MoBo with HT-1000 SATA, SMP kernel and two CaviarRE SATA HDDs in
> hotswap bays.
> 
> Signed-off-by: Martin Devera <[EMAIL PROTECTED]>

What became of this?

-- 
dwmw2



Re: [PATCH 2.6.15.4 rel.2 1/1] libata: add hotswap to sata_svw

2006-11-28 Thread Benjamin Herrenschmidt
On Tue, 2006-11-28 at 23:22 +, David Woodhouse wrote:
> On Thu, 2006-02-16 at 16:09 +0100, Martin Devera wrote:
> > From: Martin Devera <[EMAIL PROTECTED]>
> > 
> > Add hotswap capability to Serverworks/BroadCom SATA controllers. The
> > controller has a SIM register that selects which bits in the SATA_ERROR
> > register fire an interrupt.
> > The solution hooks on COMWAKE (plug), PHYRDY change and 10B8B decode 
> > error (unplug) and calls into Lukasz's hotswap framework.
> > The code got one day testing on dual core Athlon64 H8SSL Supermicro 
> > MoBo with HT-1000 SATA, SMP kernel and two CaviarRE SATA HDDs in
> > hotswap bays.
> > 
> > Signed-off-by: Martin Devera <[EMAIL PROTECTED]>
> 
> What became of this?

I might be to blame for not testing it... The Xserve I had on my desk
was too noisy for most of my co-workers so I kept delaying and forgot
about it 

Also the Xserve I have only has one disk, which makes hotplug testing a
bit harder :-)

Ben.




Re: [PATCH 2.6.15.4 rel.2 1/1] libata: add hotswap to sata_svw

2006-11-28 Thread Martin Devera

Benjamin Herrenschmidt wrote:
> On Tue, 2006-11-28 at 23:22 +, David Woodhouse wrote:
> > On Thu, 2006-02-16 at 16:09 +0100, Martin Devera wrote:
> > > From: Martin Devera <[EMAIL PROTECTED]>
> > > 
> > > Add hotswap capability to Serverworks/BroadCom SATA controllers. The
> > > controller has a SIM register that selects which bits in the SATA_ERROR
> > > register fire an interrupt.
> > > The solution hooks on COMWAKE (plug), PHYRDY change and 10B8B decode 
> > > error (unplug) and calls into Lukasz's hotswap framework.
> > > The code got one day testing on dual core Athlon64 H8SSL Supermicro 
> > > MoBo with HT-1000 SATA, SMP kernel and two CaviarRE SATA HDDs in
> > > hotswap bays.
> > > 
> > > Signed-off-by: Martin Devera <[EMAIL PROTECTED]>
> > 
> > What became of this?
> 
> I might be to blame for not testing it... The Xserve I had on my desk
> was too noisy for most of my co-workers so I kept delaying and forgot
> about it 
> 
> Also the Xserve I have only has one disk, which makes hotplug testing a
> bit harder :-)


Unfortunately, my box with the HT-1000 is already deployed. Another similar
one should arrive soon, so I'll retest it then.
Right now I only have a VIA-based mobo here, and hotswap is NOT working with it.

Martin


Re: 2.6.19-rc6 : Spontaneous reboots, stack overflows - seems to implicate xfs, scsi, networking, SMP

2006-11-28 Thread David Chinner
On Thu, Nov 23, 2006 at 12:18:09PM +1100, David Chinner wrote:
> On Wed, Nov 22, 2006 at 01:58:11PM +0100, Jesper Juhl wrote:
> > 
> > Attached are two files. The one named stack_overflows.txt.gz contains
> > one instance of each unique stack overflow + trace that I've got.  The
> > other file named kernel_BUG.txt.gz contains a few BUG() messages that
> > were also in the logs.

> I've just checked on a 2.6.17 build on i386 how much stack we
> are using (from checkstack.pl with min size reported set to 32 bytes)
> here in XFS:

> So, assuming the stacks less than 32 bytes are 32 bytes, we've got
> 1380 bytes in the XFS stack there, and very few functions where it
> can be reduced further. Still, 1380 bytes is way, way short of 4KB,
> so unless there is extra stack usage that checkstack doesn't tell us
> about I'm not sure why this amount of usage is causing repeated
> stack overflows with very little stack usage on either side of it.
> 
> Can someone enlighten me as to where all the rest of the stack
> is being used up here?

FYI.

With some help from Keith Owens, we've determined that gcc 3.3.5
resulted in XFS stack usage of about 1.9KB through the writeback and
allocation path with another ~800 bytes of stack usage in generic
code in this path.

The big difference between the numbers I was getting from checkstack
and reality was CONFIG_CC_OPTIMIZE_FOR_SIZE=y being set on the
kernels I was stack checking. IOW, CONFIG_CC_OPTIMIZE_FOR_SIZE=y
appears to reduce XFS stack usage by at least 20% and so probably
should be used with XFS on 4k stacks.

Keith also confirmed that gcc-4.1's aggressive inlining of static
functions substantially increases stack usage (by ~15%) through this
call chain.  Given that many of the inlined static functions are not
required by the critical path (i.e. they'd previously been factored
out to reduce stack usage), gcc is effectively undoing past mods
that had substantially reduced XFS's stack usage.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group