from:"Douglas Gilbert"

Re: [PATCH] rtc: rtc-at91rm9200: use a variable for storing IMR

2013-03-20 Thread Douglas Gilbert


On 13-03-20 05:50 PM, Andrew Morton wrote:

On Fri, 15 Mar 2013 18:37:12 +0100 Nicolas Ferre  
wrote:


On some revisions of AT91 SoCs, the RTC IMR register is not working.
Instead of elaborating a workaround for that specific SoC or IP version,
we simply use a software variable to store the Interrupt Mask Register and
modify it for each enabling/disabling of an interrupt. The overhead of this
is negligible anyway.


This description doesn't really allow me or others to work out whether
the fix should be included in 3.9 or backported into earlier kernels.

So please, when fixing a bug do include a full description of the
user-visible effects of that bug.  And your opinion regarding the
-mainline and -stable decision is always useful.


The interrupt mask register (IMR) for the RTC is broken
on the AT91SAM9x5 sub-family of SoCs (good overview of the
members here: http://www.eewiki.net/display/linuxonarm/AT91SAM9x5 ).
The "user visible effect" is the RTC doesn't work.

That sub-family is less than two years old and only has devicetree
(DT) support and came online circa lk 3.7 . The dust is yet to
settle on the DT stuff at least for AT91 SoCs (translation:
lots of stuff is still broken, so much that it is hard to know
where to start).

The fix in the patch is pretty simple: just shadow the silicon
IMR register with a variable in the driver. Some older SoCs (pre-DT)
use the rtc-at91rm9200 driver (e.g. obviously the AT91RM9200)
and they should not be impacted by the change. There shouldn't
be a large volume of interrupts associated with an RTC.
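
The shadowing idea can be sketched in user-space C as below. This is
only an illustration under stated assumptions, not the driver's code:
struct rtc_shadow and the rtc_irq_*() helpers are hypothetical names,
and plain variables stand in for the hardware enable/disable registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an RTC whose interrupt mask register (IMR)
 * cannot be read back reliably. All names here are illustrative. */
struct rtc_shadow {
    uint32_t hw_ier;  /* stand-in for the "interrupt enable" register */
    uint32_t hw_idr;  /* stand-in for the "interrupt disable" register */
    uint32_t imr;     /* software copy of the mask, used instead of IMR */
};

/* Enable the interrupts in 'mask' and record them in the shadow. */
static void rtc_irq_enable(struct rtc_shadow *rtc, uint32_t mask)
{
    assert(rtc != NULL);
    rtc->hw_ier = mask;
    rtc->imr |= mask;
}

/* Disable the interrupts in 'mask' and clear them in the shadow. */
static void rtc_irq_disable(struct rtc_shadow *rtc, uint32_t mask)
{
    assert(rtc != NULL);
    rtc->hw_idr = mask;
    rtc->imr &= ~mask;
}

/* Handler side: filter the raw status bits through the shadow mask
 * rather than through a read of the silicon IMR. */
static uint32_t rtc_irq_pending(const struct rtc_shadow *rtc, uint32_t status)
{
    assert(rtc != NULL);
    return status & rtc->imr;
}
```

The cost is one extra read-modify-write of a variable per enable or
disable, which fits the "negligible overhead" remark in the patch
description.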


Compared to a relatively stable kernel subsystem like SCSI, what
is happening in the ARM architecture with DT is huge and ongoing.
So I think you either need new rules or suspend some of the stricter
rules applied to more stable subsystems. Just my two cents worth.


Doug Gilbert

who hasn't seen that frill-necked lizard for a while
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH] rtc: rtc-at91rm9200: manage IMR depending on revision

2013-04-02 Thread Douglas Gilbert


On 13-04-02 09:06 AM, Nicolas Ferre wrote:

Signed-off-by: Nicolas Ferre 
---
Hi all,

The funny thing is that I was writing exactly the same code as Johan's
when he posted his series.

So, here is my single patch, with the comment about the readback stolen from
Johan's, but without the way to determine which IP is buggy and which one is
not...
After having dug into the possibility to read the IP revision, I discovered that it
is not possible to use this information ("version" register offset changing
according to... IP version number: well done!).
In conclusion, I guess that the only way to determine if we need the workaround
is to use the DT.
One remark though: if we use the compatibility string for this purpose, I fear
that we would twist the meaning of this information: SoC using an
"atmel,at91sam9x5-rtc" compatible RTC will not necessarily be touched by the
"non responding IMR" bug: at91sam9n12 or upcoming sama5d3 are not affected for
instance, and we need to cling to "atmel,at91rm9200-rtc" for them...
I think that we can use this method for the moment and move to another
compatibility string later if it is needed.


Rather than have so many people working on rtc-at91rm9200.c,
how about someone bring its "RTT" sibling into the DT
world. I'm talking about drivers/rtc/rtc-at91sam9.c ...

Doug Gilbert


Re: [PATCH v2] ARM: at91: add Acme Systems Aria G25 board

2013-04-03 Thread Douglas Gilbert


On 13-04-02 02:48 PM, Olof Johansson wrote:

Hi,

I just saw this since it came in through a pull request

On Tue, Mar 26, 2013 at 4:39 AM, Nicolas Ferre  wrote:

From: Douglas Gilbert 

Signed-off-by: Douglas Gilbert 
Signed-off-by: Nicolas Ferre 
---
  arch/arm/boot/dts/ariag25.dts | 175 ++
  1 file changed, 175 insertions(+)
  create mode 100644 arch/arm/boot/dts/ariag25.dts

diff --git a/arch/arm/boot/dts/ariag25.dts b/arch/arm/boot/dts/ariag25.dts
new file mode 100644
index 000..b43266f
--- /dev/null
+++ b/arch/arm/boot/dts/ariag25.dts


Please prefix the boards with the platform. Most other SoCs already do
this, and I see now that some at91 boards haven't been prefixed in the
past, but it's a good idea to not add more of them.

So, at91-ariag25.dts is a good name in this case.


That is fine with me. I have changed my working DT
config file to at91-ariag25.dts and will push that
out to Robert Nelson soon.

Also I'm working on a DT version of an existing Acme
product called the FoxG20 which is based on the
AT91SAM9G20 SoC. It already has non-DT support inside
the mainline kernel. I am using at91-foxg20.dts for
its DT config file.

Doug Gilbert



Re: [PATCH v3] ARM: at91: add Acme Systems Aria G25 board

2013-04-04 Thread Douglas Gilbert


On 13-04-04 11:42 AM, Nicolas Ferre wrote:

From: Douglas Gilbert 

Signed-off-by: Douglas Gilbert 
Signed-off-by: Nicolas Ferre 
---
Hi all,

Here is the third revision of this patch. I plan to include it in a
pull-request real-soon-now!

v3: - move to "at91-" prefix for .dts[i] files
 - remove the rtc activation code because of the ongoing discussions
   about this IP and its DT binding.



Nicolas,
It's a pity that the rtc activation code is removed.
At worst:
    rtc@feb0 {
        status = "okay";
    };

does nothing. Also it is unlikely to be changed by any
movement on the rtc-at91rm9200 front.


The lack of use of uart1 is for my own, private reasons.
I think it would be more generally useful to show uart1's
definition and disable it as shown in the attached patch
fragment.


I also note that my date line was removed. I like dates,
so when I add comments like "the i2c-at91 driver is broken
for the SAM9G20 ** and use the i2c-gpio driver instead" then
this is not taken as an eternal truth. It worked in the
past and hopefully it will work again in the future.

While on the subject of I2C, I'm getting tired of seeing
this oft-copied line:
   i2c-gpio,delay-us = <2>;/* ~100 kHz */

It is the clock half period in microseconds and for the 100 kHz
(standard) I2C clock speed, it should be 5. Due to rounding
(up) that gives a measured clock speed of around 88 kHz on
my equipment. Crappy I2C devices *** seem to cope better
with 12% below the standard clock frequency than 80% above
it.
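
The arithmetic behind that can be sketched as follows. The helper names
are hypothetical; the only fact taken from the text above is that
delay-us is the clock half period in microseconds. The result is the
nominal frequency, before any GPIO or software overhead slows it down.

```c
#include <assert.h>

/* Nominal SCL frequency in Hz for a bit-banged bus whose clock half
 * period is 'delay_us' microseconds (full period = 2 * delay_us). */
static unsigned long i2c_gpio_nominal_hz(unsigned int delay_us)
{
    assert(delay_us > 0);
    return 1000000UL / (2UL * delay_us);
}

/* Smallest whole-microsecond half period whose nominal frequency
 * does not exceed 'hz' (rounds the half period up). */
static unsigned int i2c_gpio_delay_us_for(unsigned long hz)
{
    assert(hz > 0);
    return (unsigned int)((1000000UL + 2UL * hz - 1) / (2UL * hz));
}
```

With these, a half period of 5 us gives a nominal 100 kHz, while the
oft-copied value of 2 us gives a nominal 250 kHz.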

Doug Gilbert

** broken in my tests on the FoxG20 with lk 3.9.0-rc5

*** if my experience is anything to go by there are many
dodgy I2C devices, probably using I2C bit banging
code borrowed from Wikipedia.


--- a/arch/arm/boot/dts/ariag25.dts_orig	2013-04-04 11:58:40.518122816 -0400
+++ b/arch/arm/boot/dts/at91-ariag25.dts	2013-04-03 15:23:09.240385849 -0400
@@ -1,5 +1,5 @@
 /*
- * ariag25.dts - Device Tree file for Acme Systems Aria G25 (AT91SAM9G25 based)
+ * at91-ariag25.dts - Device Tree file for Acme Systems Aria G25 (AT91SAM9G25 based)
  *
  * Copyright (C) 2013 Douglas Gilbert ,
  *Robert Nelson 
@@ -21,6 +21,7 @@
 		serial3 = &usart2;
 		serial4 = &usart3;
 		serial5 = &uart0;
+		serial6 = &uart1;
 	};
 
 	chosen {
@@ -121,6 +122,16 @@
 status = "okay";
 			};
 
+			uart1: serial@f8044000 {
+compatible = "atmel,at91sam9260-usart";
+reg = <0xf8044000 0x200>;
+interrupts = <16 4 5>;
+pinctrl-names = "default";
+pinctrl-0 = <&pinctrl_uart1>;
+/* Remove following or change to "okay" if wanted */
+status = "disabled";
+			};
+
 			spi0: spi@f000 {
 status = "okay";
 cs-gpios = <&pioA 14 0>, <0>, <0>, <0>;

Re: WARNING: at drivers/ata/libata-core.c:5049 ata_qc_issue+0x1c7/0x3a0()

2013-02-19 Thread Douglas Gilbert


On 13-02-19 01:37 PM, Tommi Rantala wrote:

Hello,

Hit this WARNING once while fuzzing the kernel with trinity in a qemu
virtual machine as the root user.

Does this make any sense? I have occasionally seen some ATA related
troubles while fuzzing in a VM, but this warning is new to me.

[  490.717030] WARNING: at /home/ttrantal/git/linux-2.6/drivers/ata/libata-core.c:5049 ata_qc_issue+0x1c7/0x3a0()
[  490.717030] Hardware name: Bochs
[  490.717030] Pid: 2548, comm: trinity-child6 Not tainted 3.8.0+ #87
[  490.717030] Call Trace:
[  490.717030]  [] warn_slowpath_common+0x86/0xb0
[  490.717030]  [] warn_slowpath_null+0x15/0x20
[  490.717030]  [] ata_qc_issue+0x1c7/0x3a0
[  490.717030]  [] ? ata_scsi_set_sense.constprop.13+0x30/0x30
[  490.717030]  [] ata_scsi_translate+0x120/0x190
[  490.717030]  [] ? ata_scsi_queuecmd+0x2e/0x2d0
[  490.717030]  [] ata_scsi_queuecmd+0x253/0x2d0
[  490.717030]  [] scsi_dispatch_cmd+0x161/0x230
[  490.717030]  [] scsi_request_fn+0x544/0x580
[  490.717030]  [] ? cfq_dispatch_requests+0x56/0xb30
[  490.717030]  [] ? __lock_is_held+0x5a/0x80
[  490.717030]  [] __blk_run_queue+0x32/0x40
[  490.717030]  [] __elv_add_request+0x10a/0x280
[  490.717030]  [] blk_execute_rq_nowait+0xb6/0xf0
[  490.717030]  [] ? __init_waitqueue_head+0x41/0x60
[  490.717030]  [] blk_execute_rq+0xa8/0x110
[  490.717030]  [] ? lock_release_non_nested+0xde/0x310
[  490.717030]  [] ? selinux_capable+0x34/0x50
[  490.717030]  [] ? security_capable+0x13/0x20
[  490.717030]  [] ? ns_capable+0x53/0x80
[  490.717030]  [] sg_scsi_ioctl+0x2b1/0x3a0
[  490.717030]  [] scsi_cmd_ioctl+0x412/0x4a0
[  490.717030]  [] ? __lock_acquire+0x957/0x1c20
[  490.717030]  [] ? kvm_clock_read+0x1f/0x30
[  490.717030]  [] bsg_ioctl+0x146/0x270
[  490.717030]  [] ? trace_hardirqs_off_caller+0x28/0xd0
[  490.717030]  [] ? trace_hardirqs_off+0xd/0x10
[  490.717030]  [] ? local_clock+0x4a/0x70
[  490.717030]  [] ? lock_release_holdtime+0x28/0x170
[  490.717030]  [] ? avc_has_perm_flags+0x1d0/0x2a0
[  490.717030]  [] ? avc_has_perm_flags+0x28/0x2a0
[  490.717030]  [] ? trace_hardirqs_off_caller+0x28/0xd0
[  490.717030]  [] ? trace_hardirqs_off+0xd/0x10
[  490.717030]  [] do_vfs_ioctl+0x532/0x580
[  490.717030]  [] ? file_has_perm+0x83/0xa0
[  490.717030]  [] sys_ioctl+0x5d/0xa0
[  490.717030]  [] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  490.717030]  [] system_call_fastpath+0x16/0x1b
[  490.717030] ---[ end trace fce35d2b40bd0565 ]---
[  490.810874] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[  490.812538] ata1.00: failed command: READ DMA
[  490.813715] ata1.00: cmd c8/00:2c:00:01:00/00:00:00:00:00/e0 tag 0
[  490.813715]  res 50/01:00:b0:16:04/00:00:00:00:00/a0 Emask 0x40 (internal error)
[  490.817269] ata1.00: status: { DRDY }
[watchdog] 333615 iterations. [F:326712 S:6891]
[watchdog] kernel became tainted! Last seed was 71022097
[  491.266158] ata1.00: configured for MWDMA2
[  491.267358] ata1: EH complete
child 2548 exitting
child 2492 exitting
child 2500 exitting
[2351] Bailing main loop. Exit reason: kernel became tainted
[2350] Watchdog exiting

Ran 333617 syscalls. Successes: 6892  Failures: 326714


Looks like some application is using the deprecated
SCSI_IOCTL_SEND_COMMAND ioctl via a bsg node to send
a SCSI ATA PASS-THROUGH command tunnelling an ATA READ
DMA command.

Looking for something positive to say: only a very skilled
professional tester could come up with such a mismatch.

Is this a recent kernel (linux-2.6 shown in path)? Processes
using the SCSI_IOCTL_SEND_COMMAND ioctl now get a yellow flag
in the logs.


Doug Gilbert



Re: WARNING: at drivers/ata/libata-core.c:5049 ata_qc_issue+0x1c7/0x3a0()

2013-02-19 Thread Douglas Gilbert


On 13-02-19 04:52 PM, Dave Jones wrote:

On Tue, Feb 19, 2013 at 04:04:33PM -0500, Douglas Gilbert wrote:
  > On 13-02-19 01:37 PM, Tommi Rantala wrote:
  > > Hello,
  > >
  > > Hit this WARNING once while fuzzing the kernel with trinity in a qemu
  > > virtual machine as the root user.
  > >
  > > Does this make any sense? I have occasionally seen some ATA related
  > > troubles while fuzzing in a VM, but this warning is new to me.
  > >
  >
  > Looks like some application is using the deprecated
  > SCSI_IOCTL_SEND_COMMAND ioctl via a bsg node to send
  > a SCSI ATA PASS-THROUGH command tunnelling an ATA READ
  > DMA command.
  >
  > Looking for something positive to say: only a very skilled
  > professional tester could come up with such a mismatch.

Unless Tommi has local modifications, what trinity does with sys_ioctl
is incredibly naive compared to some of the other syscalls.
It's a miracle it managed to pair an fd from a /dev node that understands
this ioctl with this set of arguments tbh.

The actual code it uses for fuzzing SG_IO looks like this..
https://github.com/kernelslacker/trinity/blob/master/ioctls/scsi-generic-sgio.c
It's not code I'm particularly proud of, it could be a lot more clever
than what it's currently doing, which is why I'm surprised it&

Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Douglas Gilbert


Alan Cox wrote:
better. So for example, I personally suspect that ATA-over-ethernet is way 
better than some crazy SCSI-over-TCP crap, but I'm biased for simple and 
low-level, and against those crazy SCSI people to begin with.


Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
would probably trash iSCSI for latency if nothing else.


And a variant that doesn't do ATA or IP:
http://www.fcoe.com/

Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open

2013-08-27 Thread Douglas Gilbert

On 13-08-27 10:16 AM, vaughan wrote:

On 08/13/2013 11:16 AM, Douglas Gilbert wrote:

On 13-08-12 10:46 PM, vaughan wrote:

On 08/06/2013 04:52 AM, Douglas Gilbert wrote:

On 13-08-04 10:19 PM, vaughan wrote:

On 08/03/2013 01:25 PM, Douglas Gilbert wrote:

On 13-08-01 01:01 AM, Douglas Gilbert wrote:

On 13-07-22 01:03 PM, Jörn Engel wrote:

On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:

There is a race when opening sg with the O_EXCL flag. A race may also
happen between sg_open and sg_remove.

Changes from v4:
 * [3/4] use ERR_PTR series instead of adding another parameter in
   sg_add_sfp
 * [4/4] fix conflict for cherry-pick from v3.

Changes from v3:
 * release o_sem in sg_release(), not in sg_remove_sfp().
 * not set exclude with sfd_lock held.

Vaughan Cao (4):
  [SCSI] sg: use rwsem to solve race during exclusive open
  [SCSI] sg: no need sg_open_exclusive_lock
  [SCSI] sg: checking sdp->detached isn't protected when open
  [SCSI] sg: push file descriptor list locking down to per-device
    locking

 drivers/scsi/sg.c | 178 +-
 1 file changed, 83 insertions(+), 95 deletions(-)

Patchset looks good to me, although I didn't test it on hardware yet.
Signed-off-by: Joern Engel 

James, care to pick this up?

Acked-by: Douglas Gilbert 

Tested O_EXCL with multiple processes and threads; passed.
sg driver prior to this patch had "leaky" O_EXCL logic
according to the same test. Block device passed.

James, could you clean this up:
  drivers/scsi/sg.c:242:6: warning: unused variable ‘res’ [-Wunused-variable]

Further testing suggests this patch on the sg driver is
broken, so I'll rescind my ack.

The case it is broken for is when a device is opened
without O_EXCL. Now if, while it is open, a second
thread/process tries to open the same device O_EXCL
then IMO the second open should fail with EBUSY.

My testing shows that O_EXCL opens properly deflect
other O_EXCL opens.
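
The expected behaviour can be probed from user space with something like
the sketch below. The helper name open_excl_nonblock() is hypothetical
and this is not the sg_tst_excl test program; it only uses open(2) with
the flags discussed in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Try an exclusive, non-blocking open of 'path'. Returns the file
 * descriptor on success or -errno on failure. On an sg node that is
 * already open elsewhere, the expectation argued for above is -EBUSY. */
static int open_excl_nonblock(const char *path)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDWR | O_EXCL | O_NONBLOCK);
    return (fd < 0) ? -errno : fd;
}
```

A test would hold one plain open on the device in a first process and
call open_excl_nonblock() from a second one, closing any descriptor
that comes back.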

Hi  Doug,

My test doesn't have this issue. The routine is roughly as follows:

I start three opens without O_EXCL, each holding the device for 30s, then open
with O_EXCL|O_NONBLOCK; that open fails with EBUSY.
I also call myopen with and without O_EXCL many times in the background at
the same time, and the test passes. I don't know why it failed in
your test.

Usage: myopen [-e][-n][-d delay] -f file
 -e: exclude
 -n: nonblock
 -d: delay N seconds and then close.

[root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
[1] 3417
[root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
[2] 3418
[root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
[3] 3419
[root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
max_active_device=6(origin 1)
def_reserved_size=32768
>>> device=sg5 scsi5 chan=0 id=1 lun=0   em=0 sg_tablesize=55
excl=0
  FD(1): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
  cmd_q=0 f_packid=0 k_orphan=0 closed=0
No requests active
  FD(2): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
  cmd_q=0 f_packid=0 k_orphan=0 closed=0
No requests active
  FD(3): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
  cmd_q=0 f_packid=0 k_orphan=0 closed=0
No requests active

[root@vacaowol5 16835013]# ./myopen -e -n  -f /dev/sg5 -d 30 &
[4] 3422
[3422:3351] /dev/sg5:exclude: Device or resource busy

[4]+  Exit 1  ./myopen -e -n -f /dev/sg5 -d 30

[root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
max_active_device=6(origin 1)
def_reserved_size=32768
>>> device=sg5 scsi5 chan=0 id=1 lun=0   em=0 sg_tablesize=55
excl=0
  FD(1): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
  cmd_q=0 f_packid=0 k_orphan=0 closed=0
No requests active
  FD(2): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
  cmd_q=0 f_packid=0 k_orphan=0 closed=0
No requests active
  FD(3): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
  cmd_q=0 f_packid=0 k_orphan=0 closed=0
No requests active
[root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
[1]   Done                    ./myopen -f /dev/sg5 -d 30
[2]-  Done                    ./myopen -f /dev/sg5 -d 30
[3]+  Done                    ./myopen -f /dev/sg5 -d 30

Hi,
After the initial failures about 36 hours ago, retesting
yesterday and today has not produced any unexpected
failures. And I have been trying hard on lk 3.10.4 and
lk 3.10.5 .

My test program is a bit more intense than yours and can
be found in the sg3_utils beta in the News section of this
page:
http://sg.danny.cz/sg/

It is in the examples directory, two variants called
sg_tst_excl and sg_tst_excl2 . You will need a recent gcc
compiler, IOW something that can compile C++11 . gcc 4.7.3
in Ubuntu 13.04 only just manages, Fedora 19 should do
better with gcc 4.8.1 . The threading is implemented using
pthreads so it should be reliabl

Re: [PATCH 0/4] Hyper-V TRIM support

2013-09-13 Thread Douglas Gilbert


On 13-09-13 08:58 AM, Andy Whitcroft wrote:

tl;dr -- enable TRIM support for Hyper-V emulated disks.

The Hyper-V hypervisor can support TRIM for its devices, advertising this
via the appropriate VPD pages.  However the emulated disks only claim
to be SPC-2 devices.  According to the specs VPD pages (in general) did
exist at SPC-2 but the specific pages we interrogate for the TRIM support


VPD pages are found in SPC (ANSI INCITS 301-1997) and many of its
drafts. By SPC-2 (ANSI INCITS 351-2001) the "supported VPD pages"
VPD page (0x0) ** and the "device identification" VPD page (0x83)
were mandatory (in SPC those pages were optional). So that is
approaching 20 years for manufacturers to get used to VPD pages.

TRIM is a T13 term (ATA/SATA) that was introduced after 2001
(i.e. after SBC-2). The corresponding SCSI term is now
"Logical Block Provisioning" (LBP). This covers the SCSI
UNMAP command (closest thing to TRIM) and the SCSI WRITE SAME
command which contains LBP options.

LBP capability was originally reported in the SCSI READ
CAPACITY(16) command (but not the more common READ
CAPACITY(10) command); namely the LBPME and LBPRZ bits. Those
bits have been renamed *** during the lifetime of SBC-3 drafts.
Those bits and a lot of additional LBP information are now
found in two VPD pages: "Block Limits" (0xb0) and "Logical
Block Provisioning" (0xb2).


After a 36 byte standard INQUIRY response, unless the
compliance is stone age (e.g. SCSI-2 or earlier) then
a 36 byte INQUIRY with the EVPD bit set and page=0
should be pretty safe. Check the response carefully
as USB devices will often ignore the EVPD bit and respond
with a standard INQUIRY response. Forget any such devices.
Now look for either of those LBP supporting VPD pages.
There should not be too many devices that support neither
and do LBP.
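
The response check described above might look like the sketch below. It
assumes the usual VPD page layout (page code in byte 1, page length in
bytes 2 and 3, page list from byte 4); the function name is hypothetical
and the length test is only a heuristic for spotting a standard INQUIRY
response returned by a device that ignored the EVPD bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if 'resp' looks like a "Supported VPD pages" page (0x0)
 * that lists Block Limits (0xb0) or Logical Block Provisioning (0xb2),
 * and 0 otherwise. */
static int vpd0_lists_lbp_page(const uint8_t *resp, size_t len)
{
    size_t page_len, i;

    assert(resp != NULL);
    if (len < 4 || resp[1] != 0x00)
        return 0;                       /* not VPD page 0 */
    page_len = ((size_t)resp[2] << 8) | resp[3];
    if (page_len == 0 || page_len + 4 > len)
        return 0;                       /* implausible: likely a standard
                                         * INQUIRY response, or truncated */
    for (i = 0; i < page_len; i++) {
        if (resp[4 + i] == 0xb0 || resp[4 + i] == 0xb2)
            return 1;
    }
    return 0;
}
```

A page list longer than the buffer is rejected here; a real tool would
more likely re-issue the INQUIRY with a larger allocation length.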

Doug Gilbert


**  some vendors do not include their own vendor specific VPD
pages in the "Supported VPD pages" VPD page. Grrr

*** those bits were previously named TPE and TPRZ


did not until SPC-3 therefore the kernel avoids reading the relevant pages
for SPC-2 devices and prevents TRIM from being offered for these devices.
Additionally at SPC-2 we prefer ReadCapacity10 over ReadCapacity16 and
unless we use RC16 we will not identify the device as TRIM capable also
preventing TRIM being offered.

As the VPD page zero does list which pages are valid for each device, it
could be argued that we could simply attempt to use these pages for all
devices which claim to be SPC-2 and above.  While this seems valid the
code documents a number of devices which take badly to having even VPD
page 0 interrogated even when supposedly supported.  Therefore it seems
appropriate to add a scsi device flag to allow a device to request SPC-3
VPD pages be used when only at SPC-2.

Similarly for the ReadCapacity selection it seems dangerous to invert
the order for all SPC-2 devices.  So it seems appropriate to add a scsi
device flag to request we try RC16 before RC10 (mirroring the existing
flag for the opposite).

The following four patches add the two scsi device flags and select those
flags for the Hyper-V emulated disks.







Re: [Bug] 12.864681 BUG: lock held when returning to user space!

2013-10-08 Thread Douglas Gilbert


On 13-10-08 02:44 AM, vaughan wrote:

Hi Madper,

CC to Douglas to get comments.
I use the rw_semaphore o_sem to protect excl open, introduced in commit
15b06f9a02406e5460001db6d5af5c738cd3d4e7 since v3.12-rc1.
Is it forbidden to do like that in kernel?...


It appears you can not (allow sg_open() to hold a semaphore
then return to the user space). So you will need to do some
rework on that patch or revert it.

Doug Gilbert

Reference: scsi-linux + kernel lists, title:
  [PATCH v6 0/4][SCSI] sg: fix race condition in sg_open
  20130828


On 10/08/2013 01:57 PM, Madper Xie wrote:

Howdy Vaughan Cao,
I can't reproduce this issue on either 3.11 or 3.11.4. There are only four
patches between 3.11 and 3.12-rc2 and you are the author. Will you
please check them if you have time?

c...@redhat.com writes:


Hi all,
With kernel 3.12-rc2, dmesg shows the following logs:
[   12.864680] 
[   12.864681] [ BUG: lock held when returning to user space! ]
[   12.864682] 3.12.0-rc2 #1 Not tainted
[   12.864683] 
[   12.864684] iprinit/719 is leaving the kernel with locks still held!
[   12.864685] 1 lock held by iprinit/719:
[   12.864686]  #0:  (&sdp->o_sem){.+.+..}, at: [] sg_open+0x4b5/0x644 [sg]
[   12.934954] ath9k :01:00.0: enabling device ( -> 0002)
[   12.940346] ath: phy0: timeout (1000 us) on reg 0x15f18: 0x & 0x0007 != 0x0004
[   12.943125] ath: EEPROM regdomain: 0x60
[   12.943127] ath: EEPROM indicates we should expect a direct regpair map
[   12.943129] ath: Country alpha2 being used: 00
[   12.943130] ath: Regpair used: 0x60
[   12.960202] r8169 :02:00.0 p3p1: link down
[   12.960236] r8169 :02:00.0 p3p1: link down
[   12.960256] IPv6: ADDRCONF(NETDEV_UP): p3p1: link is not ready
[   13.003523] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[   13.003886] ieee80211 phy0: Atheros AR9485 Rev:1 mem=0xc9000bc8, irq=16
[   13.012120] ip6_tables: (C) 2000-2006 Netfilter Core Team
[   13.023667] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   13.055802] Ebtables v2.0 registered
[   13.192291] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[   15.906392] r8169 :02:00.0 p3p1: link up
[   15.906416] IPv6: ADDRCONF(NETDEV_CHANGE): p3p1: link becomes ready
[   17.121989] systemd-udevd (334) used greatest stack depth: 3352 bytes left

I'm working on finding which version brought this bug in.








Re: PATCH: scsi: make scsi reset permissions more relaxed (RFC)

2013-08-30 Thread Douglas Gilbert


On 13-08-30 02:04 PM, Marcus Meissner wrote:

Hi folks,

cdrecord wants to whack the CD drive with a SCSI RESET ...

So far SCSI RESET can be done at 4 levels (target, device, bus, host)
and all 4 are checked for CAP_SYS_ADMIN / CAP_SYS_RAWIO.


As the cdrecord author wants special permissions for cdrecord, readcd ,
cdda2wav to allow it to send SCSI RESET commands I was wondering if
relaxing the permission is a potential idea?

This would allow SCSI reset on target/device if a local user
gets regular access to a SCSI device (via udev acls etc.)


(I know that the actual reset code will fall back into the chain
  target -> device -> bus -> host resetting if one fails.)


Hi,
That escalation sequence probably should be:
   device(LU) -> target -> bus -> host

I proposed the following patch some time back to give the
user space finer resolution on resets with the option of
stopping the escalation but it has gone nowhere:
  http://marc.info/?l=linux-scsi&m=136104139102048&w=2

With that patch you might only allow an unprivileged user
the non-escalating LU and target reset variants.

If changes are made in that area, we might like to think
about adding a new RESET variant mapping through to the I_T
Nexus Reset TMF.
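
The escalation order with an optional stop can be sketched as below.
This is not the code from the linked patch: the enum, the callback type
and both function names are hypothetical, and the stub merely pretends
that resets narrower than the bus fail.

```c
#include <assert.h>
#include <stddef.h>

enum reset_level { RESET_LU, RESET_TARGET, RESET_BUS, RESET_HOST, RESET_LEVELS };

/* Hypothetical callback: returns 0 if the reset at 'level' succeeded. */
typedef int (*reset_fn)(enum reset_level level);

/* Start at 'first'; on failure move to the next wider level unless
 * 'escalate' is 0. Returns the level that succeeded, or -1. */
static int reset_with_escalation(enum reset_level first, int escalate,
                                 reset_fn try_reset)
{
    int level;

    assert(try_reset != NULL);
    for (level = first; level < RESET_LEVELS; level++) {
        if (try_reset((enum reset_level)level) == 0)
            return level;
        if (!escalate)
            break;
    }
    return -1;
}

/* Example stub: LU and target resets fail, bus and host resets work. */
static int fail_below_bus(enum reset_level level)
{
    return (level >= RESET_BUS) ? 0 : -1;
}
```

An unprivileged caller would then only ever be given the non-escalating
variant starting at RESET_LU or RESET_TARGET.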

Doug Gilbert


---
  drivers/scsi/scsi_ioctl.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_ioctl.c b/drivers/scsi/scsi_ioctl.c
index d9564fb..770720e 100644
--- a/drivers/scsi/scsi_ioctl.c
+++ b/drivers/scsi/scsi_ioctl.c
@@ -306,22 +306,26 @@ int scsi_nonblockable_ioctl(struct scsi_device *sdev, int cmd,
return 0;
switch (val) {
case SG_SCSI_RESET_DEVICE:
+   /* allowed if you can send scsi commands to the device */
val = SCSI_TRY_RESET_DEVICE;
break;
case SG_SCSI_RESET_TARGET:
+   /* allowed if you can send scsi commands to the device */
val = SCSI_TRY_RESET_TARGET;
break;
val = SCSI_TRY_RESET_TARGET;
break;
case SG_SCSI_RESET_BUS:
+   if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
+   return -EACCES;
val = SCSI_TRY_RESET_BUS;
break;
case SG_SCSI_RESET_HOST:
+   if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
+   return -EACCES;
val = SCSI_TRY_RESET_HOST;
break;
default:
return -EINVAL;
}
-   if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
-   return -EACCES;
return (scsi_reset_provider(sdev, val) ==
SUCCESS) ? 0 : -EIO;
}




Re: [PATCH] sg: fix memory leak

2013-09-25 Thread Douglas Gilbert


On 13-09-25 10:05 PM, vaughan wrote:

On 09/25/2013 10:26 PM, Vegard Nossum wrote:

Commit e32c9e6300e3af659cbfe45e90a1e7dcd3572ada introduced a memory
leak. Fix it.

Cc: sta...@vger.kernel.org
Cc: Vaughan Cao 
Cc: Douglas Gilbert 
Signed-off-by: Vegard Nossum 
---
  drivers/scsi/sg.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 5cbc4bb..a97143f 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -2060,6 +2060,7 @@ sg_add_sfp(Sg_device * sdp, int dev)
spin_lock_irqsave(&sdp->sfd_lock, iflags);
if (sdp->detached) {
spin_unlock_irqrestore(&sdp->sfd_lock, iflags);
+   kfree(sfp);
return ERR_PTR(-ENODEV);
}
list_add_tail(&sfp->sfd_siblings, &sdp->sfds);

You're right. It's a memory leak.


Signed-off-by: Douglas Gilbert 


There is also a memory leak at the second 'return NULL;'
in dev_seq_start() .


Re: [PATCH] scsi_debug: Fix off-by-one bug when unmapping region

2012-09-05 Thread Douglas Gilbert


On 12-08-17 12:11 PM, Martin K. Petersen wrote:

"Lukas" == Lukas Czerner  writes:


Lukas> Currently it is possible to unmap one more block than the user
Lukas> requested due to the off-by-one error in unmap_region(). This
Lukas> is probably due to the fact that the end variable despite its
Lukas> name actually points to the last block to unmap + 1. However in
Lukas> the condition it is handled as the last block of the region to
Lukas> unmap.

Acked-by: Martin K. Petersen 


Acked-by: Douglas Gilbert 
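
A minimal sketch of the half-open range convention described in the quoted
text, assuming a simple one-flag-per-block map. unmap_blocks() and its
parameters are hypothetical and much simpler than scsi_debug's real
unmap_region().

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Clears map[lba] .. map[lba + len - 1] and returns how many blocks
 * were cleared.
 */
size_t unmap_blocks(bool *map, size_t lba, size_t len)
{
	size_t end = lba + len;	/* one past the last block to unmap */
	size_t cleared = 0;

	/* writing "lba <= end" here would clear one block too many */
	while (lba < end) {
		map[lba++] = false;
		cleared++;
	}
	return cleared;
}
```

Because end is one past the last block, the loop condition has to be a strict
comparison; treating end as the last block itself is the off-by-one the patch
addresses.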

Re: [PATCH 1/1] [SCSI] scsi_debug: Add "removable" parameter

2012-09-06 Thread Douglas Gilbert


On 12-09-06 06:04 AM, Martin Pitt wrote:

Add "removable" module parameter to set the "removable" attribute of any
subsequently created debug block device. It is a writable driver option, so
that you can switch between removable and "fixed" media block devices in
between the add_host calls.

This is useful for being able to test the different behaviour/required
privileges in e. g. the udisks test suite.

Signed-off-by: Martin Pitt 
Acked-By: David Zeuthen 


Acked-by: Douglas Gilbert 


Re: scsi target, likely GPL violation

2012-11-11 Thread Douglas Gilbert


On 12-11-11 04:34 AM, James Bottomley wrote:

On Wed, 2012-11-07 at 08:50 -0800, Andy Grover wrote:

Nick,

Your company appears to be shipping kernel features in RTS OS that are
not made available under the GPL, specifically support for the
EXTENDED_COPY and COMPARE_AND_WRITE SCSI commands, in order to claim
full Vmware vSphere 5 VAAI support.

http://www.risingtidesystems.com/storage.html
http://www.linux-iscsi.org/wiki/VAAI

Private emails to you and RTS CEO Marc Fleischmann have not elicited a
useful response.

You are subsystem maintainer for the in-kernel SCSI target support
(drivers/target/*), and your company appears to be violating the GPL.
Please explain.


Can we please cool it with the inflammatory accusations.  Please
remember that statements which damage or seek to damage the reputation
of a company amount to libel even under US law ... and using phrases
like "appears to" doesn't shield you from this.

I also note that whatever their website says RTS OS isn't in VMware's
certified compatibility list:

http://www.vmware.com/resources/compatibility/pdf/vi_io_guide.pdf

Plus it's a grey area what you actually have to support to make that
list (especially as XCOPY has now been removed from SBC-3 in favour of
token copy), so I'd say that the chain of reasoning you've used to come
up with this hearsay allegation of copyright violation is tenuous at
best.


The SCSI EXTENDED COPY command (also known by the abbreviation "XCOPY")
is specified in the SPC (SCSI Primary Commands) series of standards, not
the SBC (SCSI Block Commands) series. Yes, it has been enhanced in the
SPC-4 drafts (what you term as "token copy") but as far as I can
determine, still allows for the original EXTENDED COPY command usage.
EXTENDED COPY was first standardized in SPC-2 (ANSI INCITS 351-2001) in
2001. The most recent SPC standard is SPC-3 (ANSI INCITS 408-2005) and
if VMWare don't mention some other SCSI standard or draft, then the
SPC-3 specification of the EXTENDED COPY command should be the
reference. And that is prior to the addition of the "token copy"
functionality.

The latest released version of my sg3_utils package (1.34) contains
a contributed sg_xcopy utility that invokes the SCSI EXTENDED COPY
command. At this time it does not support the recently added "token
copy" functionality.


Anybody who does enforcement will tell you that you begin with first
hand proof of a violation.  That means obtain the product and make sure
it's been modified and that a request for corresponding source fails.
In this case, since I presume Red Hat, as a RTS partner, has a bona fide
copy of the RTS OS, please verify it does indeed implement or issue the
commands which are not in the public git repository and that whoever
owns the copy makes a request for the source code.

I would really appreciate it if the next email I see from you on this
subject is either

  1. Yes, I've got first hand proof of a GPL violation (in which case
 we'll then move to seeing how we can remedy this) or
  2. A genuine public apology for the libel, which I'll do my best to
 prevail on RTS to accept.


Sorry to add another category.

Doug Gilbert


Because any further discussion of unsubstantiated allegations of this
nature exposes us all to jeopardy of legal sanction.

James



Re: [PATCH 3/3] target/iblock: Add WRITE_SAME w/ UNMAP=0 emulation support

2012-11-15 Thread Douglas Gilbert


On 12-11-15 06:04 AM, Christoph Hellwig wrote:

+   /*
+* Enable WRITE_SAME emulation for IBLOCK, use scsi_debug.c default
+*/


Why would we care what scsi_debug.c uses?


Would you prefer no hint of where the magic number came
from? At least somebody who cares when they see that comment
might contact Martin Petersen and ask why he chose that
value.

t10.org documents are not big on sensible default values;
my favourite is where "0" means vendor specific.

Doug Gilbert


+   dev->dev_attrib.max_write_same_len = 0x;

if (blk_queue_nonrot(q))
dev->dev_attrib.is_nonrot = 1;
@@ -375,22 +379,80 @@ err:
return ret;
  }



+static struct bio *iblock_get_bio(struct se_cmd *, sector_t, u32);
+static void iblock_submit_bios(struct bio_list *, int);
+static void iblock_complete_cmd(struct se_cmd *);


I'd suggest moving the write_same callback below these to avoid
forward declarations.


+   if (cmd->se_cmd_flags & SCF_WRITE_SAME_DISCARD) {


I'd probably add separate write_same and write_same_unmap members to
the sbc_ops structure.  That'll keep decoding which one is used in the
SBC code, and it'll keep the implementations nicely separated.


+   if (sectors > cmd->se_dev->dev_attrib.max_write_same_len) {


This sort of check should stay in the SBC code.


+   sg = &cmd->t_data_sg[0];


Btw, it seems like we don't bother to ensure the S/G list length
is just one sector for WRITE SAME with either the unmap bit set or not.

Also please add testcases for WRITE SAME including corner cases like
incorrect transfer length to the scsi testsuite to ensure this code
has proper QA coverage.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[PATCH] atmel_serial oops when peripheral clock misconfigured

2012-11-09 Thread Douglas Gilbert


In lk 3.7.0-rc4 when a peripheral clock is not found for
a serial port the atmel_serial driver brings down the
kernel with an oops during boot-up. This impacts the
Atmel AT91 family of MCUs.

For example, arch/arm/mach-at91/at91sam9x5.c does not
properly specify the peripheral clocks for the UTXD0/URXD0
and UTXD1/URXD1 serial ports. Selecting either of those
ports in a dts file will crash the kernel. at91sam9x5.c needs
to be fixed but stopping atmel_serial from crashing the kernel
is more urgent.

Patch attached for your consideration.

Signed-off-by: Douglas Gilbert 
diff --git a/drivers/tty/serial/atmel_serial.c b/drivers/tty/serial/atmel_serial.c
index 3d7e1ee..cc385e0 100644
--- a/drivers/tty/serial/atmel_serial.c
+++ b/drivers/tty/serial/atmel_serial.c
@@ -1457,8 +1457,9 @@ static void __devinit atmel_of_init_port(struct atmel_uart_port *atmel_port,
 
 /*
  * Configure the port from the platform device resource info.
+ * Returns 0 for success or 1 in case of error.
  */
-static void __devinit atmel_init_port(struct atmel_uart_port *atmel_port,
+static int __devinit atmel_init_port(struct atmel_uart_port *atmel_port,
   struct platform_device *pdev)
 {
 	struct uart_port *port = &atmel_port->uart;
@@ -1496,6 +1497,8 @@ static void __devinit atmel_init_port(struct atmel_uart_port *atmel_port,
 	/* for console, the clock could already be configured */
 	if (!atmel_port->clk) {
 		atmel_port->clk = clk_get(&pdev->dev, "usart");
+		if (IS_ERR(atmel_port->clk))
+			return 1;	/* peripheral clock not found */
 		clk_enable(atmel_port->clk);
 		port->uartclk = clk_get_rate(atmel_port->clk);
 		clk_disable(atmel_port->clk);
@@ -1511,6 +1514,7 @@ static void __devinit atmel_init_port(struct atmel_uart_port *atmel_port,
 	} else {
 		atmel_port->tx_done_mask = ATMEL_US_TXRDY;
 	}
+	return 0;
 }
 
 /*
@@ -1666,13 +1670,18 @@ static int __init atmel_console_init(void)
 		struct atmel_uart_data *pdata =
 			atmel_default_console_device->dev.platform_data;
 		int id = pdata->num;
+		int ret;
 		struct atmel_uart_port *port = &atmel_ports[id];
 
 		port->backup_imr = 0;
 		port->uart.line = id;
 
 		add_preferred_console(ATMEL_DEVICENAME, id, NULL);
-		atmel_init_port(port, atmel_default_console_device);
+		ret = atmel_init_port(port, atmel_default_console_device);
+		if (ret) {
+			pr_err("No peripheral clock for Atmel console ??\n");
+			return -EINVAL;
+		}
 		register_console(&atmel_console);
 	}
 
@@ -1803,7 +1812,12 @@ static int __devinit atmel_serial_probe(struct platform_device *pdev)
 	port->backup_imr = 0;
 	port->uart.line = ret;
 
-	atmel_init_port(port, pdev);
+	ret = atmel_init_port(port, pdev);
+	if (ret) {
+		ret = -EINVAL;
+		pr_err("peripheral clock not found for serial port\n");
+		goto err;
+	}
 
 	if (!atmel_use_dma_rx(&port->uart)) {
 		ret = -ENOMEM;
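
The check added by the patch relies on the kernel convention that clk_get()
reports failure with an error-encoded pointer rather than NULL, so a plain
NULL test does not catch it. The sketch below imitates that convention in
user space. err_ptr(), is_err(), lookup_clk(), init_port() and the clock rate
are all hypothetical illustrations, not the atmel_serial code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ERRNO 4095

/* User-space imitations of the kernel's ERR_PTR()/IS_ERR() helpers. */
static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct clk { unsigned long rate; };
struct port { struct clk *clk; unsigned long uartclk; };

/* Made-up rate, used only so the example has something to read back. */
static struct clk usart_clk = { 133000000UL };

/* Hypothetical lookup: only the name "usart" is known. */
struct clk *lookup_clk(const char *name)
{
	if (strcmp(name, "usart") == 0)
		return &usart_clk;
	return err_ptr(-ENOENT);
}

/* Returns 0 for success or 1 if no clock was found, as in the patch. */
int init_port(struct port *p, const char *clk_name)
{
	p->clk = lookup_clk(clk_name);
	if (is_err(p->clk))
		return 1;	/* never dereference an error pointer */
	p->uartclk = p->clk->rate;
	return 0;
}
```

Returning early keeps the caller in control: it can log the missing clock and
fail the probe instead of dereferencing a pointer that only encodes an errno.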

Re: [PATCH v3 0/5] rtc-at91rm9200: add shadow interrupt mask

2013-05-29 Thread Douglas Gilbert

On 13-05-29 04:41 PM, Robert Nelson wrote:

On Wed, May 29, 2013 at 3:33 PM, Andrew Morton wrote:

On Thu, 23 May 2013 10:38:50 +0200 Johan Hovold wrote:

This is an update of the shadow-interrupt-mask series against v3.10-rc2.

I guess we need Atmel to confirm that all sam9x5 SoCs are indeed
affected. If not, then some probing mechanism as the one Doug suggested
could be implemented on top of (a subset of) these patches. What do you
say, Nicolas?

Note that the first patch (adding a missing OF compile guard) could be
applied straight away.

At this stage it is unclear to me how to proceed with patches 2-5.

fyi:

A version of these patches had been applied once before:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0ef1594c017521ea89278e80fe3f80dafb17abde

But due to a few issues it was later reverted:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e24b0bfa2f0446ffaad2661040be23668133aef8

Strange life of a patch. Mine was the original, Johan Hovold
objected and had it reverted. Johan then presented his first
patch then v2. They got lost in the weeds.

My hardware was still broken and this bug caused collateral
damage. My original patch no longer applied to lk 3.10.0-rc1
so I rewrote it, borrowing some of Johan's ideas and doing a
probe time check for the broken RTC_IMR. That patch was
presented about a week ago:
http://marc.info/?l=linux-arm-kernel&m=136917492531478&w=2
The top of that post gives some more background.

That prompted Johan to produce v3 of his patch which is the
subject of this thread. I was hoping that Nicolas Ferre would
comment or ack one of these patches. Still waiting.

I have a copy of the original, publicly released manual for
the at91sam9g25 (a member of the at91sam9x5 family) marked
"11032A–ATARM–27-Jul-11". It contains the following:
Errata
49.3.1
RTC: Interrupt Mask Register cannot be used
Interrupt Mask Register reading always returns 0.

Both Rev B and Rev C of that manual drop that particular
erratum. My g25 SoC-based subsystems come from an Atmel
partner and still have the RTC IMR bug.

Doug Gilbert
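
A probe-time check of the kind mentioned above could look roughly like the
following. This is a sketch against a simulated register block, assuming that
a working IMR reports a bit after it has been enabled; struct sim_rtc and all
function names are hypothetical and are not taken from either patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated RTC: writes to IER/IDR update the mask, IMR reads it back. */
struct sim_rtc {
	uint32_t mask;		/* interrupts really enabled in "hardware" */
	bool imr_broken;	/* true: IMR reads always return 0 */
};

void rtc_write_ier(struct sim_rtc *r, uint32_t bits) { r->mask |= bits; }
void rtc_write_idr(struct sim_rtc *r, uint32_t bits) { r->mask &= ~bits; }

uint32_t rtc_read_imr(const struct sim_rtc *r)
{
	return r->imr_broken ? 0 : r->mask;
}

/*
 * Probe-time check: enable one interrupt, read the mask back, then
 * disable it again. A working IMR reports the bit; a broken one reads 0.
 */
bool rtc_imr_is_broken(struct sim_rtc *r, uint32_t test_bit)
{
	bool broken;

	rtc_write_ier(r, test_bit);
	broken = (rtc_read_imr(r) & test_bit) == 0;
	rtc_write_idr(r, test_bit);
	return broken;
}
```

A driver that detects the fault this way can fall back to a software copy of
the mask only on affected parts and leave older SoCs untouched.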


Re: [PATCH v3 0/5] rtc-at91rm9200: add shadow interrupt mask

2013-05-30 Thread Douglas Gilbert


On 13-05-30 03:36 PM, Andrew Morton wrote:

On Thu, 30 May 2013 09:50:27 +0200 Nicolas Ferre wrote:


The review of this patch series was in my TODO list for some time...

Today, I magically took time to review it ;-)
The patch series is good and (even if it is too late) here is my:

Acked-by: Nicolas Ferre 

I do not know if the series can be stacked for inclusion in 3.10-rc but
the resolution of this bug can help a lot (as Douglas is saying in
subsequent email...).


We can do that, but looking through the discussion and changelogs I
can't seem to find a usable description of what impact the bug (and its
fix) have upon end-users.

A nicely packaged description of that impact would help grease the
wheels, please.


How about this:

The members of Atmel's at91sam9x5 family (9x5) have
a broken RTC interrupt mask register (AT91_RTC_IMR).
It does not reflect enabled interrupts but instead
always returns zero.

The kernel's rtc-at91rm9200 driver handles the RTC
for the 9x5 family. Currently when the date/time is
set, an interrupt is generated and this driver neglects
to handle the interrupt. The kernel complains about the
un-handled interrupt and disables it henceforth. This
not only breaks the RTC function, but since that
interrupt is shared (Atmel's SYS interrupt) then other
things break as well (e.g. the debug port no longer
accepts characters).

Tested on the at91sam9g25. Bug confirmed by Atmel.

Edit as you please.

Doug Gilbert
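
Why a mask that always reads zero leaves the interrupt unhandled can be shown
with a small model. It assumes the handler decides whether an interrupt is
its own by masking the status with the interrupt mask; struct rtc_state and
the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct rtc_state {
	uint32_t status;	/* pending events, as a status register reports */
	uint32_t shadow_imr;	/* software copy of the enabled interrupts */
};

/* Keep the shadow in step with every enable and disable. */
void rtc_irq_enable(struct rtc_state *s, uint32_t bits)
{
	s->shadow_imr |= bits;
}

void rtc_irq_disable(struct rtc_state *s, uint32_t bits)
{
	s->shadow_imr &= ~bits;
}

/*
 * Returns true if the interrupt was ours and was handled.
 * imr is whatever mask the caller trusts: the shadow copy, or the
 * value read from the hardware register.
 */
bool rtc_handle_irq(struct rtc_state *s, uint32_t imr)
{
	uint32_t pending = s->status & imr;

	if (!pending)
		return false;	/* looks like "not mine": left unhandled */
	s->status &= ~pending;	/* acknowledge what was handled */
	return true;
}
```

Passing 0 as imr models the broken register: the pending event is never
claimed, which matches the unhandled-interrupt behaviour described above.
Passing shadow_imr models the fix.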


[Announce] sg3_utils-1.36 available

2013-06-02 Thread Douglas Gilbert


sg3_utils is a package of command line utilities for sending
SCSI and some ATA commands to devices. This package targets
the Linux 3, 2.6 and 2.4 kernel series. It also has ports to
FreeBSD, Tru64, Solaris, and Windows (cygwin and mingw).

Mainly small changes and fixes in this version including
several contributed additions to sg_xcopy. There is also
improved handling of 64 bit LUNs which may facilitate full
64 bit LUN support being added to the Linux kernel. This
version tracks various changes made by www.t10.org since
January 2013.

For an overview of sg3_utils and downloads see this page:
http://sg.danny.cz/sg/sg3_utils.html
The sg_ses utility (for enclosure devices) is discussed at:
http://sg.danny.cz/sg/sg_ses.html
The SG_IO ioctl is discussed at:
http://sg.danny.cz/sg/sg_io.html
A full changelog can be found at:
http://sg.danny.cz/sg/p/sg3_utils.ChangeLog

A release announcement will be sent to freecode.com .

Changelog for sg3_utils-1.36 [20130531] [svn: r497]
  - sg_vpd: Protocol-specific port information VPD page
for SAS SSP, persistent connection (spl3r2), power
disable (spl3r3)
- block device characteristics: add FUAB bit
  - sg_xcopy: handle more descriptor types; handle zero
maximum segment length; allow list IDs to be disabled;
improve skip/seek handling; allow xcopy on destination
  - sg_reset: add --no-esc option to stop reset escalation
- clean up cli, add long option names
  - sg_luns: add --test=ALUN option for decoding LUNs
- decoded luns output in decimal or hex (if -HH given)
- add '--linux' option to show Linux LUN after T10
  representation, can map one to the other
  - sg_inq: add --vendor option to show standard inquiry's
vendor specific fields in ASCII
- take resid into account with response output
  - sg_sync: add --16 (for 16 byte command) and --timeout=
  - sg_logs: add data compression page (ssc4)
  - sg_sat_set_features: increase --lba from 1 to 4 bytes
  - sg_write_same: add --ndob option (sbc3r35d)
  - sg_map: mark as deprecated
  - sginfo: mark as deprecated, especially -l (list)
  - sg_lib: improve snprintf handling
  - sg_lib_data: sync asc/ascq codes with T10 20130117
  - sg_cmds (lib): if noisy given, give more UA info
  - make code more C++ friendly

Changelog for sg3_utils-1.35 [20130117] [svn: r476]
...

Doug Gilbert

Re: [PATCH v3 4/4] scsi_debug: fix do_device_access() with wrap around range

2013-06-24 Thread Douglas Gilbert


On 13-06-23 02:37 PM, Akinobu Mita wrote:

do_device_access() is a function that abstracts copying SG list from/to
ramdisk storage (fake_storep).

It must deal with the ranges exceeding actual fake_storep size, because
such ranges are valid if virtual_gb is set greater than zero, and they
should be treated as if fake_storep were repeatedly mirrored up to the
virtual size.

Unfortunately, it can't deal with the range which wraps around the end of
fake_storep. A wrap around range is copied by two sg_copy_{from,to}_buffer()
calls, but sg_copy_{from,to}_buffer() can't copy from/to in the middle of
SG list, therefore the second call can't copy correctly.

This fixes it by using sg_pcopy_{from,to}_buffer() that can copy from/to
the middle of SG list.

This also simplifies the assignment of sdb->resid in fill_from_dev_buffer(),
because fill_from_dev_buffer() is now only called once per command
execution cycle.  So it is not necessary to take care to decrease
sdb->resid if fill_from_dev_buffer() is called more than once.

Signed-off-by: Akinobu Mita 
Cc: "James E.J. Bottomley" 
Cc: Douglas Gilbert 
Cc: linux-s...@vger.kernel.org
---

* No change from v2


Acked-by: Douglas Gilbert 
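
The wrap-around case in the quoted description can be illustrated with flat
buffers. This is a user-space sketch, not scsi_debug code: read_wrapped() is
a hypothetical helper, memcpy() stands in for the scatterlist copy routines,
and only a single wrap (len no larger than store_len) is handled.

```c
#include <stddef.h>
#include <string.h>

/*
 * Reads len bytes starting at offset off of a virtual device whose
 * contents are store[] repeated end to end. Requires store_len > 0 and
 * len <= store_len, so the range wraps at most once.
 */
void read_wrapped(unsigned char *dst, const unsigned char *store,
		  size_t store_len, size_t off, size_t len)
{
	size_t start = off % store_len;
	size_t first = len;
	size_t rest = 0;

	if (start + len > store_len) {
		first = store_len - start;	/* up to the end of the store */
		rest = len - first;		/* continues from its beginning */
	}
	memcpy(dst, store + start, first);
	/* the second copy must land "first" bytes into dst, not at its start */
	if (rest)
		memcpy(dst + first, store, rest);
}
```

The dst + first offset in the second copy plays the role that the skip
argument of the sg_pcopy helpers plays for a scatter-gather list: a copy
routine that can only start at the beginning of the destination cannot place
the wrapped part correctly.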


[ANNOUNCE] sdparm 1.08 available

2013-06-09 Thread Douglas Gilbert


sdparm is a command line utility designed to get and set
SCSI device parameters (cf hdparm for ATA disks). The
parameters are held in mode pages. Apart from SCSI devices
(e.g. disks, tapes and enclosures) sdparm can be used on
any device that uses a SCSI command set. Almost all CD/DVD/BD
drives use the SCSI MMC set irrespective of the transport.
sdparm also can decode VPD pages including the device
identification page. Commands to start and stop the media;
load and unload removable media and some other housekeeping
functions are supported. sdparm supports the Linux
kernel 2.4, 2.6 and 3 series with ports to FreeBSD, Solaris,
Tru64 and Windows.

The version tracks changes in draft standards from
www.t10.org since January 2012.

For more information and downloads see:
http://sg.danny.cz/sg/sdparm.html

ChangeLog for sdparm-1.08 [20130606] [svn: r215]
  - device id VPD: add protocol specific port identifier
  - control extension mpage: add max sense data length
  - power condition mpage: FIDCPC->CCF_IDLE,
FSBCPC->CCF_STAND, FSTCPC->CCF_STOPP (spc4r34+)
  - caching mpage: add SYNC_PROG field (sbc3r33)
  - block device characteristics VPD page additions sbc3r34
  - extended inquiry vpd page: add max supported sense data
length
  - protocol-specific port information VPD page for SAS SSP,
persistent connection (spl3r2), power disable (spl3r3)
  - allow --readonly with --set= and --clear=
  - add placeholder for third party copy VPD page
  - supply more information if a UA occurs
  - add Makefile so scripts/sas_disk_blink installed
  - scripts/scsi_ch_swp: new, uses sdparm and blockdev
  - ./configure options:
- change --enable-no-linux-bsg to --disable-linuxbsg
- add --disable-scsistrings to reduce utility size
  with non-libsgutils build
  - point svn:externals to rev 498 of sg3_utils
- report sdat_ovfl bit (if set) in sense data
- sg_pt_linux: expand DID_ (host_byte) codes
  - cope with a transport error plus sense data
  - prefer major() over MAJOR() macro
- win32: fixes for cygwin version 1.7.17 headers

ChangeLog for sdparm-1.07 [20120121] [svn: r188]
...


Doug Gilbert

Re: [PATCH] ARM: at91: add Acme Systems Aria G25 board

2013-03-25 Thread Douglas Gilbert


On 13-03-25 08:22 AM, Jean-Christophe PLAGNIOL-VILLARD wrote:

On 09:49 Mon 25 Mar , Nicolas Ferre wrote:

From: Douglas Gilbert 

Signed-off-by: Douglas Gilbert 
Signed-off-by: Nicolas Ferre 
---
  arch/arm/boot/dts/ariag25.dts | 168 ++
  1 file changed, 168 insertions(+)
  create mode 100644 arch/arm/boot/dts/ariag25.dts

diff --git a/arch/arm/boot/dts/ariag25.dts b/arch/arm/boot/dts/ariag25.dts
new file mode 100644
index 000..d18ef50
--- /dev/null
+++ b/arch/arm/boot/dts/ariag25.dts
@@ -0,0 +1,168 @@
+/*
+ * ariag25.dts - Device Tree file for Acme Systems Aria G25 (AT91SAM9G25 based)
+ *
+ * Copyright (C) 2013 Douglas Gilbert ,
+ *Robert Nelson 
+ *
+ * Licensed under GPLv2 or later.
+ */
+/dts-v1/;
+/include/ "at91sam9g25.dtsi"
+
+/ {
+   model = "Acme Systems Aria G25";
+   compatible = "acme,ariag25", "atmel,at91sam9g25ek", "atmel,at91sam9x5ek",
+"atmel,at91sam9x5", "atmel,at91sam9";

I doubt the code is compatible with the 9g25ek

especially when you do not include it

+
+   aliases {
+   serial4 = &usart3;
+   serial5 = &uart0;
+   };

you need to specify all

+
+   chosen {
+   bootargs = "console=ttyS0,115200 root=/dev/mmcblk0p2 rw rootwait";
+   };
+
+   memory {
+   /* 128 MB, change this for 256 MB revision */
+   reg = <0x2000 0x800>;
+   };
+
+   clocks {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   ranges;
+
+   main_clock: clock@0 {
+   compatible = "atmel,osc", "fixed-clock";
+   clock-frequency = <1200>;
+   };
+   };
+
+   ahb {
+   apb {
+   mmc0: mmc@f0008000 {
+   /* N.B. Aria has no SD card detect (CD), assumed present */
+
+   pinctrl-0 = <
+   &pinctrl_mmc0_slot0_clk_cmd_dat0
+   &pinctrl_mmc0_slot0_dat1_3>;
+   status = "okay";
+   slot@0 {
+   reg = <0>;
+   bus-width = <4>;
+   };
+   };
+
+   i2c0: i2c@f801 {
+   status = "okay";
+   };
+
+   i2c1: i2c@f8014000 {
+   status = "okay";
+   };
+
+   /* TWD2+TCLK2 hidden behind ethernet, so no i2c2 */
+
+   usart0: serial@f801c000 {
+   pinctrl-0 = <&pinctrl_usart0
+&pinctrl_usart0_rts
+&pinctrl_usart0_cts>;
+   status = "okay";
+   };
+
+   usart1: serial@f802 {
+   pinctrl-0 = <&pinctrl_usart1
+/* &pinctrl_usart1_rts */
+/* &pinctrl_usart1_cts */
+   >;
+   status = "okay";
+   };
+
+   usart2: serial@f8024000 {
+   /* cannot activate RTS2+CTS2, clash with
+* ethernet on PB0 and PB1 */
+   pinctrl-0 = <&pinctrl_usart2>;
+   status = "okay";
+   };
+
+   usart3: serial@f8028000 {
+   compatible = "atmel,at91sam9260-usart";
+   reg = <0xf8028000 0x200>;
+   interrupts = <8 4 5>;
+   pinctrl-names = "default";
+   pinctrl-0 = <&pinctrl_usart3
+/* &pinctrl_usart3_rts */
+/* &pinctrl_usart3_cts */
+   >;
+   status = "okay";
+   };
+
+   macb0: ethernet@f802c000 {
+   phy-mode = "rmii";
+   /* following can be overwritten by uboot 'fdt set' command */
+   local-mac-address = [00 04 25 dd 10 01];

drop this,

Re: [PATCH] ARM: at91: add Acme Systems Aria G25 board

2013-03-25 Thread Douglas Gilbert


On 13-03-25 10:31 AM, Jean-Christophe PLAGNIOL-VILLARD wrote:

On 09:48 Mon 25 Mar , Douglas Gilbert wrote:

On 13-03-25 08:22 AM, Jean-Christophe PLAGNIOL-VILLARD wrote:

On 09:49 Mon 25 Mar , Nicolas Ferre wrote:

From: Douglas Gilbert 

Signed-off-by: Douglas Gilbert 
Signed-off-by: Nicolas Ferre 
---
  arch/arm/boot/dts/ariag25.dts | 168 ++
  1 file changed, 168 insertions(+)
  create mode 100644 arch/arm/boot/dts/ariag25.dts

diff --git a/arch/arm/boot/dts/ariag25.dts b/arch/arm/boot/dts/ariag25.dts
new file mode 100644
index 000..d18ef50
--- /dev/null
+++ b/arch/arm/boot/dts/ariag25.dts
@@ -0,0 +1,168 @@
+/*
+ * ariag25.dts - Device Tree file for Acme Systems Aria G25 (AT91SAM9G25 based)
+ *
+ * Copyright (C) 2013 Douglas Gilbert ,
+ *Robert Nelson 
+ *
+ * Licensed under GPLv2 or later.
+ */
+/dts-v1/;
+/include/ "at91sam9g25.dtsi"
+
+/ {
+   model = "Acme Systems Aria G25";
+   compatible = "acme,ariag25", "atmel,at91sam9g25ek", "atmel,at91sam9x5ek",
+"atmel,at91sam9x5", "atmel,at91sam9";

I doubt the code is compatible with the 9g25ek

especially when you do not include it

+
+   aliases {
+   serial4 = &usart3;
+   serial5 = &uart0;
+   };

you need to specify all

+
+   chosen {
+   bootargs = "console=ttyS0,115200 root=/dev/mmcblk0p2 rw rootwait";
+   };
+
+   memory {
+   /* 128 MB, change this for 256 MB revision */
+   reg = <0x2000 0x800>;
+   };
+
+   clocks {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   ranges;
+
+   main_clock: clock@0 {
+   compatible = "atmel,osc", "fixed-clock";
+   clock-frequency = <1200>;
+   };
+   };
+
+   ahb {
+   apb {
+   mmc0: mmc@f0008000 {
+   /* N.B. Aria has no SD card detect (CD), assumed present */
+
+   pinctrl-0 = <
+   &pinctrl_mmc0_slot0_clk_cmd_dat0
+   &pinctrl_mmc0_slot0_dat1_3>;
+   status = "okay";
+   slot@0 {
+   reg = <0>;
+   bus-width = <4>;
+   };
+   };
+
+   i2c0: i2c@f801 {
+   status = "okay";
+   };
+
+   i2c1: i2c@f8014000 {
+   status = "okay";
+   };
+
+   /* TWD2+TCLK2 hidden behind ethernet, so no i2c2 */
+
+   usart0: serial@f801c000 {
+   pinctrl-0 = <&pinctrl_usart0
+&pinctrl_usart0_rts
+&pinctrl_usart0_cts>;
+   status = "okay";
+   };
+
+   usart1: serial@f802 {
+   pinctrl-0 = <&pinctrl_usart1
+/* &pinctrl_usart1_rts */
+/* &pinctrl_usart1_cts */
+   >;
+   status = "okay";
+   };
+
+   usart2: serial@f8024000 {
+   /* cannot activate RTS2+CTS2, clash with
+* ethernet on PB0 and PB1 */
+   pinctrl-0 = <&pinctrl_usart2>;
+   status = "okay";
+   };
+
+   usart3: serial@f8028000 {
+   compatible = "atmel,at91sam9260-usart";
+   reg = <0xf8028000 0x200>;
+   interrupts = <8 4 5>;
+   pinctrl-names = "default";
+   pinctrl-0 = <&pinctrl_usart3
+/* &pinctrl_usart3_rts */
+/* &pinctrl_usart3_cts */
+   >;
+   status = "okay";
+   };
+
+   macb0: ethernet@f802c000 {
+   phy-mode = "rmii";
+   /* following can be overwritten by uboot '

Re: [PATCH v2] [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal

2013-08-15 Thread Douglas Gilbert


On 13-08-15 12:45 PM, Roland Dreier wrote:

Jens / James, do you guys plan to send this to Linus for 3.11?
Triggering this bug is a bit esoteric but the impact is pretty nasty
(corrupting an unrelated process).


The patch is fine with me. Even though the sg driver is
named in the patch title, I note that the v2 patch is
only against the block layer. Hence it does not need an
ack from me.

Doug Gilbert



Re: [SCSI REGRESSION] 3.10.2 or 3.10.3: arcmsr failure at bootup / early userspace transition

2013-07-29 Thread Douglas Gilbert


On 13-07-29 05:09 PM, Nix wrote:

On 29 Jul 2013, Bernd Schubert uttered the following:


On 07/29/2013 03:05 PM, Nix wrote:

On 29 Jul 2013, Bernd Schubert said:


Hi Nick,

On 07/29/2013 12:10 PM, Nick Alcock wrote:

arcmsr0: abort device command of scsi id = 0 lun = 1
arcmsr0: abort device command of scsi id = 0 lun = 0
arcmsr: executing bus reset eh.num_resets=0, num_[...]

arcmsr0: wait 'abort all outstanding command' timeout
arcmsr0: executing hw bus reset 
arcmsr0: waiting for hw bus reset return, retry=0
arcmsr0: waiting for hw bus reset return, retry=1
Areca RAID Controller0: F/W V1.46 2009-01-06 & Model ARC-1210
arcmsr: scsi  bus reset eh returns with success
[and back to the top of the error messages again, apparently forever,
not that the machine would be much use without its RAID array even
if this loop terminated at some point, so I only gave it a couple
of minutes]

The failure happens precisely at the moment we transition to early
userspace, so presumably userspace I/O is failing (or something related
to raw device access, perhaps, since the first thing it does is a
vgscan).

I haven't bisected yet (sorry, I have work to do which means this
machine must be running right now), but nothing has changed in the
arcmsr controller, nor in SCSI-land excepting

commit 98dcc2946adbe4349ef1ef9b99873b912831edd4
Author: Martin K. Petersen 
Date:   Thu Jun 6 22:15:55 2013 -0400


I can now confirm that reverting this commit causes this problem to go
away, and my machine boots fine again.

Please revert (and figure out what is wrong so that 3.11 doesn't
implode in the same way? I'm happy to assist...)


Hi,
Please supply the information that Martin Petersen asked
for.

I just examined a more recent Areca SAS RAID controller
and would describe it as the SCSI device from hell. One solution
to this problem is to modify the arcmsr driver so it returns
a more consistent set of lies to the management SCSI commands that
Martin is asking about.

Doug Gilbert


Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open

2013-07-31 Thread Douglas Gilbert


On 13-07-22 01:03 PM, Jörn Engel wrote:

On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:


There is a race when open sg with O_EXCL flag. Also a race may happen between
sg_open and sg_remove.

Changes from v4:
  * [3/4] use ERR_PTR series instead of adding another parameter in sg_add_sfp
  * [4/4] fix conflict for cherry-pick from v3.

Changes from v3:
  * release o_sem in sg_release(), not in sg_remove_sfp().
  * not set exclude with sfd_lock held.

Vaughan Cao (4):
   [SCSI] sg: use rwsem to solve race during exclusive open
   [SCSI] sg: no need sg_open_exclusive_lock
   [SCSI] sg: checking sdp->detached isn't protected when open
   [SCSI] sg: push file descriptor list locking down to per-device
 locking

  drivers/scsi/sg.c | 178 +-
  1 file changed, 83 insertions(+), 95 deletions(-)


Patchset looks good to me, although I didn't test it on hardware yet.
Signed-off-by: Joern Engel 

James, care to pick this up?


Acked-by: Douglas Gilbert 

Tested O_EXCL with multiple processes and threads; passed.
sg driver prior to this patch had "leaky" O_EXCL logic
according to the same test. Block device passed.

James, could you clean this up:
  drivers/scsi/sg.c:242:6: warning: unused variable ‘res’ [-Wunused-variable]

Doug Gilbert



Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open

2013-08-02 Thread Douglas Gilbert


On 13-08-01 01:01 AM, Douglas Gilbert wrote:

On 13-07-22 01:03 PM, Jörn Engel wrote:

On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:


There is a race when open sg with O_EXCL flag. Also a race may happen between
sg_open and sg_remove.

Changes from v4:
  * [3/4] use ERR_PTR series instead of adding another parameter in sg_add_sfp
  * [4/4] fix conflict for cherry-pick from v3.

Changes from v3:
  * release o_sem in sg_release(), not in sg_remove_sfp().
  * not set exclude with sfd_lock held.

Vaughan Cao (4):
   [SCSI] sg: use rwsem to solve race during exclusive open
   [SCSI] sg: no need sg_open_exclusive_lock
   [SCSI] sg: checking sdp->detached isn't protected when open
   [SCSI] sg: push file descriptor list locking down to per-device
 locking

  drivers/scsi/sg.c | 178 +-
  1 file changed, 83 insertions(+), 95 deletions(-)


Patchset looks good to me, although I didn't test it on hardware yet.
Signed-off-by: Joern Engel 

James, care to pick this up?


Acked-by: Douglas Gilbert 

Tested O_EXCL with multiple processes and threads; passed.
sg driver prior to this patch had "leaky" O_EXCL logic
according to the same test. Block device passed.

James, could you clean this up:
   drivers/scsi/sg.c:242:6: warning: unused variable ‘res’ [-Wunused-variable]


Further testing suggests this patch on the sg driver is
broken, so I'll rescind my ack.

The case it is broken for is when a device is opened
without O_EXCL. Now if, while it is open, a second
thread/process tries to open the same device O_EXCL
then IMO the second open should fail with EBUSY.

My testing shows that O_EXCL opens properly deflect
other O_EXCL opens.

BTW the standard block driver (e.g. /dev/sdc) is broken
in exactly the same way, according to my tests.
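
For anyone wanting to reproduce the case described above, a minimal sketch of a single probe follows. The helper name probe_excl_open() is made up for illustration and is not taken from sg_tst_excl or any other existing tool. It makes one non-blocking O_EXCL open and reports the result as a negated errno, so an exclusive open that is refused shows up as -EBUSY.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: try one non-blocking exclusive open of a device
 * node. Returns 0 if the open succeeded (the descriptor is closed again
 * straight away) or a negated errno value such as -EBUSY.
 */
int probe_excl_open(const char *dev_path)
{
    int fd;

    assert(dev_path != NULL);
    fd = open(dev_path, O_RDWR | O_EXCL | O_NONBLOCK);
    if (fd < 0)
        return -errno;
    close(fd);
    return 0;
}
```

To exercise the case in question, hold the device open without O_EXCL in one process (or thread) and call probe_excl_open("/dev/sg5") from another; the device name is only an example. The expectation argued for above is a return of -EBUSY.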

Doug Gilbert



Re: [PATCH 0/9] target: Add support for COMPARE_AND_WRITE (VAAI) emulation

2013-08-20 Thread Douglas Gilbert


On 13-08-20 05:53 PM, Nicholas A. Bellinger wrote:

On Tue, 2013-08-20 at 23:29 +0200, Christoph Hellwig wrote:

On Tue, Aug 20, 2013 at 08:07:51PM +, Nicholas A. Bellinger wrote:


It's also currently lacking the necessary synchronization between I/O
submission of COMPARE_AND_WRITE verify instance and write instance
user data, which is still being worked on in order to avoid additional
overhead in the main I/O fast path.


I don't think merging such a non-conforming implementation makes any sense.



Yes, I don't intend to merge anything that's not fully functional.

The idea was to get review going on these pieces first.  I'll be posting
a PATCH-v2 to complete the implementation over the next few days.


Also for a complex command like this with all its race potential I would
really like to see some test cases to go along with it.



Yes, Eric @ PureStorage has a sg_compare_write that I'm using to test
this.  It's probably about time that this be included in upstream
sg3-utils too..


Changelog for sg3_utils-1.35 [20130117] [svn: r476]
  - sg_compare_and_write: new utility
...

So it has been released for 6 months. Also version 1.36
has been released since then so you might check more
often. Does Eric's version have any improvements over the
version already in sg3_utils? [Apart from a shorter name ...]

Doug Gilbert



Re: [PATCH 0/9] target: Add support for EXTENDED_COPY (VAAI) offload emulation

2013-08-23 Thread Douglas Gilbert


On 13-08-23 04:26 AM, Nicholas A. Bellinger wrote:

From: Nicholas Bellinger 

Hi folks!

This series adds support to target-core for generic EXTENDED_COPY offload
emulation as defined by SPC-4 using virtual (IBLOCK, FILEIO, RAMDISK)
backends.

EXTENDED_COPY is a VMWare ESX VAAI primitive that is used to perform copy
offload, that allows a target to perform local READ + WRITE I/O requests
for bulk data transfers (cloning a virtual machine for example), instead
of requiring these I/Os to actually be sent to/from the requesting SCSI
initiator port.


Recently I have been looking at EXTENDED COPY since
T10 has been working on it. The SCSI opcodes associated
with it (0x83 and 0x84) have been renamed THIRD PARTY
COPY OUT and IN, and each have several service actions.
The Extended copy found in SPC-2 and SPC-3 is now
termed as "LID1" (for List Identifier length of 1 byte)
while the new stuff is termed "LID4" in SPC-4. The
"LID4" variants are "ROD" token based. A ROD
(Representation Of Data) token is 512 bytes long.

sg_xcopy (written by Hannes Reinecke and found in the
sg3_utils package) only supports LID1 but that covers
most of the existing hardware including kit that
supports VMWare ESX VAAI.

As defined by T10 both EXTENDED COPY (LID1) and (LID4)
do copies between disks, tapes and memory in all
combinations (e.g. disk->tape), but maybe not memory
to memory.

There is a stripped down version of EXTENDED COPY(LID4)
that was called XCOPY LITE at one stage that only does
disk to disk copies (based on a ROD token). That is
the basis of Microsoft's Offload Data Transfer (ODX).
It uses commands defined in SBC-3 (e.g. POPULATE TOKEN
and WRITE USING TOKEN).


Confused? I certainly was. Feel free to correct and
clarify the above.

Doug Gilbert




Re: [PATCH 0/9] target: Add support for EXTENDED_COPY (VAAI) offload emulation

2013-08-23 Thread Douglas Gilbert


On 13-08-23 02:33 PM, Martin K. Petersen wrote:

"Doug" == Douglas Gilbert  writes:


Doug> The SCSI opcodes associated with it (0x83 and 0x84) have been
Doug> renamed THIRD PARTY COPY OUT and IN, and

Where did you see that? My SPC still has EXTENDED COPY.


SCSI _opcodes_ == SCSI operation codes. In other words the
name associated with all service actions (commands) that have
as their first byte 0x83 and 0x84 .

spc4r36h.pdf Annex F.3.1 and changed in spc4r34. Yes, well
hidden but IMO useful. So now we can use the term "Third party
copy" to cover:
  - EXTENDED COPY(LID1) and associated commands
also found in SPC-2 and SPC-3 (with the "LID1" suffix)
  - EXTENDED COPY(LID4) and associated commands
  - the XCOPY LITE commands: POPULATE TOKEN and WRITE USING TOKEN
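
The grouping above can be sketched as a small classifier on the first two CDB bytes. This is an illustration only: the enum and function names are invented here, and the service action values (0x00, 0x01, 0x10, 0x11 under opcode 0x83) are quoted from memory of the SPC-4 and SBC-3 drafts, so check them against the current revisions before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define THIRD_PARTY_COPY_OUT 0x83
#define THIRD_PARTY_COPY_IN  0x84

/* Hypothetical classification, not taken from any existing header. */
enum tpc_kind {
    TPC_NOT_TPC,            /* opcode is neither 0x83 nor 0x84 */
    TPC_XCOPY_LID1,
    TPC_XCOPY_LID4,
    TPC_POPULATE_TOKEN,
    TPC_WRITE_USING_TOKEN,
    TPC_OTHER               /* some other third party copy command */
};

enum tpc_kind classify_tpc(const uint8_t *cdb)
{
    assert(cdb != NULL);
    if (cdb[0] == THIRD_PARTY_COPY_IN)
        return TPC_OTHER;   /* the "IN" side is not broken down here */
    if (cdb[0] != THIRD_PARTY_COPY_OUT)
        return TPC_NOT_TPC;

    switch (cdb[1] & 0x1f) {    /* service action: low 5 bits of byte 1 */
    case 0x00: return TPC_XCOPY_LID1;
    case 0x01: return TPC_XCOPY_LID4;
    case 0x10: return TPC_POPULATE_TOKEN;
    case 0x11: return TPC_WRITE_USING_TOKEN;
    default:   return TPC_OTHER;
    }
}
```

The point of the sketch is that the opcode alone no longer names the command; the service action has to be examined as well.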


Doug> Confused? I certainly was.

Yeah, this is UNMAP all over again, just 100 times worse :(


Well what EXTENDED COPY (offload copy) is trying to do
ain't simple but obviously there could be a substantial
performance pay-off.

There is a "Third-party copy implementation and usage" Annex
(D) in spc4r36h.pdf . It could do with some more explanatory
text.


Anyway. Excited to see nab posting the patches! My copy offload code
from the spring has been getting stale both in the T10 and the kernel
sense. But at least we know what I'll be working on next week :)


BTW I have ported the sg_xcopy "LID1" xcopy logic into
my ddpt utility. That gives two advantages:
  - can cope with ibs!=obs
  - runs on OSes other than Linux

Doug Gilbert



Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open

2013-08-05 Thread Douglas Gilbert

On 13-08-04 10:19 PM, vaughan wrote:

On 08/03/2013 01:25 PM, Douglas Gilbert wrote:

On 13-08-01 01:01 AM, Douglas Gilbert wrote:

On 13-07-22 01:03 PM, Jörn Engel wrote:

On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:

There is a race when open sg with O_EXCL flag. Also a race may
happen between
sg_open and sg_remove.

Changes from v4:
   * [3/4] use ERR_PTR series instead of adding another parameter in
sg_add_sfp
   * [4/4] fix conflict for cherry-pick from v3.

Changes from v3:
   * release o_sem in sg_release(), not in sg_remove_sfp().
   * not set exclude with sfd_lock held.

Vaughan Cao (4):
[SCSI] sg: use rwsem to solve race during exclusive open
[SCSI] sg: no need sg_open_exclusive_lock
[SCSI] sg: checking sdp->detached isn't protected when open
[SCSI] sg: push file descriptor list locking down to per-device
  locking

   drivers/scsi/sg.c | 178
+-
   1 file changed, 83 insertions(+), 95 deletions(-)

Patchset looks good to me, although I didn't test it on hardware yet.
Signed-off-by: Joern Engel 

James, care to pick this up?

Acked-by: Douglas Gilbert 

Tested O_EXCL with multiple processes and threads; passed.
sg driver prior to this patch had "leaky" O_EXCL logic
according to the same test. Block device passed.

James, could you clean this up:
drivers/scsi/sg.c:242:6: warning: unused variable ‘res’
[-Wunused-variable]

Further testing suggests this patch on the sg driver is
broken, so I'll rescind my ack.

The case it is broken for is when a device is opened
without O_EXCL. Now if, while it is open, a second
thread/process tries to open the same device O_EXCL
then IMO the second open should fail with EBUSY.

My testing shows that O_EXCL opens properly deflect
other O_EXCL opens.

Hi  Doug,

My test doesn't have this issue. The routine is something as below:

I start three opens without O_EXCL, wait 30s each, and open with
O_EXCL|O_NONBLOCK, it failed with EBUSY.
And I also call myopen with/without O_EXCL many times in background at
the same time, and the test is passed. I don't know why it failed in
your test.

Usage: myopen [-e][-n][-d delay] -f file
   -e: exclude
   -n: nonblock
   -d: delay N seconds and then close.

[root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
[1] 3417
[root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
[2] 3418
[root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
[3] 3419
[root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
max_active_device=6(origin 1)
  def_reserved_size=32768
  >>> device=sg5 scsi5 chan=0 id=1 lun=0   em=0 sg_tablesize=55 excl=0
FD(1): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
cmd_q=0 f_packid=0 k_orphan=0 closed=0
  No requests active
FD(2): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
cmd_q=0 f_packid=0 k_orphan=0 closed=0
  No requests active
FD(3): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
cmd_q=0 f_packid=0 k_orphan=0 closed=0
  No requests active

[root@vacaowol5 16835013]# ./myopen -e -n  -f /dev/sg5 -d 30 &
[4] 3422
[3422:3351] /dev/sg5:exclude: Device or resource busy

[4]+  Exit 1  ./myopen -e -n -f /dev/sg5 -d 30

[root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
max_active_device=6(origin 1)
  def_reserved_size=32768
  >>> device=sg5 scsi5 chan=0 id=1 lun=0   em=0 sg_tablesize=55 excl=0
FD(1): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
cmd_q=0 f_packid=0 k_orphan=0 closed=0
  No requests active
FD(2): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
cmd_q=0 f_packid=0 k_orphan=0 closed=0
  No requests active
FD(3): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
cmd_q=0 f_packid=0 k_orphan=0 closed=0
  No requests active
[root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
[1]   Done./myopen -f /dev/sg5 -d 30
[2]-  Done./myopen -f /dev/sg5 -d 30
[3]+  Done./myopen -f /dev/sg5 -d 30

Hi,
After the initial failures about 36 hours ago, retesting
yesterday and today has not produced any unexpected
failures. And I have been trying hard on lk 3.10.4 and
lk 3.10.5 .

My test program is a bit more intense than yours and can
be found in the sg3_utils beta in the News section of this
page:
  http://sg.danny.cz/sg/

It is in the examples directory, two variants called
sg_tst_excl and sg_tst_excl2 . You will need a recent gcc
compiler, IOW something that can compile c++11 . gcc 4.7.3
in Ubuntu 13.04 only just manages, fedora 19 should do
better with gcc 4.8.1 . The threading is implemented using
pthreads so it should be reliable.

Typically I run multiple instances (processes) and each has
multiple threads. One instance can run '-x' which will cause
its first thread not to use O_EXCL **. All my tests currently
use O_NONBLOCK and that leads to lots of EBUSYs (someti

Re: [PATCH] [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal

2013-08-05 Thread Douglas Gilbert


Roland,
When this sg code was originally designed, there wasn't a bio
in sight :-)

Now I'm trying to get my head around this. We have launched
a "data-in" SCSI command like READ(10) and the DMA is underway
so we are waiting for a "done" indication. Instead we receive
a signal interruption. It is not clear to me why that DMA
would not just keep going unless we can get to something that
can stop or redirect the DMA. That something is more likely to
be the low level driver being used rather than the block layer.
In the original design to cope with this the destination pages
were locked in memory until the DMA completed.

So originally the design was to allow for this case at the top
of the waterfall. Now it seems there is bio magic going on
half way down the waterfall in the case of a signal interruption.


BTW, the keep_orphan logic probably only works for the
asynchronous sg interface (i.e. write sg_io_hdr then read response)
rather than the synchronous SG_IO ioctl. To support the keep_orphan
the user would need to do a read() on the sg device after the
SG_IO ioctl was interrupted.
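
A rough sketch of the asynchronous usage referred to above, assuming the declarations in <scsi/sg.h>. The three wrapper names are invented for illustration; only SG_SET_KEEP_ORPHAN, sg_io_hdr_t, write() and read() come from the real interface. Each wrapper returns 0 or a negated errno.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <scsi/sg.h>

/* Ask the sg driver to keep orphaned responses on this descriptor. */
int sg_enable_keep_orphan(int sg_fd)
{
    int on = 1;

    return ioctl(sg_fd, SG_SET_KEEP_ORPHAN, &on) < 0 ? -errno : 0;
}

/* Asynchronous submit: write() the header, the command is queued. */
int sg_submit(int sg_fd, const sg_io_hdr_t *hdr)
{
    assert(hdr != NULL);
    return write(sg_fd, hdr, sizeof(*hdr)) < 0 ? -errno : 0;
}

/* Collect a response later with read() on the same descriptor. */
int sg_collect(int sg_fd, sg_io_hdr_t *hdr)
{
    assert(hdr != NULL);
    return read(sg_fd, hdr, sizeof(*hdr)) < 0 ? -errno : 0;
}
```

With keep_orphan set, the same sg_collect() call is what a program would have to issue after an interrupted SG_IO ioctl in order to pick up the orphaned response.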


Anyway, this obviously needs to be fixed.

Doug Gilbert

On 13-08-05 06:02 PM, Roland Dreier wrote:

From: Roland Dreier 

There is a nasty bug in the SCSI SG_IO ioctl that in some circumstances
leads to one process writing data into the address space of some other
random unrelated process if the ioctl is interrupted by a signal.
What happens is the following:

  - A process issues an SG_IO ioctl with direction DXFER_FROM_DEV (ie the
underlying SCSI command will transfer data from the SCSI device to
the buffer provided in the ioctl)

  - Before the command finishes, a signal is sent to the process waiting
in the ioctl.  This will end up waking up the sg_ioctl() code:

result = wait_event_interruptible(sfp->read_wait,
(srp_done(sfp, srp) || sdp->detached));

but neither srp_done() nor sdp->detached is true, so we end up just
setting srp->orphan and returning to userspace:

srp->orphan = 1;
write_unlock_irq(&sfp->rq_list_lock);
return result;  /* -ERESTARTSYS because signal hit process */

At this point the original process is done with the ioctl and
blithely goes ahead handling the signal, reissuing the ioctl, etc.

  - Eventually, the SCSI command issued by the first ioctl finishes and
ends up in sg_rq_end_io().  At the end of that function, we run through:

write_lock_irqsave(&sfp->rq_list_lock, iflags);
if (unlikely(srp->orphan)) {
if (sfp->keep_orphan)
srp->sg_io_owned = 0;
else
done = 0;
}
srp->done = done;
write_unlock_irqrestore(&sfp->rq_list_lock, iflags);

if (likely(done)) {
/* Now wake up any sg_read() that is waiting for this
 * packet.
 */
wake_up_interruptible(&sfp->read_wait);
kill_fasync(&sfp->async_qp, SIGPOLL, POLL_IN);
kref_put(&sfp->f_ref, sg_remove_sfp);
} else {
INIT_WORK(&srp->ew.work, sg_rq_end_io_usercontext);
schedule_work(&srp->ew.work);
}

Since srp->orphan *is* set, we set done to 0 (assuming the
userspace app has not set keep_orphan via an SG_SET_KEEP_ORPHAN
ioctl), and therefore we end up scheduling sg_rq_end_io_usercontext()
to run in a workqueue.

  - In workqueue context we go through sg_rq_end_io_usercontext() ->
sg_finish_rem_req() -> blk_rq_unmap_user() -> ... ->
bio_uncopy_user() -> __bio_copy_iov() -> copy_to_user().

The key point here is that we are doing copy_to_user() on a
workqueue -- that is, we're on a kernel thread with current->mm
equal to whatever random previous user process was scheduled before
this kernel thread.  So we end up copying whatever data the SCSI
command returned to the virtual address of the buffer passed into
the original ioctl, but it's quite likely we do this copying into a
different address space!
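
For readers less familiar with the user-space side, the kind of request that starts this sequence can be sketched as below. It prepares an sg_io_hdr for a data-in command (INQUIRY is used purely as an example); prep_inquiry() and INQ_CDB_LEN are names made up for this sketch, and the 20 second timeout is an arbitrary choice.

```c
#include <assert.h>
#include <string.h>
#include <scsi/sg.h>

#define INQ_CDB_LEN 6

/* Fill in an SG_IO request whose data flows from the device to resp. */
void prep_inquiry(sg_io_hdr_t *hdr, unsigned char cdb[INQ_CDB_LEN],
                  unsigned char *resp, unsigned char resp_len,
                  unsigned char *sense, unsigned char sense_len)
{
    assert(hdr && cdb && resp && sense);

    memset(cdb, 0, INQ_CDB_LEN);
    cdb[0] = 0x12;              /* INQUIRY opcode */
    cdb[4] = resp_len;          /* allocation length, low byte */

    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id = 'S';
    hdr->dxfer_direction = SG_DXFER_FROM_DEV;
    hdr->cmdp = cdb;
    hdr->cmd_len = INQ_CDB_LEN;
    hdr->dxferp = resp;         /* the user buffer the copy targets */
    hdr->dxfer_len = resp_len;
    hdr->sbp = sense;
    hdr->mx_sb_len = sense_len;
    hdr->timeout = 20000;       /* milliseconds */
}
```

The request is then issued with ioctl(sg_fd, SG_IO, &hdr). The dxferp pointer is only meaningful in the address space of the calling process, which is why a copy_to_user() performed later from a workqueue can land in some other process.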

Fix this by telling sg_finish_rem_req() whether we're on a workqueue
or not, and if we are, calling a new function blk_rq_unmap_user_nocopy()
that does everything the original blk_rq_unmap_user() does except
calling copy_{to,from}_user().  This requires a few levels of plumbing
through a "copy" flag in the bio layer.

I also considered fixing this by having the sg code just set
BIO_NULL_MAPPED for bios that are unmapped from a workqueue, which
happens to work because the __free_page() part of __bio_copy_iov()
isn't needed for sg (because sg handles its own pages).  However, this
seems coincidental and fragile, so I preferred making the fix
explicit, at the cost of minor tweaks to the bio code.

Huge thanks to Costa Sapuntzakis  for the
original pointer to this bug in the sg code.

Signed-off-by: Roland Dreier 
Cc: 
---
  block/blk-

Re: [PATCH] [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal

2013-08-06 Thread Douglas Gilbert


On 13-08-05 11:54 PM, Peter Chang wrote:

2013/8/5 Roland Dreier :

From: Roland Dreier 

There is a nasty bug in the SCSI SG_IO ioctl that in some circumstances
leads to one process writing data into the address space of some other
random unrelated process if the ioctl is interrupted by a signal.
What happens is the following:

  - A process issues an SG_IO ioctl with direction DXFER_FROM_DEV (ie the
underlying SCSI command will transfer data from the SCSI device to
the buffer provided in the ioctl)

  - Before the command finishes, a signal is sent to the process waiting
in the ioctl.  This will end up waking up the sg_ioctl() code:

 result = wait_event_interruptible(sfp->read_wait,
 (srp_done(sfp, srp) || sdp->detached));

but neither srp_done() nor sdp->detached is true, so we end up just
setting srp->orphan and returning to userspace:

 srp->orphan = 1;
 write_unlock_irq(&sfp->rq_list_lock);
 return result;  /* -ERESTARTSYS because signal hit process */

At this point the original process is done with the ioctl and
blithely goes ahead handling the signal, reissuing the ioctl, etc.


i think that an additional issue here is that part of reissuing the
ioctl is re-queueing the command. since the re-queue is at the front
of the block queue there are issues if the command is non-idempotent.


Re-issuing a SG_IO ioctl is wrong. More and more SCSI commands (even
in SBC-3) are non-idempotent (e.g. COMPARE AND WRITE). And the st
driver gets to use the block layer as well and many of its important
SCSI commands (SSC) are non-idempotent.


we have a local fix that gets rid of most of the orphan stuff and
re-waiting if a non-fatal signal was waiting. simpler than unmapping
but maybe we're missing some other interesting case?


Like to share that fix with us?

Also I'm interested in how you know from within a kernel driver
whether a signal sent to the controlling process is fatal or
not? For example SIGIO's default action is terminate but sg
assumes if the controlling process requests SIGIO generation
then it will at least override that default action and
handle it sensibly. Is there a way to check that assumption?

Looked at bsg in the situation where a signal interrupts a
SG_IO ioctl. It seems broken; anyone like to comment on this
snippet from bsg if a signal hits the first call:
blk_execute_rq(bd->queue, NULL, rq, at_head);
ret = blk_complete_sgv4_hdr_rq(rq, &hdr, bio, bidi_bio);
if (copy_to_user(uarg, &hdr, sizeof(hdr)))
return -EFAULT;


As an aside, I got tired of handling signals during SCSI commands
in the ddpt utility, especially after adding tape support. So
it now masks all the usual suspects during the IO then checks
for signals in a small window between IOs. Non-maskable signals
will still terminate the process and the user gets no guarantees,
but even in that termination case it would be reasonable to expect
that the interrupted SCSI command was _not_ resubmitted.
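
The masking approach can be sketched roughly as follows. This is not the ddpt code: the function names, the io_fn callback type and the particular set of signals are all assumptions made for illustration, and a threaded program would use pthread_sigmask() rather than sigprocmask().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

typedef int (*io_fn)(void *arg);

/*
 * Run one I/O step with some common signals blocked. On return *pending
 * holds the first blocked signal that arrived during the I/O (0 if none).
 * The previous mask is restored before returning, so a pending signal is
 * delivered in the window between I/Os rather than in the middle of one.
 */
int io_with_signals_blocked(io_fn do_io, void *arg, int *pending)
{
    static const int sigs[] = { SIGINT, SIGQUIT, SIGPIPE, SIGUSR1 };
    const size_t n = sizeof(sigs) / sizeof(sigs[0]);
    sigset_t block, old, pend;
    size_t i;
    int res;

    assert(do_io != NULL && pending != NULL);
    sigemptyset(&block);
    for (i = 0; i < n; i++)
        sigaddset(&block, sigs[i]);
    sigprocmask(SIG_BLOCK, &block, &old);

    res = do_io(arg);           /* e.g. one ioctl(fd, SG_IO, &hdr) */

    *pending = 0;
    sigpending(&pend);
    for (i = 0; i < n; i++) {
        if (sigismember(&pend, sigs[i])) {
            *pending = sigs[i];
            break;
        }
    }
    sigprocmask(SIG_SETMASK, &old, NULL);
    return res;
}

/* Hypothetical stand-in for the real I/O; it just returns *arg. */
int fake_io(void *arg)
{
    return *(int *)arg;
}
```

A copy loop would call io_with_signals_blocked() once per SCSI command and decide between commands whether to stop, which keeps any single command from being interrupted and resubmitted.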

Doug Gilbert


Re: [PATCH v2] [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal

2013-08-07 Thread Douglas Gilbert


On 13-08-07 11:50 AM, Roland Dreier wrote:

On Wed, Aug 7, 2013 at 7:38 AM, David Milburn  wrote:

I was able to successfully test this patch overnight. I had been
experimenting with the sg driver setting the BIO_NULL_MAPPED flag in
sg_rq_end_io_usercontext for an orphan process, which prevented the
corruption, but your solution seems much better.


Very cool, thanks for the testing.

I actually looked at using BIO_NULL_MAPPED as well, but it seemed a
bit too fragile to me -- it had the right effect of skipping
__bio_copy_iov(), and skipping the __free_pages() stuff in there is OK
because sg owns its pages rather than the bio layer, but all that
seemed vulnerable to being broken by an unrelated change.

Out of curiosity, were you already working on this bug?  Because if
you had fixed it a few weeks earlier we might not have spent so long
wondering WTF was stomping on the memory of one of our processes :)


Roland,
So what kind of signal was leading to your "stomping on the memory"?
Was it user generated or something like SIGIO, SIGPIPE or a RT signal?

To get around the SG_IO ioctl restart problem (for non-idempotent
SCSI commands) could we replace a -ERESTARTSYS return value
with -EINTR ?
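
Until something like that changes in the driver, user space can at least avoid the transparent restart. When a system call returns -ERESTARTSYS inside the kernel, it is restarted only if the handler for the interrupting signal was installed with SA_RESTART; otherwise the caller sees EINTR. A minimal sketch, with hypothetical names, of installing a handler that way:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t got_signal;

/* Record which signal arrived; the main loop inspects got_signal. */
static void note_signal(int signo)
{
    got_signal = signo;
}

/*
 * Install note_signal() for signo without SA_RESTART, so a system call
 * interrupted by that signal fails with EINTR instead of being restarted.
 * Returns 0 on success, -1 on failure (as sigaction() does).
 */
int install_no_restart(int signo)
{
    struct sigaction sa;

    assert(signo > 0);
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = note_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* deliberately no SA_RESTART */
    return sigaction(signo, &sa, NULL);
}
```

A caller that then gets EINTR from the SG_IO ioctl should treat the outcome of the command as unknown rather than issue it again, for the non-idempotent reasons given above.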

As I noted in a previous post, for robust user space code using the
SG_IO ioctl, masking signals during the IO may help.


And what about bsg? Is it any better or worse than sg in the case
of interrupted SG_IO ioctls? Apart from the interface (sg_io_hdr
v3 versus v4) it should be a drop in replacement for sg.

Doug Gilbert



Re: [PATCH] ARM: at91: add missing uart clocks DT entries

2013-08-07 Thread Douglas Gilbert


On 13-08-07 12:29 PM, Boris BREZILLON wrote:

Add clocks to clock lookup table for uart DT entries.

Signed-off-by: Boris BREZILLON 

Tested-by: Douglas Gilbert 

Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open

2013-07-25 Thread Douglas Gilbert


On 13-07-25 11:32 AM, vaughan wrote:

On 07/23/2013 01:03 AM, Jörn Engel wrote:

On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:

There is a race when open sg with O_EXCL flag. Also a race may happen between
sg_open and sg_remove.

Changes from v4:
  * [3/4] use ERR_PTR series instead of adding another parameter in sg_add_sfp
  * [4/4] fix conflict for cherry-pick from v3.

Changes from v3:
  * release o_sem in sg_release(), not in sg_remove_sfp().
  * not set exclude with sfd_lock held.

Vaughan Cao (4):
   [SCSI] sg: use rwsem to solve race during exclusive open
   [SCSI] sg: no need sg_open_exclusive_lock
   [SCSI] sg: checking sdp->detached isn't protected when open
   [SCSI] sg: push file descriptor list locking down to per-device
 locking

  drivers/scsi/sg.c | 178 +-
  1 file changed, 83 insertions(+), 95 deletions(-)

Patchset looks good to me, although I didn't test it on hardware yet.
Signed-off-by: Joern Engel 

James, care to pick this up?

Jörn

Hi James,

sg driver has two races happen in
  a) exclusive open and non-exclusive open
  b) sg removal and sg open
I explained the scenario detail in the separate patches. I did test
those patches and
Jörn has reviewed them.  I got no response from Doug Gilbert for a long
time.
Would you care to pick these up?


Hi,
Your patches applied with a little noise to lk 3.10.2 and
gave this warning from the build.

  CC [M]  drivers/scsi/sg.o
drivers/scsi/sg.c: In function ‘sg_open’:
drivers/scsi/sg.c:242:6: warning: unused variable ‘res’ [-Wunused-variable]

I'll keep testing ...

Doug Gilbert



Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open

2013-08-12 Thread Douglas Gilbert

On 13-08-12 10:46 PM, vaughan wrote:

On 08/06/2013 04:52 AM, Douglas Gilbert wrote:

On 13-08-04 10:19 PM, vaughan wrote:

On 08/03/2013 01:25 PM, Douglas Gilbert wrote:

On 13-08-01 01:01 AM, Douglas Gilbert wrote:

On 13-07-22 01:03 PM, Jörn Engel wrote:

On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:

There is a race when open sg with O_EXCL flag. Also a race may
happen between
sg_open and sg_remove.

Changes from v4:
* [3/4] use ERR_PTR series instead of adding another parameter in
sg_add_sfp
* [4/4] fix conflict for cherry-pick from v3.

Changes from v3:
* release o_sem in sg_release(), not in sg_remove_sfp().
* not set exclude with sfd_lock held.

Vaughan Cao (4):
 [SCSI] sg: use rwsem to solve race during exclusive open
 [SCSI] sg: no need sg_open_exclusive_lock
 [SCSI] sg: checking sdp->detached isn't protected when open
 [SCSI] sg: push file descriptor list locking down to per-device
   locking

drivers/scsi/sg.c | 178
+-
1 file changed, 83 insertions(+), 95 deletions(-)

Patchset looks good to me, although I didn't test it on hardware yet.
Signed-off-by: Joern Engel 

James, care to pick this up?

Acked-by: Douglas Gilbert 

Tested O_EXCL with multiple processes and threads; passed.
sg driver prior to this patch had "leaky" O_EXCL logic
according to the same test. Block device passed.

James, could you clean this up:
 drivers/scsi/sg.c:242:6: warning: unused variable ‘res’
[-Wunused-variable]

Further testing suggests this patch on the sg driver is
broken, so I'll rescind my ack.

The case it is broken for is when a device is opened
without O_EXCL. Now if, while it is open, a second
thread/process tries to open the same device O_EXCL
then IMO the second open should fail with EBUSY.

My testing shows that O_EXCL opens properly deflect
other O_EXCL opens.

Hi  Doug,

My test doesn't have this issue. The routine is something as below:

I start three opens without O_EXCL, wait 30s each, and open with
O_EXCL|O_NONBLOCK, it failed with EBUSY.
And I also call myopen with/without O_EXCL many times in background at
the same time, and the test is passed. I don't know why it failed in
your test.

Usage: myopen [-e][-n][-d delay] -f file
-e: exclude
-n: nonblock
-d: delay N seconds and then close.

[root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
[1] 3417
[root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
[2] 3418
[root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
[3] 3419
[root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
max_active_device=6(origin 1)
   def_reserved_size=32768
   >>> device=sg5 scsi5 chan=0 id=1 lun=0   em=0 sg_tablesize=55 excl=0
 FD(1): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
 cmd_q=0 f_packid=0 k_orphan=0 closed=0
   No requests active
 FD(2): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
 cmd_q=0 f_packid=0 k_orphan=0 closed=0
   No requests active
 FD(3): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
 cmd_q=0 f_packid=0 k_orphan=0 closed=0
   No requests active

[root@vacaowol5 16835013]# ./myopen -e -n  -f /dev/sg5 -d 30 &
[4] 3422
[3422:3351] /dev/sg5:exclude: Device or resource busy

[4]+  Exit 1  ./myopen -e -n -f /dev/sg5 -d 30

[root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
max_active_device=6(origin 1)
   def_reserved_size=32768
   >>> device=sg5 scsi5 chan=0 id=1 lun=0   em=0 sg_tablesize=55 excl=0
 FD(1): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
 cmd_q=0 f_packid=0 k_orphan=0 closed=0
   No requests active
 FD(2): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
 cmd_q=0 f_packid=0 k_orphan=0 closed=0
   No requests active
 FD(3): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0
 cmd_q=0 f_packid=0 k_orphan=0 closed=0
   No requests active
[root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
[1]   Done./myopen -f /dev/sg5 -d 30
[2]-  Done./myopen -f /dev/sg5 -d 30
[3]+  Done./myopen -f /dev/sg5 -d 30

Hi,
After the initial failures about 36 hours ago, retesting
yesterday and today has not produced any unexpected
failures. And I have been trying hard on lk 3.10.4 and
lk 3.10.5 .

My test program is a bit more intense than yours and can
be found in the sg3_utils beta in the News section of this
page:
   http://sg.danny.cz/sg/

It is in the examples directory, two variants called
sg_tst_excl and sg_tst_excl2 . You will need a recent gcc
compiler, IOW something that can compile C++11 . gcc 4.7.3
in Ubuntu 13.04 only just manages, Fedora 19 should do
better with gcc 4.8.1 . The threading is implemented using
pthreads so it should be reliable.
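
For readers without either test program to hand, the check being
exercised can be sketched in a few lines of C. This is not taken from
myopen or sg_tst_excl; the helper names sg_open_flags() and
sg_hold_open() are made up for illustration. It assumes, as the
transcript above suggests, that an O_EXCL|O_NONBLOCK open of an sg node
that is already held open fails with EBUSY.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Build open(2) flags the way the -e and -n options above suggest.
 * Hypothetical helper, not taken from myopen or sg_tst_excl. */
int sg_open_flags(int exclude, int nonblock)
{
    int flags = O_RDWR;

    if (exclude)
        flags |= O_EXCL;      /* ask the sg driver for exclusive access */
    if (nonblock)
        flags |= O_NONBLOCK;  /* fail at once instead of waiting */
    return flags;
}

/* Open dev, hold it for delay_secs, then close it.
 * Returns 0 on success or the errno from open(2), e.g. EBUSY. */
int sg_hold_open(const char *dev, int flags, unsigned int delay_secs)
{
    int fd;

    assert(dev != NULL);
    fd = open(dev, flags);
    if (fd < 0)
        return errno;
    sleep(delay_secs);
    close(fd);
    return 0;
}
```

Running sg_hold_open("/dev/sg5", sg_open_flags(0, 0), 30) in a few
processes and then sg_hold_open("/dev/sg5", sg_open_flags(1, 1), 30) in
another would mirror the session quoted above.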

Typically I run multiple instances (processes) and each has
multiple threads. One instance can run '-x'

[ANNOUNCE] lsscsi version 0.27 released

2013-05-08 Thread Douglas Gilbert


lsscsi is a command line utility that probes sysfs in Linux
2.6 and 3 series kernels in order to list information about
SCSI devices and SCSI hosts. Both a compact format which is
one line per device and a "classic" format (like the output
of 'cat /proc/scsi/scsi') are supported.

Version 0.27 is available at:
http://sg.danny.cz/scsi/lsscsi.html
More information can be found on that page including examples
plus a Download and Build information section containing
tarballs, rpm and deb packages.


ChangeLog:
Version 0.27 2013/05/08 [svn: r111]
  - rework buffer handling for systems with many disks
  - add --lunhex option for displaying LUNs in hex
  - accept LUNs from sysfs as large as a 64 bit unsigned
    decimal number (largest was signed 32 bit decimal)
  - accept LSSCSI_LUNHEX_OPT environment variable
  - add --scsi_id option for /dev/disk/by-id/scsi*

Version 0.26 2012/01/31 [svn: r97]


Doug Gilbert
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] rtc: rtc-at91rm9200: use a variable for storing IMR

2013-03-26 Thread Douglas Gilbert


On 13-03-26 03:27 PM, Johan Hovold wrote:

On Fri, Mar 15, 2013 at 06:37:12PM +0100, Nicolas Ferre wrote:

On some revisions of AT91 SoCs, the RTC IMR register is not working.
Instead of elaborating a workaround for that specific SoC or IP version,
we simply use a software variable to store the Interrupt Mask Register and
modify it for each enabling/disabling of an interrupt. The overhead of this
is negligible anyway.


The patch does not add any memory barriers or register read-backs when
manipulating the interrupt-mask variable. This could possibly lead to
spurious interrupts both when enabling and disabling the various
RTC-interrupts due to write reordering and bus latencies.

Has this been considered? And is this reason enough for a more targeted
work-around so that the SOCs with functional RTC_IMR are not affected?


The SoCs in question use a single embedded ARM926EJ-S and
according to the Atmel documentation, that CPU's instruction
set contains no barrier (or related) instructions.

In the arch/arm/mach-at91 sub-tree of the kernel source
I can find no use of the wmb() call. Also checked all drivers
in the kernel containing "at91" and none called wmb().

Doug Gilbert


Re: [PATCH] rtc: rtc-at91rm9200: use a variable for storing IMR

2013-03-28 Thread Douglas Gilbert


On 13-03-28 05:57 AM, Johan Hovold wrote:

On Tue, Mar 26, 2013 at 05:09:59PM -0400, Douglas Gilbert wrote:

On 13-03-26 03:27 PM, Johan Hovold wrote:

On Fri, Mar 15, 2013 at 06:37:12PM +0100, Nicolas Ferre wrote:

On some revisions of AT91 SoCs, the RTC IMR register is not working.
Instead of elaborating a workaround for that specific SoC or IP version,
we simply use a software variable to store the Interrupt Mask Register and
modify it for each enabling/disabling of an interrupt. The overhead of this
is negligible anyway.


The patch does not add any memory barriers or register read-backs when
manipulating the interrupt-mask variable. This could possibly lead to
spurious interrupts both when enabling and disabling the various
RTC-interrupts due to write reordering and bus latencies.

Has this been considered? And is this reason enough for a more targeted
work-around so that the SOCs with functional RTC_IMR are not affected?


The SoCs in question use a single embedded ARM926EJ-S and
according to the Atmel documentation, that CPU's instruction
set contains no barrier (or related) instructions.


The ARM926EJ-S actually does have a Drain Write Buffer instruction but
it's not used by the ARM barrier-implementation unless
CONFIG_ARM_DMA_MEM_BUFFERABLE or CONFIG_SMP is set.


The ARM926EJ-S is ARMv5 so CONFIG_ARM_DMA_MEM_BUFFERABLE is not
available. SMP is not an option for arm/mach-at91.


However, wmb() always implies a compiler barrier which is what is needed
in this case.


Even if wmb() did anything, would it make this case "safe"?


In the arch/arm/mach-at91 sub-tree of the kernel source
I can find no use of the wmb() call. Also checked all drivers
in the kernel containing "at91" and none called wmb().


I/O-operations are normally not reordered, but this patch is faking a
hardware register and thus extra care needs to be taken.

To repeat:


@@ -198,9 +203,12 @@ static int at91_rtc_alarm_irq_enable(struct device *dev, unsigned int enabled)

   if (enabled) {
   at91_rtc_write(AT91_RTC_SCCR, AT91_RTC_ALARM);
+ at91_rtc_imr |= AT91_RTC_ALARM;


Here a barrier is needed to prevent the compiler from reordering the two
writes (i.e., mask update and interrupt enable).


Isn't either order potentially unsafe? So even if the compiler
did foolishly re-order them, the sequence is still unsafe when
a SYS interrupt splits those two lines (since the SYS interrupt
is shared, it can occur at any time).


   at91_rtc_write(AT91_RTC_IER, AT91_RTC_ALARM);
- } else
+ } else {
   at91_rtc_write(AT91_RTC_IDR, AT91_RTC_ALARM);


Here a barrier is again needed to prevent the compiler from reordering,
but we also need a register read back (of some RTC-register) before
updating the mask. Without the register read back, there could be a
window where the mask does not match the hardware state due to bus
latencies.

Note that even with a register read back there is a (theoretical)
possibility that the interrupts have not yet been disabled when the fake
mask is updated. The only way to know for sure is to poll RTC_IMR but
that is the very register you're trying to emulate.


+ at91_rtc_imr &= ~AT91_RTC_ALARM;
+ }

   return 0;
}
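
The ordering being argued about in the hunk above can be sketched in
user space. This is only an illustration under stated assumptions: the
"hardware" is a plain variable, RTC_ALARM stands in for AT91_RTC_ALARM,
compiler_barrier() mimics what the kernel's barrier() does with GCC, and
all function names are hypothetical. It does not settle whether either
order is safe against a shared SYS interrupt arriving in between.

```c
#include <assert.h>
#include <stdint.h>

#define RTC_ALARM 0x2u  /* stand-in for AT91_RTC_ALARM */

/* Compiler-only barrier: stops the compiler moving memory accesses
 * across this point; it emits no CPU instruction. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Fake device state so the sketch runs in user space. */
static volatile uint32_t hw_enabled;  /* what the "hardware" has enabled */
static uint32_t shadow_imr;           /* software copy of the broken IMR */

static void rtc_write_ier(uint32_t bits) { hw_enabled |= bits; }
static void rtc_write_idr(uint32_t bits) { hw_enabled &= ~bits; }
static uint32_t rtc_read_back(void)      { return hw_enabled; }

void alarm_irq_enable(void)
{
    shadow_imr |= RTC_ALARM;   /* update the mask first ... */
    compiler_barrier();
    rtc_write_ier(RTC_ALARM);  /* ... then let the interrupt through */
}

void alarm_irq_disable(void)
{
    rtc_write_idr(RTC_ALARM);
    (void)rtc_read_back();     /* read back before touching the mask */
    compiler_barrier();
    shadow_imr &= ~RTC_ALARM;
}

/* Bits that are both pending and unmasked, as a handler would test. */
uint32_t rtc_pending(uint32_t status)
{
    return status & shadow_imr;
}

/* Usage: enable, check the mask, disable again. */
void shadow_imr_demo(void)
{
    alarm_irq_enable();
    assert(rtc_pending(RTC_ALARM) == RTC_ALARM);
    alarm_irq_disable();
    assert(rtc_pending(RTC_ALARM) == 0);
}
```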


In the worst-case scenario ignoring the shared RTC-interrupt could lead
to the disabling of the system interrupt and thus also PIT, DBGU, ...


And how often does the AT91_RTC_ALARM alarm interrupt fire?


I think this patch should be reverted and a fix for the broken SoCs be
implemented which does not penalise the other SoCs. That is, only
fall-back to faking IMR on the SoCs where it is actually broken.


Even though I sent a patch to fix this problem to Nicolas,
what was presented is not my version. In mine I added DT
support:

#ifdef CONFIG_OF
static const struct of_device_id at91rm9200_rtc_dt_ids[] = {
   { .compatible = "atmel,at91rm9200-rtc", .data = &at91rm9200_config },
   { .compatible = "atmel,at91sam9x5-rtc", .data = &at91sam9x5_config },
   { /* sentinel */ }
};
MODULE_DEVICE_TABLE(of, at91rm9200_rtc_dt_ids);
#else
#define at91rm9200_rtc_dt_ids NULL
#endif /* CONFIG_OF */


The shadow IMR variable was only active in the
 .compatible = "atmel,at91sam9x5-rtc"
case. That protected all existing users from any problems
that might be introduced.
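
As a rough user-space sketch of that idea (not the code from either
patch): a per-compatible config carries a flag, and the mask is read
from the shadow variable only when the flag is set. The struct and
function names below are hypothetical, and the strcmp() loop merely
stands in for the device-tree matching the kernel does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct rtc_config {
    bool use_shadow_imr;  /* true only where the hardware IMR is broken */
};

static const struct rtc_config at91rm9200_config = { .use_shadow_imr = false };
static const struct rtc_config at91sam9x5_config = { .use_shadow_imr = true };

struct rtc_match {
    const char *compatible;
    const struct rtc_config *config;
};

static const struct rtc_match rtc_matches[] = {
    { "atmel,at91rm9200-rtc", &at91rm9200_config },
    { "atmel,at91sam9x5-rtc", &at91sam9x5_config },
    { NULL, NULL } /* sentinel */
};

/* User-space stand-in for the device-tree match done by the kernel. */
const struct rtc_config *rtc_config_for(const char *compatible)
{
    const struct rtc_match *m;

    assert(compatible != NULL);
    for (m = rtc_matches; m->compatible != NULL; m++)
        if (strcmp(m->compatible, compatible) == 0)
            return m->config;
    return NULL;
}

/* Pick the mask source based on the matched config. */
unsigned int rtc_read_imr(const struct rtc_config *cfg,
                          unsigned int hw_imr, unsigned int shadow_imr)
{
    return cfg->use_shadow_imr ? shadow_imr : hw_imr;
}
```

With this shape, existing "atmel,at91rm9200-rtc" users keep reading the
real register and never see the shadow variable.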

Doug Gilbert


Re: [RFC 5/5] rtc-at91rm9200: add support for at91sam9x5

2013-03-29 Thread Douglas Gilbert


On 13-03-29 12:03 PM, Johan Hovold wrote:

Add support for the at91sam9x5-family which must use the shadow
interrupt mask due to a hardware issue.
---
  drivers/rtc/rtc-at91rm9200.c | 8 
  1 file changed, 8 insertions(+)

diff --git a/drivers/rtc/rtc-at91rm9200.c b/drivers/rtc/rtc-at91rm9200.c
index 2921866..f3e351f 100644
--- a/drivers/rtc/rtc-at91rm9200.c
+++ b/drivers/rtc/rtc-at91rm9200.c
@@ -318,12 +318,20 @@ static irqreturn_t at91_rtc_interrupt(int irq, void *dev_id)
  static const struct at91_rtc_config at91rm9200_config = {
  };

+static const struct at91_rtc_config at91sam9x5_config = {
+   .use_shadow_imr = true,
+};
+
  #if defined(CONFIG_OF)
  static const struct of_device_id at91_rtc_dt_ids[] = {
{
.compatible = "atmel,at91rm9200-rtc",
.data = &at91rm9200_config,
},
+   {
+   .compatible = "atmel,at91sam9x5-rtc",
+   .data = &at91sam9x5_config,
+   },
/* terminator */
}
  };



Johan,
Looks good.

Plus add something like this to at91sam9x5.dtsi after the
i2c@2 entry (at the end):

rtc {
	compatible = "atmel,at91sam9x5-rtc";
	reg = <0xfeb0 0x40>;
	interrupts = <1 4 7>;
	status = "disabled";
};


and an "enabler" in ariag25.dts (and perhaps other members
of the 9x5 sub-family), also at the end:

rtc {
	status = "okay";
};

My patches are in Robert Nelson's tree at:
   http://www.eewiki.net/display/linuxonarm/AT91SAM9x5
in the Linux kernel section. My RTC code amounts to the same
thing as you are proposing, without the safety code around
the IMR shadow.

I provide binaries based on that work to Aria G25 users
via a google group. No-one has complained about RTC not
working. SPI and I2C problems are on-going but gradually
being sorted. Hence I know people are using and testing
this code, other than me.

Doug Gilbert

Re: [patch] [SCSI] scsi_transport_sas: check for allocation failure

2013-03-08 Thread Douglas Gilbert


On 13-03-08 07:02 AM, Dan Carpenter wrote:

Static checkers complain that this allocation isn't checked.  We
should return zero if the allocation fails.

Signed-off-by: Dan Carpenter 

diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
index 1b68142..a022997 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -379,9 +379,12 @@ sas_tlr_supported(struct scsi_device *sdev)
  {
const int vpd_len = 32;
struct sas_end_device *rdev = sas_sdev_to_rdev(sdev);
-   char *buffer = kzalloc(vpd_len, GFP_KERNEL);
+   char *buffer;
int ret = 0;

+   buffer = kzalloc(vpd_len, GFP_KERNEL);
+   if (!buffer)
+   goto out;
if (scsi_get_vpd_page(sdev, 0x90, buffer, vpd_len))
goto out;



For 32 bytes, why not use the stack?

unsigned int
sas_tlr_supported(struct scsi_device *sdev)
{
	unsigned char buffer[32];
	struct sas_end_device *rdev = sas_sdev_to_rdev(sdev);
	int ret = 0;

	if (scsi_get_vpd_page(sdev, 0x90, buffer, sizeof(buffer)))
		goto out;

	/*
	 * The VPD Protocol Specific Logical Unit page (0x90) for SAS
	 * has a 4 byte header and then one descriptor per device port.
	 * The TLR bit is at offset 8 on each port descriptor.
	 * We take the TLR value in the first descriptor.
	 */
	ret = buffer[4 + 8] & 0x01;

 out:
	rdev->tlr_supported = ret;
	return ret;
}


Note the comment is changed.
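
The offset arithmetic in that comment can be checked in isolation. A
minimal sketch, assuming the layout described in the comment (4 byte
header, TLR bit in bit 0 at offset 8 of the first port descriptor); the
function name sas_vpd_tlr_bit() and the two macros are hypothetical, and
the length check is an addition that the kernel function does not have.

```c
#include <assert.h>
#include <stddef.h>

#define VPD_HDR_LEN     4   /* page header, per the comment above */
#define TLR_DESC_OFFSET 8   /* offset of the TLR byte in a port descriptor */

/* Return the TLR bit from the first port descriptor of VPD page 0x90,
 * or 0 if the response is too short to contain it. */
int sas_vpd_tlr_bit(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len <= VPD_HDR_LEN + TLR_DESC_OFFSET)
        return 0;
    return buf[VPD_HDR_LEN + TLR_DESC_OFFSET] & 0x01;
}
```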

Doug Gilbert



Re: [patch] [SCSI] scsi_transport_sas: check for allocation failure

2013-03-08 Thread Douglas Gilbert


On 13-03-08 05:50 PM, James Bottomley wrote:

On Fri, 2013-03-08 at 12:57 -0500, Douglas Gilbert wrote:

On 13-03-08 07:02 AM, Dan Carpenter wrote:

Static checkers complain that this allocation isn't checked.  We
should return zero if the allocation fails.

Signed-off-by: Dan Carpenter 

diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
index 1b68142..a022997 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -379,9 +379,12 @@ sas_tlr_supported(struct scsi_device *sdev)
   {
const int vpd_len = 32;
struct sas_end_device *rdev = sas_sdev_to_rdev(sdev);
-   char *buffer = kzalloc(vpd_len, GFP_KERNEL);
+   char *buffer;
int ret = 0;

+   buffer = kzalloc(vpd_len, GFP_KERNEL);
+   if (!buffer)
+   goto out;
if (scsi_get_vpd_page(sdev, 0x90, buffer, vpd_len))
goto out;



For 32 bytes, why not use the stack?


Because the buffer is a DMA target.  You can't DMA to stack because of
padding and cacheline issues.


And I went to the definition of scsi_get_vpd_page()
to see if that was called out in the header comments.
Guess what ... it is not. And those same header comments
talk about kfree()ing a returned pointer, although the
function fills a caller-supplied buffer and returns an
int. It needs to be cleaned up, IMO.

Doug Gilbert

/**
 * scsi_get_vpd_page - Get Vital Product Data from a SCSI device
 * @sdev: The device to ask
 * @page: Which Vital Product Data to return
 * @buf: where to store the VPD
 * @buf_len: number of bytes in the VPD buffer area
 *
 * SCSI devices may optionally supply Vital Product Data.  Each 'page'
 * of VPD is defined in the appropriate SCSI document (eg SPC, SBC).
 * If the device supports this VPD page, this routine returns a pointer
 * to a buffer containing the data from that page.  The caller is
 * responsible for calling kfree() on this pointer when it is no longer
 * needed.  If we cannot retrieve the VPD page this routine returns %NULL.
 */
int scsi_get_vpd_page(struct scsi_device *sdev, u8 page, unsigned char *buf,
  int buf_len)



Re: [patch] [SCSI] scsi_transport_sas: check for allocation failure

2013-03-11 Thread Douglas Gilbert


On 13-03-11 09:10 AM, Dan Carpenter wrote:

On Fri, Mar 08, 2013 at 10:50:19PM +, James Bottomley wrote:

On Fri, 2013-03-08 at 12:57 -0500, Douglas Gilbert wrote:

On 13-03-08 07:02 AM, Dan Carpenter wrote:

Static checkers complain that this allocation isn't checked.  We
should return zero if the allocation fails.

Signed-off-by: Dan Carpenter 

diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
index 1b68142..a022997 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -379,9 +379,12 @@ sas_tlr_supported(struct scsi_device *sdev)
   {
const int vpd_len = 32;
struct sas_end_device *rdev = sas_sdev_to_rdev(sdev);
-   char *buffer = kzalloc(vpd_len, GFP_KERNEL);
+   char *buffer;
int ret = 0;

+   buffer = kzalloc(vpd_len, GFP_KERNEL);
+   if (!buffer)
+   goto out;
if (scsi_get_vpd_page(sdev, 0x90, buffer, vpd_len))
goto out;



For 32 bytes, why not use the stack?


Because the buffer is a DMA target.  You can't DMA to stack because of
padding and cacheline issues.



I think stack data works here.  scsi_execute() calls
blk_rq_map_kern() which handles stack memory and alignment issues.


That being the case, several other callers of
scsi_get_vpd_page() (and friends) could be
simplified and sped up.

Also since VPD pages don't change and they can carry
a lot of disparate information (e.g. the Extended
Inquiry and Block Limits pages) perhaps they could
be cached by the appropriate level (e.g. Extended
Inquiry cached by mid-level; Block Limits cached
by sd driver).
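
To make the caching idea concrete, here is a small user-space sketch.
Nothing in it is existing kernel code: struct vpd_cache, vpd_cache_get()
and the fetch callback are all hypothetical, and vpd_fake_fetch() only
stands in for issuing a real INQUIRY so the call count can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VPD_CACHE_SLOTS 4
#define VPD_MAX_LEN     64

/* Returns 0 on success, nonzero on failure. */
typedef int (*vpd_fetch_fn)(unsigned char page, unsigned char *buf, size_t len);

struct vpd_cache {
    struct {
        int valid;
        unsigned char page;
        unsigned char data[VPD_MAX_LEN];
    } slot[VPD_CACHE_SLOTS];
    vpd_fetch_fn fetch;   /* hypothetical: would issue the real INQUIRY */
};

void vpd_cache_init(struct vpd_cache *c, vpd_fetch_fn fetch)
{
    assert(c != NULL && fetch != NULL);
    memset(c, 0, sizeof(*c));
    c->fetch = fetch;
}

/* Return cached data for page, fetching it on first use; NULL on fetch
 * failure or when no free slot remains. */
const unsigned char *vpd_cache_get(struct vpd_cache *c, unsigned char page)
{
    int i, free_slot = -1;

    for (i = 0; i < VPD_CACHE_SLOTS; i++) {
        if (c->slot[i].valid && c->slot[i].page == page)
            return c->slot[i].data;
        if (!c->slot[i].valid && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0 || c->fetch(page, c->slot[free_slot].data, VPD_MAX_LEN))
        return NULL;
    c->slot[free_slot].valid = 1;
    c->slot[free_slot].page = page;
    return c->slot[free_slot].data;
}

/* Stand-in fetcher: counts calls and fills in only the page code byte. */
int vpd_fake_fetch_calls;

int vpd_fake_fetch(unsigned char page, unsigned char *buf, size_t len)
{
    vpd_fake_fetch_calls++;
    memset(buf, 0, len);
    buf[1] = page;        /* byte 1 of a VPD page holds the page code */
    return 0;
}
```

The point of the sketch is only that a second request for the same page
(say 0x86, Extended Inquiry) is served without another device access.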

Doug Gilbert



Re: [patch 10/18] sg: nopage

2008-02-07 Thread Douglas Gilbert


For the patch shown below:

Signed-off-by: Douglas Gilbert <[EMAIL PROTECTED]>



[EMAIL PROTECTED] wrote:
Convert SG from nopage to fault.

Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Cc: linux-kernel@vger.kernel.org
---
 drivers/scsi/sg.c |   23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

Index: linux-2.6/drivers/scsi/sg.c
===
--- linux-2.6.orig/drivers/scsi/sg.c
+++ linux-2.6/drivers/scsi/sg.c
@@ -1144,23 +1144,22 @@ sg_fasync(int fd, struct file *filp, int
return (retval < 0) ? retval : 0;
 }

-static struct page *
-sg_vma_nopage(struct vm_area_struct *vma, unsigned long addr, int *type)
+static int
+sg_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
Sg_fd *sfp;
-   struct page *page = NOPAGE_SIGBUS;
unsigned long offset, len, sa;
Sg_scatter_hold *rsv_schp;
struct scatterlist *sg;
int k;

if ((NULL == vma) || (!(sfp = (Sg_fd *) vma->vm_private_data)))
-   return page;
+   return VM_FAULT_SIGBUS;
rsv_schp = &sfp->reserve;
-   offset = addr - vma->vm_start;
+   offset = vmf->pgoff << PAGE_SHIFT;
if (offset >= rsv_schp->bufflen)
-   return page;
-   SCSI_LOG_TIMEOUT(3, printk("sg_vma_nopage: offset=%lu, scatg=%d\n",
+   return VM_FAULT_SIGBUS;
+   SCSI_LOG_TIMEOUT(3, printk("sg_vma_fault: offset=%lu, scatg=%d\n",
   offset, rsv_schp->k_use_sg));
sg = rsv_schp->buffer;
sa = vma->vm_start;
@@ -1169,21 +1168,21 @@ sg_vma_nopage(struct vm_area_struct *vma
len = vma->vm_end - sa;
len = (len < sg->length) ? len : sg->length;
if (offset < len) {
+   struct page *page;
page = virt_to_page(page_address(sg_page(sg)) + offset);
get_page(page); /* increment page count */
-   break;
+   vmf->page = page;
+   return 0; /* success */
}
sa += len;
offset -= len;
}

-   if (type)
-   *type = VM_FAULT_MINOR;
-   return page;
+   return VM_FAULT_SIGBUS;
 }

 static struct vm_operations_struct sg_mmap_vm_ops = {
-   .nopage = sg_vma_nopage,
+   .fault = sg_vma_fault,
 };

 static int
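
The loop in the patch above walks the reserve buffer's segments until
the faulting offset falls inside one. Stripped of kernel types, that
walk looks roughly like the sketch below. It is a simplification: the
name find_segment() is hypothetical, plain lengths replace the
scatterlist, and the clamp against the vma length is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Locate byte `offset` within a list of segment lengths.
 * On success store the segment index and the offset inside that
 * segment and return 0; return -1 when offset lies beyond the last
 * segment (the patch returns VM_FAULT_SIGBUS in that case). */
int find_segment(const size_t *seg_len, size_t nsegs, size_t offset,
                 size_t *seg_idx, size_t *seg_off)
{
    size_t k;

    assert(seg_len != NULL && seg_idx != NULL && seg_off != NULL);
    for (k = 0; k < nsegs; k++) {
        if (offset < seg_len[k]) {
            *seg_idx = k;
            *seg_off = offset;
            return 0;
        }
        offset -= seg_len[k];   /* skip this segment, as the patch does */
    }
    return -1;
}
```

In the patch the starting offset comes from vmf->pgoff << PAGE_SHIFT
rather than from the faulting address minus vma->vm_start.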



Re: [PATCH] scsi_error: Fix language abuse.

2008-02-08 Thread Douglas Gilbert


Alan Cox wrote:

The word "illegal" has a precise dictionary meaning of "prohibited by
law". 


Also "contrary to or forbidden by official rules, regulations, etc".
So word meanings are like standards, there are so many to choose
from.

The error messages are therefore incorrect as so far nobody has
made SCSI violations a criminal offence.


Most USB mass storage implementations should result in jail
terms for the responsible parties.


This corrects scsi to match various other subsystems I've slowly been
ridding of this.

Pedantically-signed-off-by: Alan Cox <[EMAIL PROTECTED]>


Please don't do this.


-   {0x2004, "Illegal command while in write capable state"},
+   {0x2004, "Invalid command while in write capable state"},


Several reasons:
 1) Those strings appear in an international standard.
In this case SPC-3 ANSI INCITS 408-2005.
 2) The way that INCITS standard (and others) is structured,
the relevant part of the text refers to a specific
additional sense code + qualifier instance by _name_
(e.g. "Illegal command while in write capable state")
not by number (e.g. 0x20,0x4). So given the Alan Cox
rendition of that string in a log, a diligent debugger
would need to get to constants.c to find the corresponding
asc/ascq numeric sequence, then search Annex D of SPC-3
to find the t10.org version of that string, then search
the body of SPC-3 and other t10 device class specific
standards for a match on the t10 string.
Vendor product manuals would tend to use the t10.org
strings as well.
 3) I maintain constants.c and I do that with the help
of a program that compares t10's tables (specifically
http://www.t10.org/lists/asc-num.txt ) with the table
in constants.c . I'm glad to send you a copy so you
can take over maintaining the sanitized list and
cope with the abuse that may follow.
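
To illustrate points 2) and 3): the table is keyed by number, but the
standards are searched by name, so the strings have to match t10's
wording exactly. The sketch below is not the comparison program
mentioned above; asc_lookup() and asc_text_matches() are hypothetical
names, and the table holds only a few of the unpatched entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct asc_entry {
    unsigned short code;   /* (asc << 8) | ascq */
    const char *text;      /* wording as it appears in the T10 list */
};

static const struct asc_entry asc_table[] = {
    { 0x2004, "Illegal command while in write capable state" },
    { 0x2006, "Illegal command while in explicit address mode" },
    { 0x2007, "Illegal command while in implicit address mode" },
    { 0x2C05, "Illegal power condition request" },
    { 0x6400, "Illegal mode for this track" },
};

/* Return the string for an asc/ascq pair, or NULL if it is not listed. */
const char *asc_lookup(unsigned char asc, unsigned char ascq)
{
    unsigned short code = (unsigned short)((asc << 8) | ascq);
    size_t i;

    for (i = 0; i < sizeof(asc_table) / sizeof(asc_table[0]); i++)
        if (asc_table[i].code == code)
            return asc_table[i].text;
    return NULL;
}

/* Nonzero when the local string equals the one from the T10 list. */
int asc_text_matches(unsigned char asc, unsigned char ascq, const char *t10_text)
{
    const char *local = asc_lookup(asc, ascq);

    assert(t10_text != NULL);
    return local != NULL && strcmp(local, t10_text) == 0;
}
```

A string changed from "Illegal ..." to "Invalid ..." would make
asc_text_matches() fail for that entry, which is the maintenance cost
being described.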


If you feel strongly about this then you could take it
up with ANSI or BS (a great acronym IMO).

Doug Gilbert



diff -u --new-file --recursive --exclude-from /usr/src/exclude linux.vanilla-2.6.24-mm1/drivers/scsi/constants.c linux-2.6.24-mm1/drivers/scsi/constants.c
--- linux.vanilla-2.6.24-mm1/drivers/scsi/constants.c   2008-02-06 14:14:40.0 +
+++ linux-2.6.24-mm1/drivers/scsi/constants.c   2008-02-06 14:35:16.0 +
@@ -606,10 +606,10 @@
{0x2001, "Access denied - initiator pending-enrolled"},
{0x2002, "Access denied - no access rights"},
{0x2003, "Access denied - invalid mgmt id key"},
-   {0x2004, "Illegal command while in write capable state"},
+   {0x2004, "Invalid command while in write capable state"},
{0x2005, "Obsolete"},
-   {0x2006, "Illegal command while in explicit address mode"},
-   {0x2007, "Illegal command while in implicit address mode"},
+   {0x2006, "Invalid command while in explicit address mode"},
+   {0x2007, "Invalid command while in implicit address mode"},
{0x2008, "Access denied - enrollment conflict"},
{0x2009, "Access denied - invalid LU identifier"},
{0x200A, "Access denied - invalid proxy token"},
@@ -620,7 +620,7 @@
{0x2102, "Invalid address for write"},
{0x2103, "Invalid write crossing layer jump"},

-   {0x2200, "Illegal function (use 20 00, 24 00, or 26 00)"},
+   {0x2200, "Invalid function (use 20 00, 24 00, or 26 00)"},

{0x2400, "Invalid field in cdb"},
{0x2401, "CDB decryption error"},
@@ -697,7 +697,7 @@
{0x2C02, "Invalid combination of windows specified"},
{0x2C03, "Current program area is not empty"},
{0x2C04, "Current program area is empty"},
-   {0x2C05, "Illegal power condition request"},
+   {0x2C05, "Invalid power condition request"},
{0x2C06, "Persistent prevent conflict"},
{0x2C07, "Previous busy status"},
{0x2C08, "Previous task set full status"},
@@ -1014,7 +1014,7 @@
{0x6300, "End of user area encountered on this track"},
{0x6301, "Packet does not fit in available space"},

-   {0x6400, "Illegal mode for this track"},
+   {0x6400, "Invalid mode for this track"},
{0x6401, "Invalid packet size"},

{0x6500, "Voltage fault"},
@@ -1124,7 +1124,7 @@
"Not Ready",  /* 2: The addressed target is not ready */
"Medium Error",   /* 3: Data error detected on the medium */
"Hardware Error",   /* 4: Controller or device failure */
-   "Illegal Request",  /* 5: Error in request */
+   "Invalid Request",  /* 5: Error in request */
"Unit Attention",   /* 6: Removable medium was changed, or
  the target has been reset, or ... */
"Data Protect",   /* 7: Access to the data is blocked */
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html





Re: [PATCH] scsi_error: Fix language abuse.

2008-02-10 Thread Douglas Gilbert


Alan Cox wrote:

On Fri, 08 Feb 2008 20:32:54 -0500
Douglas Gilbert <[EMAIL PROTECTED]> wrote:


Alan Cox wrote:

The word "illegal" has a precise dictionary meaning of "prohibited by
law". 

Also "contrary to or forbidden by official rules, regulations, etc".


The OED I have here doesn't seem to think so, however if the words are
the ones used in the T10 documentation then I'm happy to drop the patch.


The OED (Oxford English dictionary) seems to be at odds
with most other online dictionaries with respect to the
word illegal. I find the following 'illegal' entry
(at www.askoxford.com) daft:
 "USAGE Both illegal and unlawful can mean 'contrary to
  or forbidden by law', but unlawful has a broader meaning
  'not permitted by rules': thus handball in soccer is
  unlawful, but not illegal."

So if we followed the OED's advice then SCSI's "illegal
request" would become "unlawful request". I'm yet to
see the word "unlawful" in any technical standard ...

Doug Gilbert




Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542)

2013-02-13 Thread Douglas Gilbert


On 13-02-13 03:32 AM, Paolo Bonzini wrote:

Il 06/02/2013 16:15, Paolo Bonzini ha scritto:

This series regards the whitelist that is used for the SG_IO ioctl.  This
whitelist has three problems:

* the bitmap of allowed commands is designed for MMC devices (roughly,
   "play/burn CDs without requiring root") but some opcodes overlap across SCSI
   device classes and have different meanings for different classes.

* also because the bitmap of allowed commands is designed for MMC devices
   only, some commands are missing even though they are generally useful and
   not insecure.  At least not more insecure than anything else you can
   do if you have access to /dev/sdX or /dev/stX nodes.

* the whitelist can be disabled per-process but not per-disk.  In addition,
   the required capability (CAP_SYS_RAWIO) gives access to a range of other
   resources, enough to make it insecure.

The series corrects these problems.  Patches 1-4 solve the first problem,
which also has an assigned CVE, by using different bitmaps for the various
device classes.  Patches 5-11 solve the second by adding more commands
to the bitmaps.  Patches 12 and 13 solve the third, and were already
posted but ignored by the maintainers despite multiple pings.

Note: checkpatch hates the formatting of the command table.  I know about this,
and ensured that there are no errors in the rest of the code.  The current
formatting is IMHO quite handy, and roughly based on the files available
from the SCSI standard body.

Ok for the next merge window?

Paolo

v1->v2: remove 2 MMC commands and 6 SBC commands (see patches 6 and 9
 for details).  Added patch 14 and added a few more scanner
 commands based on SANE (scanners are not whitelisted by default,
 also were not in v1, but this makes it possible to opt into the
 whitelist out of paranoia).  Removed C++ comments.  Removed the
 large #if 0'd list of commands that the kernel does not pass
 though.  Marked blk_set_cmd_filter_defaults as __init.


Paolo Bonzini (14):
   sg_io: pass request_queue to blk_verify_command
   sg_io: reorganize list of allowed commands
   sg_io: use different default filters for each device class
   sg_io: resolve conflicts between commands assigned to multiple
 classes (CVE-2012-4542)
   sg_io: whitelist a few more commands for rare & obsolete device types
   sg_io: whitelist another command for multimedia devices
   sg_io: whitelist a few more commands for media changers
   sg_io: whitelist a few more commands for tapes
   sg_io: whitelist a few more commands for disks
   sg_io: whitelist a few obsolete commands
   sg_io: mark blk_set_cmd_filter_defaults as __init
   sg_io: remove remnants of sysfs SG_IO filters
   sg_io: introduce unpriv_sgio queue flag
   sg_io: use unpriv_sgio to disable whitelisting for scanners

  Documentation/block/queue-sysfs.txt |8 +
  block/blk-sysfs.c   |   33 +++
  block/bsg.c |2 +-
  block/scsi_ioctl.c  |  369 ++-
  drivers/scsi/scsi_scan.c|   14 ++-
  drivers/scsi/sg.c   |6 +-
  include/linux/blkdev.h  |8 +-
  include/linux/genhd.h   |9 -
  include/scsi/scsi.h |3 +
  9 files changed, 344 insertions(+), 108 deletions(-)



Ping? I'm not even sure which tree should host these patches...


You are whitelisting SCSI commands so obviously the SCSI tree,
although the patch spills over into the block tree.
Can't see much point in ack-ing the sg changes since most
of the action is at higher levels.

The question I have is what existing code will this change
break (and will I be getting emails from peeved
developers)?

Is 8 lines of documentation changes enough? My guess is
that SG_IO ioctl pass-through users will be tripped up
and it won't be obvious to them to look at
Documentation/block/queue-sysfs.txt
for enlightenment; especially if they are using a char
device node from the bsg, sg or st drivers to issue SG_IO.
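
For context, this is roughly what such a pass-through user does today.
A minimal sketch using the sg v3 interface from <scsi/sg.h>; the helper
names sg_prep_inquiry() and sg_issue() are made up for illustration, and
the remark about EPERM is an expectation about how a filtered command
would surface, not something taken from the patch series.

```c
#include <assert.h>
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

#define INQ_CMD_LEN 6

/* Fill hdr for a standard INQUIRY. The caller supplies all buffers. */
void sg_prep_inquiry(struct sg_io_hdr *hdr, unsigned char cdb[INQ_CMD_LEN],
                     unsigned char *resp, unsigned char resp_len,
                     unsigned char *sense, unsigned char sense_len)
{
    assert(hdr && cdb && resp && sense);
    memset(cdb, 0, INQ_CMD_LEN);
    cdb[0] = 0x12;          /* INQUIRY opcode */
    cdb[4] = resp_len;      /* allocation length */

    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id = 'S';
    hdr->cmd_len = INQ_CMD_LEN;
    hdr->cmdp = cdb;
    hdr->dxfer_direction = SG_DXFER_FROM_DEV;
    hdr->dxfer_len = resp_len;
    hdr->dxferp = resp;
    hdr->mx_sb_len = sense_len;
    hdr->sbp = sense;
    hdr->timeout = 20000;   /* milliseconds */
}

/* Issue the command on an already open fd; returns ioctl()'s result.
 * A command rejected by the filter is expected to fail with EPERM. */
int sg_issue(int fd, struct sg_io_hdr *hdr)
{
    return ioctl(fd, SG_IO, hdr);
}
```

A user who only sees ioctl() fail with a permission error on an opcode
that used to work has little to point them at the block queue sysfs
documentation.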

Doug Gilbert


[Announce] sg3_utils-1.34 available

2012-10-14 Thread Douglas Gilbert


sg3_utils is a package of command line utilities for sending
SCSI and some ATA commands to devices. This package targets
the linux kernel (lk) 3, 2.6 and lk 2.4 series. It also has
ports to FreeBSD, Tru64, Solaris, and Windows (cygwin and mingw).

This version adds sg_xcopy and sg_copy_results (contributed by
Hannes Reinecke). This version tracks various changes made by
www.t10.org since January 2012.

For an overview of sg3_utils and downloads see this page:
http://sg.danny.cz/sg/sg3_utils.html
The sg_ses utility (for enclosure devices) is discussed at:
http://sg.danny.cz/sg/sg_ses.html
The SG_IO ioctl is discussed at:
http://sg.danny.cz/sg/sg_io.html
A full changelog can be found at:
http://sg.danny.cz/sg/p/sg3_utils.ChangeLog

A release announcement will be sent to freecode.com .

Changelog for sg3_utils-1.34 [20121013] [svn: r461]
  - sg_xcopy: new dd like utility for extended copy command
  - sg_copy_results: new utility for receive copy results
  - sg_verify: add 16 byte cdb, bytchk (data-out buffer)
    and group number support
  - sync to spc4r36 and sbc3r32
  - sg_inq: add --export so sg_inq can replace udev's scsi_id
    - decode old EMC Symmetrix abuse of VPD page 0x83
  - sg_vpd: decode old EMC Symmetrix abuse of VPD page 0x83
  - sg_ses: increase max dpage response size to 64 KB
    - allow ident,locate on enclosure controller
    - more sanity for additional element status descriptor
  - sg_sanitize: add --ause, --fail and --test=
  - sg_luns: add long extended flat space addressing format
  - sg_logs: add ATA pass-through results lpage (SAT-2)
  - sg_rtpg: add --extended option
  - sg_senddiag: list rebuild assist diag page name
  - sg_pt_linux: expand DID_ (host_byte) codes
    - cope with a transport error plus sense data
    - prefer major() over MAJOR() macro
  - sg_lib: fix sg_get_command_name() service actions
    - report sdat_ovfl bit (if set) in sense data
    - decode extended_copy and receive_copy service actions
    - decode read_buffer and write_buffer modes
    - decode ATA PT fixed format sense (SAT-2)
  - sg_cmds_extra: add sg_ll_report_tgt_prt_grp2()
  - ./configure options:
    - change --enable-no-linux-bsg to --disable-linuxbsg
    - add --disable-scsistrings to reduce utility sizes

Changelog for sg3_utils-1.33 [20120118] [svn: r435]
...

Doug Gilbert

[Announce] sg3_utils-1.35 available

2013-01-17 Thread Douglas Gilbert


sg3_utils is a package of command line utilities for sending
SCSI and some ATA commands to devices. This package targets
the linux kernel (lk) 3, 2.6 and lk 2.4 series. It also has
ports to FreeBSD, Tru64, Solaris, and Windows (cygwin and mingw).

This version adds sg_compare_and_write (contributed by
Shahar Salzman). Also Hannes Reinecke has done more work on
'sg_inq --export' to support udev. This version tracks various
changes made by www.t10.org since October 2012.

For an overview of sg3_utils and downloads see this page:
http://sg.danny.cz/sg/sg3_utils.html
The sg_ses utility (for enclosure devices) is discussed at:
http://sg.danny.cz/sg/sg_ses.html
The SG_IO ioctl is discussed at:
http://sg.danny.cz/sg/sg_io.html
A full changelog can be found at:
http://sg.danny.cz/sg/p/sg3_utils.ChangeLog

A release announcement will be sent to freecode.com .

Changelog for sg3_utils-1.35 [20130117] [svn: r476]
  - sg_compare_and_write: new utility
  - sg_inq+sg_vpd: block device characteristics VPD page:
    add product_type, WABEREQ, WACEREQ and VBULS fields
  - sg_inq: more --export option changes for udev
  - sg_vpd: add more rdac vendor specific vpd pages
  - sg_verify: add --ebytchk option for sbc3r34 changes
  - sg_stpg: --offline option: fix 'Invalid state 0xe'
  - sg_ses: Door Lock element changed to Door element and
    abbreviation changed from 'dl' to 'do' (ses3r05)
  - archive/rescan-scsi-bus.sh: upgrade to version 1.53hr
    - move rescan-scsi-bus.sh to scripts directory
  - sync to sbc3r34
  - sg_lib: sg_ll_verify10+16 expand BYTCHK to 2 bit field
  - sg_pt_win32, sg_scan(win32): changes for cygwin 1.7.17
  - clean up man page summary lines

Changelog for sg3_utils-1.34 [20121013] [svn: r461]
...

Doug Gilbert

Re: Zero Copy IO

2001-04-08 Thread Douglas Gilbert

"Alex Q Chen" <[EMAIL PROTECTED]> wrote:

> I am trying to find a way to pin down user space 
> memory from kernel, so that these user space buffer 
> can be used for direct IO transfer or otherwise 
> known as "zero copying IO".  Searching through the 
> Internet and reading comments on various news groups, 
> it would appear that most developers including Linus 
> himself doesn't believe in the benefit of "zero 
> copying IO".  Most of the discussion however was based 
> on network card drivers.  For certain other drivers 
> such as SCSI Tape driver, which need to handle great 
> deal of data transfer, it would seemed still be more
> advantageous to enable zero copy IO than copy_from_user() 
> and copy_to_user() all the data.  Other OS such as AIX 
> and OS2 have kernel functions that can be used to 
> accomplish such a task.  Has any ground work been done 
> in Linux 2.4 to enable "zero copying IO"?

Alex,
The kiobufs mechanism in the 2.4 series is the appropriate
tool for avoiding copy_from_user() and copy_to_user().
The definitive driver is in drivers/char/raw.c which
does synchronous IO to block devices such as disks
(but is probably not appropriate for tapes).

The SCSI generic (sg) driver supports direct IO. The driver
in lk 2.4.3 has the direct IO code commented out while
a version that I'm currently testing (sg 3.1.18 at
www.torque.net/sg) has its direct IO code activated. I have
a web page comparing throughput times and CPU utilizations
at http://www.torque.net/sg/rbuf_tbl.html . My testing
indicates that the kiobufs mechanism is now working
quite well. For various reasons I still think that it 
is best to default to indirect IO and let speed hungry
users enable dio (which is done in sg via procfs). Even 
when the user selects direct IO it should be possible to
fall back to indirect IO. [Sg does this when a SCSI
adapter can't support direct IO (e.g. an ISA adapter).]
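
A minimal sketch of how a user space program might request direct
IO through the sg version 3 interface and then check whether the
driver honoured the request or fell back to indirect IO. It assumes
the sg_io_hdr structure and the SG_FLAG_DIRECT_IO / SG_INFO_* values
from <scsi/sg.h>; the helper names prep_inquiry_dio() and
used_direct_io() are made up for illustration, and the INQUIRY
command is only a convenient example.

```c
#include <assert.h>
#include <string.h>
#include <scsi/sg.h>

/* Fill in an sg v3 header for a 6 byte INQUIRY that asks for direct IO.
 * The caller owns cdb, buf and sense; nothing is sent to a device here. */
static void prep_inquiry_dio(sg_io_hdr_t *hdr, unsigned char cdb[6],
                             unsigned char *buf, unsigned int buf_len,
                             unsigned char *sense, unsigned char sense_len)
{
    assert(hdr != NULL && buf_len <= 255);  /* one byte allocation length */

    memset(cdb, 0, 6);
    cdb[0] = 0x12;                          /* INQUIRY opcode */
    cdb[4] = (unsigned char)buf_len;        /* allocation length */

    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id = 'S';
    hdr->dxfer_direction = SG_DXFER_FROM_DEV;
    hdr->cmd_len = 6;
    hdr->cmdp = cdb;
    hdr->dxferp = buf;
    hdr->dxfer_len = buf_len;
    hdr->sbp = sense;
    hdr->mx_sb_len = sense_len;
    hdr->timeout = 20000;                   /* milliseconds */
    hdr->flags = SG_FLAG_DIRECT_IO;         /* a request, not a guarantee */
}

/* Call after ioctl(fd, SG_IO, hdr) has returned: non-zero means the
 * transfer really was done with direct IO. */
static int used_direct_io(const sg_io_hdr_t *hdr)
{
    return (hdr->info & SG_INFO_DIRECT_IO_MASK) == SG_INFO_DIRECT_IO;
}
```

A caller would pass the prepared header to ioctl(fd, SG_IO, &hdr) on
an open sg device and treat a zero result from used_direct_io() as
the fall back to indirect IO described above.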

Since the SCSI tape (st) driver is structurally similar 
to sg, it should be possible to add direct IO support 
to st.

One thing to note is that when you let the user provide
the buffer for direct IO (e.g. with malloc) then on
the i386 it won't be contiguous from a bus address POV.
This means large scatter gather lists (typically with
each element 4 KB on i386) which can be time consuming
to load on some SCSI adapters. One way around this would
be for a driver to provide "malloc/free" like ioctls.

Doug Gilbert

Re: Problem with 2.4.1/2.4.3 and CD-RW ide-scsi drive

2001-04-12 Thread Douglas Gilbert

Tim Meushaw <[EMAIL PROTECTED]> wrote:
> I've got an update for this problem I emailled about 
> last night (and for which I only received one reply :-) ).
>
> Strangely enough, I'm able to actually burn a CD 
> using the cd-rw described below, and can verify 
> data written to it (using X-CD-Roast). I still can't 
> actually mount a cd in the drive without getting the 
> error described below, but at least I can burn CDs now.
> 
> Does this behavior sound like a kernel problem, or 
> does it suggest a bug in the 'mount' utility?

Tim,
At the risk of Jens jumping on this post, I think
there was some problem mounting cdroms that is
fixed in the "ac" series, the latest of which is
2.4.3-ac5 . Perhaps you could try it and report
back.

The fact that you can write a cd (which does not
involve the sr driver) means that the rest of the SCSI
subsystem and the ide-scsi driver seem to be working
ok. I doubt that this is a problem with the mount
command.

Doug Gilbert

Re: [RFC][PATCH] adding PCI bus information to SCSI layer

2001-04-13 Thread Douglas Gilbert


Matt Domsch <[EMAIL PROTECTED]> wrote:
> I'm working on an IA-64 user-space application to add a Linux entry to
> the IA-64 boot manager.  To do so, I've got to uniquely identify a
> disk by it's controller PCI address, SCSI channel,
> ID, and LUN.  Essentially, I need to tie /dev/sda to an EFI device.  An
> equivalent problem (with similar solution) exists with i386 where the
> BIOS boot order is not necessarily the Linux driver load order.
> 
> 
> BIOS Enhanced Disk Drive Services 3.0 provides a way to query BIOS for
> what it thinks is it's device location and order.  IA-64 implements
> EDD 3.0, and some i386 BIOS manufactures are adding this feature
> also.  EDD 3.0 information is available at http://www.t13.org.
> 
> What I'd like to do is add the PCI location of the SCSI controller to
> the information printed in /proc/scsi/scsi, as follows:
> 
> Attached devices:
> Host: scsi0 Channel: 00 Id: 05 Lun: 00 PCI bus: 1 slot: 6 fn: 0
>   Vendor: NEC  Model: CD-ROM DRIVE:466 Rev: 1.06
>   Type:   CD-ROM   ANSI SCSI revision: 02

[snip]

Matt,
SANE (and probably some other applications) parses the
output of 'cat /proc/scsi/scsi' so any change to its
format may trip SANE up. How about another entry in
the /proc/scsi directory that has a more parsable format
(e.g. xml :-) ).
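
To illustrate why a format change matters to parsers: the sketch
below matches a "Host:" line by its leading fields and ignores
anything that follows, so it would survive extra text at the end of
the line, whereas a parser that expects the line to end after the
Lun field would not. This is not SANE's code; parse_host_line() and
struct scsi_addr are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

struct scsi_addr { int host, channel, id, lun; };

/* Parse one "Host:" line of /proc/scsi/scsi output.
 * Returns 1 when all four fields were found, 0 otherwise.
 * Any text after the Lun value is ignored. */
static int parse_host_line(const char *line, struct scsi_addr *a)
{
    assert(line != NULL && a != NULL);
    return sscanf(line, "Host: scsi%d Channel: %d Id: %d Lun: %d",
                  &a->host, &a->channel, &a->id, &a->lun) == 4;
}
```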

Also ISA adapters are not the only non-PCI adapters,
there are the growing band of pseudo adapters that
may or may not have a PCI bus at the bottom of some
other protocol stack.

Doug Gilbert

Re: [RFC][PATCH] adding PCI bus information to SCSI layer

2001-04-14 Thread Douglas Gilbert

Alan Cox wrote:
> 
> > Also ISA adapters are not the only non-PCI adapters,
> > there are the growing band of pseudo adapters that
> > may or may not have a PCI bus at the bottom of some
> > other protocol stack.
> 
> An ioctl might be better. We already have an ioctl for querying the lun
> information for a disk. We could also return the bus information for its
> controller(s) [remember multipathing]

Both 'cat /proc/scsi/scsi' and ioctls used on
fds belonging to the existing upper level drivers
(e.g. sd and sr) have a problem as far as getting
HBA environment information: there needs to be at
least one SCSI device (target) connected to the
HBA. With no SCSI devices connected, there is no 
fd to do an ioctl on. [The same problem arises
if a device is there but marked offline, has an
exclusive lock on it, ...]

Perhaps Matt could look at the approach I have taken
with the scsimon experimental upper level driver.
Scsimon was originally designed to get scsi based
information to the /sbin/hotplug mechanism. It also
supplies ioctls to probe HBAs as well as SCSI devices.
More information about it can be found at:
  http://www.torque.net/scsi/scsimon.html

It should not be difficult to add HBA PCI bus information
to scsimon (after the Scsi_Host structure is expanded to
hold that information).

Doug Gilbert

MO drives (2048 byte block vfat fs) in lk 2.4

2001-04-22 Thread Douglas Gilbert


The "MO" bug (also 2048 byte block vfat problem) has been
reported several times in the lk 2.4 series. Since the
finger was being pointed at the SCSI subsystem I decided
to investigate. As far as I can see the sd driver offers
the same physical block (other than 512 byte) capabilities
in lk 2.4 as it did in lk 2.2 .

One error report stated that a MO drive with a vfat
fs based on 2048 byte sectors can be mounted and read
but any significant write causes a system lockup. I
have been able to replicate this behaviour. Luckily
Alt-SysRq-P did work. Pressing this sequence multiple
times gave similar addresses. Rebooting the machine
and rerunning the experiment multiple times gave 
addresses in the same area.

The EIP resolved most often to cont_prepare_write() in
fs/buffer.c. A disassembly suggests line 1802 in buffer.c
[2.4.3ac11]. That is around a memset() between
__block_prepare_write() and __block_commit_write() calls
within the while loop. Most other addresses were within
the same while loop. Perhaps someone with expertise
in this area may like to examine that loop.


Details: I modified the "scsi_debug" adapter driver to look
like it had one 2048 byte block MO drive connected to it.
The driver uses 8 MB of RAM to simulate a storage device.
[For anyone who wants to run similar experiments, I have
placed the driver at www.torque.net/sg/p/scsi_debug_mo.tgz ].
The sequence of commands that led up to the failure was:
 $ modprobe scsi_debug
 $ cat /proc/scsi/scsi        # "optical" device should be there
 $ fdisk -ul /dev/sdb         # should see 3 partitions
 $ mkdosfs -S 2048 /dev/sdb3
 $ mount /dev/sdb3 /mnt/extra
 $ cd /mnt/extra
 $ touch t                    # worked ok
 $ cp /boot/vml-2.2.18 u      # system locks up

Doug Gilbert

Re: [RFC][PATCH] adding PCI bus information to SCSI layer

2001-04-23 Thread Douglas Gilbert

[EMAIL PROTECTED] wrote:
> 
> [snip]
> 
> Doug suggested looking at extending scsimon.  This is a fine idea, and I've
> made proposed changes available at http://domsch.com/linux/scsi/.  (Doug may
> want to clean this up).  However, this, like my earlier changes to
> /proc/scsi/scsi, doesn't actually show the relationship between /dev/sda and
> a particular PCI controller and SCSI channel,bus,lun tuple.

Changes look ok. One suggestion: if SCSI_PCI_INFO (or some
such) is #defined in drivers/scsi/scsi.h as part of
the patch then code like Matt is suggesting can be safely 
added, protected by "#ifdef SCSI_PCI_INFO ... #endif" blocks. 
I have used this technique in sg to support the scsi 
"reset+reservation" patch which still hasn't made it into 
the mid level (but is available in many distros).
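
The guard technique can be sketched as follows. Everything here is
hypothetical: struct host_info stands in for the real mid level
host structure, and the pci_* members are invented for the example.
The point is only that code referring to the new fields compiles
away cleanly when the header does not define SCSI_PCI_INFO.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for a header that may or may not carry the patch.
 * A patched scsi.h would contain: #define SCSI_PCI_INFO 1 */
struct host_info {
    int host_no;
#ifdef SCSI_PCI_INFO
    int pci_bus, pci_slot, pci_fn;      /* only present with the patch */
#endif
};

/* Format one host description into buf; returns the length that
 * snprintf() reports. PCI details appear only on patched builds. */
static int format_host(const struct host_info *h, char *buf, size_t len)
{
    assert(h != NULL && buf != NULL);
#ifdef SCSI_PCI_INFO
    return snprintf(buf, len, "scsi%d PCI bus: %d slot: %d fn: %d",
                    h->host_no, h->pci_bus, h->pci_slot, h->pci_fn);
#else
    return snprintf(buf, len, "scsi%d", h->host_no);
#endif
}
```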

The scsimon driver is just a window through to the information
held in the mid level structures. The information printed by
'cat /proc/scsi/scsi' also comes from the mid level. The scsi 
minor device numbers (e.g. /dev/sda) are allocated by each upper 
level driver  (e.g. sd_attach() in the case of sd) and are held 
in upper level driver data structures. Hence they are not 
visible to the mid-level or to other upper level drivers.

As an example of the latter point, using st and sg on the same 
tape device at the same time will most likely confuse st 
(since it maintains a state machine). However there is no 
simple way for the sg or st drivers to detect (or supply
information flagging) this conflict.

Doug Gilbert


Re: Manual SCSI bus reset?

2001-02-15 Thread Douglas Gilbert


German Gomez Garcia wrote:

> I've got Plexwriter 12x10x32S attached to an onbard AIC7890
> (besides other things as three IBM UWSCSI harddisks, an SCSI ZIP and a
> Pioneer DVD) and sometimes when recording a CD the Plexwriter fails at the
> very end of the process (although the CD is recorded correctly) and it is
> locked with no posibility to eject it (it seems that a failure while
> reading from the DVD during on-the-fly recording is the cause). 
>
> But if I reset the SCSI bus manually, that is trying to read from
> a "reset-it CD", that is completely broken and makes the SCSI bus resets
> itself, I can eject the CD from the Plexwriter. So I would like to know if
> there is a way to do it without that trick. I've downloaded some utilities
> for the SCSI generic driver, one of them should let you reset the bus (or
> even just a single device) but it fails with "SCSI_RESET" not supported
> and after reading through the docs it seems that the kernel (or should I
> say the SCSI drivers) doesn't support this kind of reset.
>
> I would like to know if this is "kernel politics", "faulty
> hardware", or just lazy programmers ;-), thanks and please CC the answer
> to me as I'm not subscribed to this mailing list.

Various distributions (e.g. RH 7.0) contain the SCSI 
mid-level patch that will permit the sg_reset utility
to perform a scsi bus reset. [The same patch makes the 
scsi subsystem respect device reservations.]

The patch originates from James Bottomley (see the linux-scsi
list archive) and he has submitted a new version for the lk 2.4
tree recently. When first submitted, objections were raised to 
the concept of allowing users to do scsi bus resets (see same 
list archive). 

There is a chance that the required patch will go into the main 
kernel tree soon.

Doug Gilbert

Re: Flushing buffer and page cache

2001-02-17 Thread Douglas Gilbert

James Bottomley wrote:

> > Is it possible to flush all entries in the buffer cache corresponding
> > to a single block device (i.e. simply drop them if they aren't dirty,
> > or write them to disk and drop them after this if they are dirty)?
> 
> Yes, just send the BLKFLSBUF ioctl to the device this syncs the device then 
> removes all the buffers from the cache.  We use it as a tool to move a SAN 
> device around a cluster, which is similar to what you want to do.

Last time this question was raised, someone mentioned
a little utility called flushb . Here is its source:

/*
 * flushb.c --- This routine flushes the disk buffers for a disk
 */

/*
 * modified August 2000 by Juri Haberland
 * [EMAIL PROTECTED]
 */

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/fs.h>           /* BLKFLSBUF, BLKRRPART */

#define NOARGS void

const char *progname;

static void usage(NOARGS)
{
        fprintf(stderr, "Usage: %s disk\n", progname);
        exit(1);
}

int main(int argc, char **argv)
{
        int fd;

        progname = argv[0];
        if (argc != 2)
                usage();

        fd = open(argv[1], O_RDONLY, 0);
        if (fd < 0) {
                perror("open");
                exit(1);
        }
        /*
         * Note: to reread the partition table, use the ioctl
         * BLKRRPART instead of BLKFLSBUF.
         */
        if (ioctl(fd, BLKFLSBUF, 0) < 0) {
                perror("ioctl BLKFLSBUF");
                exit(1);
        }
        return 0;
}

Re: downloading drive firmware to a fibre channel drive through linux

2001-02-21 Thread Douglas Gilbert


[EMAIL PROTECTED] wrote:
> Is it possible to download a drive firmware to a fibre channel
> drive (or even a scsi drive) through linux ? I know that on
> NT (or 98) they use WNASPI and a utility provided by the 
> drive manufacturer to download the firmware. I was wondering
> if this is possible through linux scsi interface. For FC,
> I have Qlogic Fibre Channel card.

Yes. It is one of the advertised features of the Scsi Command
Utility (scu). There is a Linux port. See:
http://www.bit-net.com/~rmiller/scu.html

Doug Gilbert

Re: Writing on raw device with software RAID 0 is slow

2001-03-01 Thread Douglas Gilbert

Ben LaHaise wrote:
> On Thu, 1 Mar 2001, Stephen C. Tweedie wrote:
> 
> > Yep.  There shouldn't be any problem increasing the 64KB size, it's
> > only the lack of accounting for the pinned memory which stopped me
> > increasing it by default.
> 
> Actually, how about making it a sysctl?  That's probably the most
> reasonable approach for now since the optimal size depends on hardware.

Something else may slow down raw IO. A buffer
that looks contiguous in the user space typically looks
quite splintered from the kernel's perspective. This
means that a buffer of 64 KB in the user space ends
up being a scatter gather list of 16 elements (assuming
PAGE_SIZE of 4KB) en route to the IDE or SCSI subsystem.

Now one SCSI adapter that I have examined must push each
scatter gather element through its firmware to the DMA 
engine which can only hold one element at a time. 
That takes time.
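
The element count above can be worked out with a few lines of C.
This is a sketch of the worst case, assuming no two pages of the
user buffer happen to be physically adjacent; sg_elements() is a
hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of page-sized scatter gather elements needed to describe
 * a user buffer of len bytes starting at virtual address addr,
 * when every page it touches becomes a separate element. */
static size_t sg_elements(uintptr_t addr, size_t len, size_t page_size)
{
    assert(page_size > 0);
    if (len == 0)
        return 0;
    uintptr_t first = addr / page_size;
    uintptr_t last = (addr + len - 1) / page_size;
    return (size_t)(last - first + 1);
}
```

A page aligned 64 KB buffer with 4 KB pages gives 16 elements; the
same buffer starting part way into a page touches one more page and
so needs 17.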

Doug Gilbert

Re: scsi vs ide performance on fsync's

2001-03-04 Thread Douglas Gilbert


There is definitely something strange going on here.
As the bonnie test below shows, the SCSI disk used
for my tests should vastly outperform the old IDE one:

               -------Sequential Output-------- ---Sequential Input-- --Random--
Seagate        -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
ST318451LW  MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
SCSI       200 21544 96.8 51367 51.4 11141 16.3 17729 58.2 40968 40.4 602.9  5.4

Quantum        -------Sequential Output-------- ---Sequential Input-- --Random--
Fireball       -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
ST3.2A      MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
IDE        200  3884 72.8  4513 86.0  1781 36.4  3144 89.9  4052 95.3 131.5  0.9

I used a program based on Mike Black's "Blah Blah" test
(shown below) in which 200 write()+fdatasync()s are 
performed. Each write() outputs either 20 or 4096 bytes.

On my Celeron 533 MHz, 128 MB RAM hardware with an ext2 fs,
the "block" size that is seen by the sd driver for each 
fdatasync() is 4096 bytes. lk 2.4.2 is being used. The 
fs/buffer.c __wait_on_buffer() routine waits for IO 
completion in response to fdatasync(). Timings have been 
done with Andrew Morton's timepegs (units are microseconds). 
Here are the IDE results:

IDE 20*200     Destination  Count        Min        Max    Average       Total
enter __wait_on_buffer:0 ->
  leave __wait_on_buffer:0    200   1,037.23   6,487.72   1,252.19  250,439.80
leave __wait_on_buffer:0 ->
  enter __wait_on_buffer:0    199       7.32      21.05       7.82    1,557.05

IDE 4096*200   Destination  Count        Min        Max    Average       Total
enter __wait_on_buffer:0 ->
  leave __wait_on_buffer:0    200   1,037.06   7,354.21   1,243.78  248,756.64
leave __wait_on_buffer:0 ->
  enter __wait_on_buffer:0    199      23.01      67.32      37.03    7,370.51


So the size of each transfer doesn't matter to this IDE
disk. Now the same test for the SCSI disk:

SCSI(20*200)   Destination  Count        Min        Max    Average       Total
enter __wait_on_buffer:0 ->
  enter sd_init_command:0     200       1.86      13.27       2.05      411.48
enter sd_init_command:0 ->
  enter rw_intr:0             200     320.87   5,398.56   3,417.30  683,461.25
enter rw_intr:0 ->
  leave __wait_on_buffer:0    200       4.04      15.81       4.42      885.73
leave __wait_on_buffer:0 ->
  enter __wait_on_buffer:0    199       8.78      14.39       9.26    1,844.23

SCSI(4096*200) Destination  Count        Min        Max    Average       Total
enter __wait_on_buffer:0 ->
  enter sd_init_command:0     200       1.97      13.20       2.21      443.52
enter sd_init_command:0 ->
  enter rw_intr:0             200     109.53  13,997.50   1,327.47  265,495.87
enter rw_intr:0 ->
  leave __wait_on_buffer:0    200       4.37      22.50       4.75      951.44
leave __wait_on_buffer:0 ->
  enter __wait_on_buffer:0    199      22.40      42.20      24.27    4,831.34

The extra timepegs inside the SCSI subsystem show that 
the IO transaction to that disk really did take that 
long. [Initially I suspected a "plugging" type
elevator bug, but that isn't supported by the above
and various other timepegs not shown.]
Since there is a wait on completion for every write,
tagged queuing should not be involved.

So writing more data to the SCSI disk speeds it up!
I suspect the critical point in the "20*200" test is
that the same sequence of 8 512 byte sectors are being 
written to disk 200 times. BTW That disk spins at
15K rpm so one rotation takes 4 ms and it has a
4 MB cache.

Even though the SCSI disk's "cache" mode page indicates
that the write cache is on, it would seem that writing 
the same sectors continually causes flushes to the medium 
(and hence the associated delay). Here is scu's output 
of the "cache" mode page:

$ scu -f /dev/sda show page cache
Cache Control Parameters (Page 0x8 - Current Values):

Mode Parameter Header:

                  Mode Data Length: 31
                       Medium Type: 0 (Default Medium Type)
         Device Specific Parameter: 0x10 (Supports DPO & FUA bits)
           Block Descriptor Length: 8

Mode Parameter Block Descriptor:

                      Density Code: 0x2
          Number of Logical Blocks: 2289239 (1117.792 megabytes)
              Logical Block Length: 512

Page Header / Data:
                         Page Code: 0x8
                Parameters Savable: Yes
                       Page Length: 18
          Read Cache Disable (RCD): No
        Multiplication Factor (MF): Off
          Write Cache Enable (WCE): Yes
  Cache Segment Size Enable (SIZE): Off
              Discontinuity (DISC): On
  Caching Analysis Permitted (CAP): Disabled
            Abort Pre-Fetch (ABPF): Off
     Initiator Control Enable (IC): Off
          Write Retention Priority: 0 (Not distiguished)
    Demand Read Retention Priority: 0 (Not distiguished)
  Disable

Re: scsi vs ide performance on fsync's

2001-03-05 Thread Douglas Gilbert

Since the intention of fsync and fdatasync seems to be
to write dirty fs buffers to persistent storage (i.e.
the "oxide") then the best time is not necessarily
the objective. Given the IDE times that people have 
been reporting, it is very unlikely that any of those
IDE disks were really doing 2000 discrete IO operations
involving waiting for those buffers to be written
to the "oxide". [Reason: it should take at least 2000 
revolutions of the disk to do it, since most of the
4KB writes are going to the same disk address as the
prior write.]
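
The physical lower bound in that argument is simple arithmetic. A
sketch, assuming one full revolution must pass before the same
disk address can be written again; the function names are made up
for the example.

```c
#include <assert.h>

/* Time for one platter revolution, in milliseconds. */
static double rotation_ms(double rpm)
{
    assert(rpm > 0.0);
    return 60000.0 / rpm;
}

/* Lower bound, in seconds, for rewriting the same disk address
 * `writes` times when each write must reach the medium. */
static double min_rewrite_seconds(int writes, double rpm)
{
    return writes * rotation_ms(rpm) / 1000.0;
}
```

For 2000 writes this gives roughly 16.7 seconds at 7200 rpm and
roughly 33.3 seconds at 3600 rpm, so a run that finishes much
faster than that cannot have reached the oxide on every write.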

As it stands, the Linux SCSI subsystem has no mechanism 
to force a disk cache write through. The SCSI WRITE(10)
command has a Force Unit Access bit (FUA) to do exactly
that, but we don't use it. Do the fs/block layers flag
that they want buffers written to the oxide?
The measurements that showed SCSI disks were taking a lot 
longer with the "xlog" test were more luck than good 
management.
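
For illustration, here is how a WRITE(10) command block with the
FUA bit set could be built. This is a sketch of the command layout
only (opcode 0x2a, FUA as bit 3 of byte 1, big endian LBA and
transfer length); build_write10() is a hypothetical helper and is
not part of the sd driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define OP_WRITE_10   0x2a
#define WRITE10_FUA   0x08      /* byte 1, bit 3: Force Unit Access */

/* Build a 10 byte WRITE(10) cdb for nblocks logical blocks starting
 * at lba. When fua is non-zero the FUA bit asks the device to put
 * the data on the medium before reporting completion. */
static void build_write10(unsigned char cdb[10], uint32_t lba,
                          uint16_t nblocks, int fua)
{
    assert(cdb != NULL);
    memset(cdb, 0, 10);
    cdb[0] = OP_WRITE_10;
    if (fua)
        cdb[1] |= WRITE10_FUA;
    cdb[2] = (unsigned char)(lba >> 24);    /* LBA, most significant first */
    cdb[3] = (unsigned char)(lba >> 16);
    cdb[4] = (unsigned char)(lba >> 8);
    cdb[5] = (unsigned char)lba;
    cdb[7] = (unsigned char)(nblocks >> 8); /* transfer length in blocks */
    cdb[8] = (unsigned char)nblocks;
}
```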

Here are some tests which show that the IDE versus SCSI "xlog"
comparison is very similar between FreeBSD 4.2 and
lk 2.4.2 on the same hardware: 

# IBM DCHS04U SCSI disk 7200 rpm  <>
[root@free /var]# time /root/xlog tst.txt
real    0m0.043s
[root@free /var]# time /root/xlog tst.txt fsync
real    0m33.131s

# Quantum Fireball ST3.2A IDE disk 3600 rpm  <>
[root@free dos]# time /root/xlog tst.txt
real    0m0.034s
[root@free dos]# time /root/xlog tst.txt fsync
real    0m5.737s

# IBM DCHS04U SCSI disk 7200 rpm  <>
[root@tvilling extra]# time /root/xlog tst.txt
0:00.00elapsed 125%CPU
[root@tvilling spare]# time /root/xlog tst.txt fsync
0:33.15elapsed 0%CPU

# Quantum Fireball ST3.2A IDE disk 3600 rpm  <>
[root@tvilling /root]# time /root/xlog tst.txt
0:00.02elapsed 43%CPU
[root@tvilling /root]# time /root/xlog tst.txt fsync
0:05.99elapsed 69%CPU

Notes: FreeBSD doesn't have fdatasync() so I changed xlog 
to use fsync(). Linux timings were the same with fsync() 
and fdatasync(). The xlog program crashed immediately in
FreeBSD; it needed some sanity checks on its arguments.

One further note: I wrote:
> [snip] 
> So writing more data to the SCSI disk speeds it up!
> I suspect the critical point in the "20*200" test is
> that the same sequence of 8 512 byte sectors are being
> written to disk 200 times. BTW That disk spins at
> 15K rpm so one rotation takes 4 ms and it has a
> 4 MB cache.

A clarification: by "same sequence" I meant written
to the same disk address. If the 4 KB lies on the same
track, then a delay of one disk revolution would be
expected before you could write the next 4 KB to the 
"oxide" at the same address.

Doug Gilbert

Re: scsi vs ide performance on fsync's

2001-03-05 Thread Douglas Gilbert

Linus Torvalds wrote:

> Well, it's entirely possible that the mid-level SCSI layer is doing
> something horribly stupid.

Well it's in good company as FreeBSD 4.2 on the same hardware
returns the same result (including IDE timings that were too
fast). My timepeg analysis showed that the SCSI disk was consuming
the time, not any of the SCSI layers.

> On the other hand, it's also entirely possible that IDE is just a lot
> better than what the SCSI-bigots tend to claim. It's not all that
> surprising, considering that the PC industry has pushed untold billions of
> dollars into improving IDE, with SCSI as nary a consideration. The above
> may just simply be the Truth, with a capital T.

What exactly do you think fsync() and fdatasync() should
do? If they need to wait for dirty buffers to get flushed
to the disk oxide then multiple reported IDE results to
this thread are defying physics.

Doug Gilbert

Re: Problems with devfs (?)

2001-01-05 Thread Douglas Gilbert

Raphael wrote:

> I'm using ROCK Linux, which is built with devfs, originally Kernel
> 2.4.0-test9. This problem occurs, when I want to boot some Kernel after
> 2.4.0-test9, whereas building and installing the Kernel never is a problem.
>
> I enabled devfs support as well as the mounting of devfs at bootup in the
> configuration, just as it is with my "default"Kernel, next, I played around
> with lilo.conf (under normal circumstances a make bzlilo without any
> playing-around should do it, shouldn't it?)
> Then, as this also did show no success I tried passing root=/.
> devfs=mount (<= don't nail me on this one, but I'm sure I did it the right
> way) at the LILO boot prompt.
> Whole story. Point. That's all. When trying to mount the root he hangs with
> "Kernel Panic: I have no root and I want to scream." Poor Kernel.
> Hope this helps anyone except me in any way (or perhaps I'm just too
> stupid).

I've been using kernels with devfs right through the test
series and now with lk 2.4.0 . The only hiccup was
when I upgraded to glibc 2.2 [RH 7.0 upgrade]. devfsd
seg faulted during bootup and I got a similar message
(because the kernel couldn't find /dev/sda3).

Don't know why but this line in /etc/devfsd.conf caused
devfsd to barf:
LOOKUP  ^cdrom$  CFUNCTION  GLOBAL
   symlink ${mntpnt}/cdroms/cdrom0 $devpath

This line is almost straight out of Richard's doco.

You could try and identify your disk to lilo explicitly.
For example if it is normally at /dev/hda3 then try
something like this at the lilo prompt:
  linux root=/dev/ide/host0/bus0/target0/lun0/part3

If you can find the root partition that way, it will
probably fail later in the boot. To poke around with 
a statically linked shell try adding to the above 
lilo prompt:
init=/sbin/sash

Taking another tack, you could hope your root fs has a
normally populated /dev directory and try "devfs=nomount"
at the lilo prompt.

Doug Gilbert


Re: APIC-ERROR-Messages -

2001-01-06 Thread Douglas Gilbert


 
Alan Cox wrote:

> > as far as I understood my smp-board seem not well designed - so I get APIC 
> > error messages nearly every 1-3 seconds. These mmessages do not help me 
> > because -so I was told - it is not possible to fix the problem.
>
> They are a warning that your box isnt going to be happy long term.; Eventually
> a bad message will get through with a good checksum. There was a panic case in
> the code when messages got reset that is fixed in 2.4.0-preleease
> 
> > Is it possible to eliminate these error messages. My logfiles grow enormously 
> > and are "trashed" with these messages...
> 
> You can certainly comment the printk's out of your own tree

At a frequency of 1 every half hour or so from my BP6
motherboard (more frequent during heavy IO) I found that
message pretty annoying and commented it out.

The 'cat /proc/interrupts' last line "ERR: " gives a
running count if you are interested.
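
A small sketch of reading that count from a program, assuming the
line has the form "ERR:" followed by a decimal number. Both
function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Parse one line of the form "ERR:      123".
 * Returns 1 and stores the count on a match, 0 otherwise. */
static int parse_err_line(const char *line, unsigned long *count)
{
    assert(line != NULL && count != NULL);
    return sscanf(line, " ERR: %lu", count) == 1;
}

/* Scan a file in /proc/interrupts format for the ERR line.
 * Returns 0 and fills *count on success, -1 otherwise. */
static int read_err_count(const char *path, unsigned long *count)
{
    char line[512];
    int found = 0;
    FILE *fp;

    assert(path != NULL && count != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof(line), fp) != NULL) {
        if (parse_err_line(line, count)) {
            found = 1;
            break;
        }
    }
    fclose(fp);
    return found ? 0 : -1;
}
```

Calling read_err_count("/proc/interrupts", &n) at intervals and
comparing successive values gives the error rate without filling
the logs.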

Doug Gilbert



Re: SCSI scanner problem with all kernels since 2.3.42

2001-01-09 Thread Douglas Gilbert

Tim Waugh wrote:
> I'm having problems with using xsane to acquire a preview from an HP
> ScanJet 5P connected to an AHA-2940.  2.3.42 is the last kernel that
> works right for me.
> 
> The symptom is that the scanner starts to make scanning sounds, then
> stops, and xsane says 'Error during read: Error during device I/O'.

Tim and I have been looking at this offline. The significance
of lk 2.3.43 was the addition of a new sg driver that has an
additional interface. Recent versions of SANE use that newer
sg interface. The problem that Tim reported seemed to be caused
by timeouts ** resulting in scsi bus resets. Anyway the problem
seems to disappear with the recently released SANE 1.0.4 .
[The original report was based on SANE 1.0.3 and earlier.]

There is also a problem report with the  SnapScan 1236 <--> aha152x 
combination also based on SANE 1.0.3 . This one is looking 
like an "uninitialized errno" bug fixed in SANE 1.0.4 .

** SANE's newer sg interface shortens the per command timeout
from 10 minutes to 10 seconds. Most other OS interfaces in
SANE have a timeout value of 1 minute or more. I suspect 10 
seconds may be too short.

Doug Gilbert

Re: 2.4.0: Raw devices ?

2001-01-13 Thread Douglas Gilbert

Meino Cramer wrote:
> short question: How cabn I activate/where can I find the raw devices
> often described as /dev/raw[12]* in/with kernel linux-2.4.0.

There doesn't seem to be any config option for raw
devices in lk 2.4.0 , they are just there. However
the raw (8) utility expects them in a different place
from where Documentation/devices.txt currently says 
they are. You may have to set up these char devices:

$ ls -l /dev/rawctl
crw-r--r--    1 root     root     162,   0 Jan 13 05:12 /dev/rawctl

$ ls -l /dev/raw/*
crw-r--r--    1 root     root     162,   1 Jan 13 05:12 /dev/raw/raw1
crw-r--r--    1 root     root     162,   2 Jan 13 05:12 /dev/raw/raw2
crw-r--r--    1 root     root     162,   3 Jan 13 05:12 /dev/raw/raw3
crw-r--r--    1 root     root     162,   4 Jan 13 05:12 /dev/raw/raw4
etc.
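
If those nodes are missing, the listing suggests char major 162, with
minor 0 for the control device and minor N for rawN. A minimal sketch
in Python that only builds the matching mknod command lines (it does
not run them); the helper name raw_mknod_commands and the default
count are made up for illustration:

```python
# Sketch only: build (not run) the mknod commands for the raw char devices.
# Major 162 and the minor numbering are taken from the 'ls -l' listing;
# the helper name and the default count are illustrative assumptions.

RAW_MAJOR = 162


def raw_mknod_commands(count=4):
    """Return command strings for /dev/rawctl and /dev/raw/raw1..count."""
    cmds = ["mkdir -p /dev/raw",
            "mknod /dev/rawctl c %d 0" % RAW_MAJOR]
    for n in range(1, count + 1):
        cmds.append("mknod /dev/raw/raw%d c %d %d" % (n, RAW_MAJOR, n))
    return cmds


if __name__ == "__main__":
    for cmd in raw_mknod_commands():
        print(cmd)
```

Printing the commands first makes it easy to review them before
running anything as root.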

Recent versions of dd meet the alignment requirements
of raw devices as does lmdd (from the lmbench package).
I have done some timings of disk to disk copies using
raw devices compared to other devices. See:
http://www.torque.net/sg/fst_copy.html

> And where can I find the "raw" utility...

In both RH 6.2 and 7.0 the raw (8) utility is in the
util-linux package (RH have applied a "raw" patch for
those two lk 2.2 versions). Read 'man 8 raw' to find
out how to bind a raw device to an existing block device.
Example:
$ raw /dev/raw/raw1 /dev/sda3

Doug Gilbert


Re: Linux not adhering to BIOS Drive boot order?

2001-01-16 Thread Douglas Gilbert

Venkatesh Ramamurthy wrote:
> 
> Hi,
> I have one issue which requires fix from the linux kernel.
> Initially i put a SCSI controller and install the OS on the drive connected
> to it. After installing the OS (on sda), the customer puts another SCSI
> controller. The BIOS for the first controller has BIOS enabled and for the
> second controller does not have the BIOS enabled.
> 
> The linux loads the driver for the second controller first and assigns sda
> to it first , and the actual boot drive gets some sdX device node.
> From the lilo prompt we can override it with root=/dev/sdX to boot to the
> correct drive and controller, but for an end user using these cards, this is
> no fun.
> 
> Can the linux kernel be changed in such a way that kernel will look for the
> actual boot drive and re-order the drives so that mounting can go on in the
> right order.
> 
> we need some kind of signature being written in the drive, which the kernel
> will use for determining the boot drive and later re-order drives, if
> required.
> 
> Is someone handling this already?

Venkatesh,
It has been partially addressed in the new lk 2.4 series with
the "scsihosts" parameter. Here is a line from /etc/lilo.conf
in my system:
   append="scsihosts=imm:advansys:advansys:aha1542"

The scsi host numbers will be allocated to the HBAs in
the order shown, starting at 0. This method does not
distinguish between the two advansys controllers; luckily
swapping their positions on the PCI bus does.
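
The numbering can be modelled in a few lines. This is only a sketch
of the mapping as described here, not kernel code, and the function
name scsihosts_order is hypothetical:

```python
# Sketch only: model how a "scsihosts=" value maps driver names to host numbers.
# Numbering from 0 in the listed order follows the description in the text;
# the function name is hypothetical and this is not how the kernel parses it.


def scsihosts_order(param):
    """Map 'scsihosts=a:b:c' (or just 'a:b:c') to [(host_no, driver), ...]."""
    value = param.split("=", 1)[1] if "=" in param else param
    return list(enumerate(value.split(":")))


if __name__ == "__main__":
    for host_no, driver in scsihosts_order("scsihosts=imm:advansys:advansys:aha1542"):
        print("scsi%d -> %s" % (host_no, driver))
```

With the lilo.conf line above, the two advansys entries come out as
scsi1 and scsi2, which is why the parameter alone cannot tell the two
cards apart.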

In my experience, changing the SCSI BIOS settings only
affects which disk's boot track is accessed to find
the kernel image. It is the kernel's initialization that
detects and orders scsi controllers. This is where
"scsihosts" helps.

Doug Gilbert


Re: Linux not adhering to BIOS Drive boot order?

2001-01-17 Thread Douglas Gilbert


Michael Meissner wrote:
> 
> On Wed, Jan 17, 2001 at 12:32:05AM +0100, J . A . Magallon wrote:
> > If that is your idea of the average user... You're a system administrator,
> > you can have tons of scsi cards in your system if you want.
> >
> > You want to make things SOOO easy for a 'dummy' user, and that user will never
> > use them. The average user you are targetting says: 'daddy, buy me a PC to
> > run Quake and do my school jobs' or 'please, dear vendor, I want a PC to
> > do my housekeeping'. I have seen so many cases (A buys PC, A tries to run
> > brand new racing game that does not work, A goes shop and says: don't know
> > what's wrong with this PC, look at it and call me when MyCarRacingGame
> > works...).
> 
> I also don't want things so complex for the people who need to do complex
> things, that they give up in frustration with Linux and use something else like
> *BSD, particularly when things are changed from the previous way they were done
> in Linux.  I agree things should be simple for simple configurations, but that
> does not mean we should be throwing boat anchors and couches in the paths of
> people who have more complex hardware.
> 
> > Average users you are targetting with that automagical
> > card detection even do not know there are SCSI and IDE disks. They just
> > want a 30Gb ide disk to install linux and play. If they involve with SCSI
> > and ID numbers and multiple cards and so on they can read some docs and
> > rebuild a kernel.
> 
> Ummm, I just reread the 2.4 Changes file once again just to be sure, and it did
> not cover this issue.  So how the *$@% are people supposed to "read some docs"
> to know about this, if the docs don't mention the information.  I know people
> have been complaining about this change since at least the fall time frame.

There has been some movement on the SCSI subsystem
documentation front:

 http://www.linuxdoc.org/HOWTO/SCSI-2.4-HOWTO/

Doug Gilbert

Re: [PATCH] (new for ppa and imm) Re: [PATCH] Re: Patch to fix lockup on

2000-11-20 Thread Douglas Gilbert



John Cavan wrote:
> Tim Waugh wrote:
> > 
> > On Thu, Nov 16, 2000 at 09:50:40PM -0500, John Cavan wrote:
> > 
> > > [...] This patch unlocks, allows the lowlevel driver to do it's
> > > probes, and then relocks. It could probably be more granular in the
> > > parport_pc code, but my own home tests show it to be working fine.
> > 
> > Is that safe?

Safer than an oops/lockup :-)

> I'm not sure. I know why it causes the NMI lockup, but I'm not enough of
> an expert to sort it out. I've got a pretty good feel for the Zip
> driver, but not the parport or scsi code yet, so I don't know how safe
> it is. The new scsi error stuff does mention that drivers must
> spinunlock/spinlock if it enables interrupts.
>
> > Also, what bit of the parport code is tripping over the lock?
> > Request_module or something?
> 
> During the init phase of the parport_pc module it probes and enables the
> IRQ(s) of the parallel port, but the scsi layer has them locked.

John and Tim,
At least using imm on my SMP machine (BP6 dual celery)
I found that I had to go a bit further than John's patch.
Basically I have unlocked the whole of imm_detect().
It was necessary to unblock parport_enumerate()
but not sufficient.

Please see attachment. I don't have a ppa device to test.
Eric Y. makes some comments just before the (*detect())
call in scsi.c relating to low level driver detect routines.

Doug Gilbert

[Attachment: imm.cx2_diff]

--- linux/drivers/scsi/imm.c  Mon Nov 20 16:36:19 2000
+++ linux/drivers/scsi/imm.cx2  Mon Nov 20 22:46:10 2000
@@ -122,14 +122,17 @@
 struct Scsi_Host *hreg;
 int ports;
 int i, nhosts, try_again;
-struct parport *pb = parport_enumerate();
+struct parport *pb;
 
 printk("imm: Version %s\n", IMM_VERSION);
+spin_unlock_irq(&io_request_lock);
+pb = parport_enumerate();
 nhosts = 0;
 try_again = 0;
 
 if (!pb) {
printk("imm: parport reports no devices.\n");
+spin_lock_irq(&io_request_lock);
return 0;
 }
   retry_entry:
@@ -154,6 +157,7 @@
printk(KERN_ERR "imm%d: failed to claim parport because a "
  "pardevice is owning the port for too longtime!\n",
   i);
+spin_lock_irq(&io_request_lock);
return 0;
}
}
@@ -208,12 +212,16 @@
nhosts++;
 }
 if (nhosts == 0) {
-   if (try_again == 1)
+   if (try_again == 1) {
+spin_lock_irq(&io_request_lock);
return 0;
+   }
try_again = 1;
goto retry_entry;
-} else
+} else {
+spin_lock_irq(&io_request_lock);
return 1;   /* return number of hosts detected */
+}
 }
 
 /* This is to give the imm driver a way to modify the timings (and other




Re: 2.2.16: How to freeze the kernel

2000-11-24 Thread Douglas Gilbert


Ulrich Windl wrote:
> 
> Hello,
> 
> this is for your interest, amusement, and for "what not to do":
> 
> I managed to freeze the kernel (2.2.16 from SuSE Linux 7.0) in a way
> that I could not even switch virtual consoles. Completely silent
> everything...
> 
> It all started when Windows/95 ruined another CD-R while trying to
> write an image to the media. So I decided to try it with Linux, using
> the same CD writer.
> 
> I plugged the device to the so far unused SCSI channel and used the
> "add-single-device" method to avoid reboot, and I succeeded:
> 
> kgate kernel: scsi singledevice 0 0 4 0
> kgate kernel:   Vendor: WAITEC    Model: WT624             Rev: 7.0F
> kgate kernel:   Type:   CD-ROM                             ANSI SCSI revision: 0
> kgate kernel: Detected scsi CD-ROM sr1 at scsi0, channel 0, id 4, lun 0
> kgate kernel: (scsi0:0:4:0) Synchronous at 10.0 Mbyte/sec, offset 15.
> kgate kernel: sr1: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray
> 
> Then I used "cdrecord-1.8.1" to simulate writing at "speed=8". It
> worked so far, but there was a warning about possible problems with
> "simulated fixation", and actually several minutes nothing happened
> while the simulated fixation was expected to take place.

Evidently some media/cdwriters don't like "simulated
fixation" hence the comment from cdrecord. In your
case the warning seems well founded.

When cdrecord issues the SCSI command to fixate
(0x5b on my Yamaha) it sets a long timeout (480.5
seconds, about 8 minutes). To find out the current
state (in the lk 2.4 series) try:

$ cat /proc/scsi/sg/debug 
dev_max(currently)=9 max_active_device=3 (origin 1)
 scsi_dma_free_sectors=512 sg_pool_secs_aval=320 def_reserved_size=32768
 >>> device=sg2 scsi2 chan=0 id=6 lun=0   em=0 sg_tablesize=255 excl=0
   FD(1): timeout=480500ms bufflen=32768 (res)sgat=0 low_dma=0
   cmd_q=0 f_packid=0 k_orphan=0 closed=0
 act: id=4054 blen=0 t_o/elap=480500/9920ms sgat=0 op=0x5b

$ cat /proc/scsi/sg/debug 
dev_max(currently)=9 max_active_device=3 (origin 1)
 scsi_dma_free_sectors=512 sg_pool_secs_aval=320 def_reserved_size=32768
 >>> device=sg2 scsi2 chan=0 id=6 lun=0   em=0 sg_tablesize=255 excl=0
   FD(1): timeout=480500ms bufflen=32768 (res)sgat=0 low_dma=0
   cmd_q=0 f_packid=0 k_orphan=0 closed=0
 act: id=4054 blen=0 t_o/elap=480500/13840ms sgat=0 op=0x5b

The last line of output from each of these 2 commands shows
that SCSI command 0x5b is active with a timeout value of
480500 milliseconds. 9.92 seconds had elapsed when the first
'cat' was executed and 13.84 seconds had elapsed when
the second one was executed. In my test case
the "dummy fixate" concluded successfully. But what if
it locked up the cdwriter, as the warning hints at?
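
For anyone who wants to watch a long command from a script, here is
a rough sketch in Python that pulls the opcode, timeout and elapsed
time out of an "act:" line. The field layout is assumed from the
sample output in this message only (other sg versions may print it
differently), and parse_act_line is a made-up name:

```python
import re

# Sketch only: extract timeout and elapsed time from an "act:" line as shown
# in the /proc/scsi/sg/debug sample in this message. The layout is assumed
# from that sample; it is not a documented, stable format.

ACT_RE = re.compile(r"t_o/elap=(\d+)/(\d+)ms.*\bop=(0x[0-9a-fA-F]+)")


def parse_act_line(line):
    """Return (opcode, timeout_ms, elapsed_ms, remaining_ms) or None."""
    m = ACT_RE.search(line)
    if m is None:
        return None
    timeout_ms, elapsed_ms = int(m.group(1)), int(m.group(2))
    opcode = int(m.group(3), 16)
    return opcode, timeout_ms, elapsed_ms, timeout_ms - elapsed_ms


if __name__ == "__main__":
    sample = " act: id=4054 blen=0 t_o/elap=480500/9920ms sgat=0 op=0x5b"
    op, t_o, elap, left = parse_act_line(sample)
    print("op=0x%02x elapsed=%dms remaining=%dms" % (op, elap, left))
```

The remaining time is simply timeout minus elapsed, i.e. how long
until the mid level would give up and start error recovery.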

There is no way that I know of to cancel
a command once it is "active". The timeout
will get it (usually by the brute force
technique of resetting the SCSI bus). [I
have toyed with the idea of trying to shorten
the timeout of an active command.]
 
> At some point I hit ^C, returning to the prompt. As the device did not
> seem to be ready, I thought "remove the device and reconnect", so I did
> "remove-single-device" (possibly while a command was still "busy"). The
> remove suceeded, but a second later everything had stopped!

Things _not_ to do while there is an active
SCSI command still executing:
  - remove a module that it is using
    (e.g. sg, aic7xxx, scsi_mod).
    In most (but not all) cases rmmod will
    report the module is busy.
  - use remove-single-device
 
> Should a device with busy commands be able to be removed? I guess no...

Correct.
 
> The last message in the syslog was:
> 
> kgate kernel: scsi : aborting command due to timeout : pid 8358,
>  scsi0, channel 0, id 4, lun 0 UNKNOWN(0x5b) 00 02 00 00 00 00 00 00 00
> 
> At that point I pressed "RESET", and interestingly the builtin BIOS of
> the Adaptec 2740 (EISA) hung while trying to detect the device.

In my experience cdwriters are not always well behaved
SCSI devices and can lock up and not respond to SCSI
bus resets. This means you need to power cycle them
to get them back into a functional mode. It is also
a good reason _not_ to have SCSI cdwriters (and scanners)
on the same SCSI bus as high speed modern SCSI disks.
Luckily they tend to use different SCSI parallel bus
types.
 
> Only after powering down both, the CD writer and the machine (a HP
> Netserver LD Pro), the BIOS detected the device again. So I guess
> something badly hung...
> 
> The driver being used was
> Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.1.31/3.2.4
> 
> After that, everything worked fine.

Thanks for the report. Hopefully everything worked ok
when you did _not_ use the "dummy" option in cdrecord.

Doug Gilbert

Re: Weirdness in block device queues.

2000-09-09 Thread Douglas Gilbert

Giuliano Pochini wrote:
> 
> > > This brings me to another point.  We probably want some generic
> > > interfaces in ll_rw_blk to unplug individual queues, where you can
> > > specify either the actual queue or a kdev_t.  Some of the places, (such as
> > > __wait_on_buffer()) it might make more sense to unplug just the queue
> > > related to the buffer in question rather than just unplug everything by
> > > running tq_disk.  On single-disk systems there wouldn't be any appreciable
> > > difference, but if you had a lot of disks, it would probably work out to be
> > > more efficient. Something to think about
> >
> > request_queue_t *q = blk_get_queue(dev);
> > generic_unplug_device(q);
> 
> ?? It doesn't seem very different from the current (2.4t7)
> 
> static void generic_unplug_device(void *data){
> request_queue_t *q = (request_queue_t *) data;
> unsigned long flags;
> spin_lock_irqsave(&io_request_lock, flags);
> __generic_unplug_device(q);
> spin_unlock_irqrestore(&io_request_lock, flags);
> }
> ...but perhaps I'm blind. Sometimes it happens after 6 hours of windows :(((

Well, the stall affects the SCSI generic driver (st and
ioctls issued from the SCSI subsystem). So one problem
is the static storage class of that function (which is in
drivers/block). It is not exported either, as Jens pointed
out. Also when the SCSI mid level queuing code comes
to this point, it already has io_request_lock.

Anyway one strategy for char drivers putting things on
those queues is to set q->plugged to zero and dispatch. 
Actually, Jens is now proposing that the 
scsi_lib::scsi_request_fn() doesn't check for "plugged" 
at all. That solves my problem and saves code.

My tests show that the same "plugged" stall delays the
lilo command sometimes as well.

Doug Gilbert

Re: 2.4.0 and ZIP ppa

2000-09-13 Thread Douglas Gilbert


Peter Christy wrote:
> 
> OK, after a LOT of head scratching, I 've found the problem. At some point
> in its development, the name for the scsi sd module has changed from sd_mod
> to plain sd. Now I haven't seen this mentioned anywhere in the
> documentation, and its caught my system out totally.
> 
> A quick fix has been to put "alias sd_mod sd" in modules.conf, but then I
> have to rem it out if I use the old kernel.
> 
> Is there a better solution?

That 2.4 change was introduced about 2 months ago and it
is going to be put back to sd_mod.o because too many
tools assume that.

The author of that change (in a Makefile cleanup) said 
that module name change was an oversight. 


Doug Gilbert

Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken

2000-09-21 Thread Douglas Gilbert


Simon Kirby wrote:

> Around 2.4.0-test9-pre2 (or so, definitely in pre3) both my SCSI scanner
> and trident sound card stopped being happy.  They are still both broken
> in pre5.  On test8, both work perfectly.
> 
> On test8:
> 
> (scsi0:6:0) Synchronous Data Transfer Request was rejected
>   Vendor:           Model: Scanner           Rev: 1.70
>   Type:   Scanner                            ANSI SCSI revision: 04
> Detected scsi generic sg0 at scsi0, channel 0, id 6, lun 0, type 6
> (scsi1:0:3:0) Synchronous at 8.0 Mbyte/sec, offset 31.
>   Vendor: YAMAHA    Model: CRW4416S          Rev: 1.0e
>   Type:   CD-ROM                             ANSI SCSI revision: 02
> Detected scsi CD-ROM sr0 at scsi1, channel 0, id 3, lun 0
> scsi : detected 1 SCSI cdrom total.
> sr0: scsi3-mmc drive: 16x/16x writer cd/rw xa/form2 cdda tray
> 
> ... on test9pre5 and test9pre3:
> 
> (scsi0:6:0) Synchronous Data Transfer Request was rejected
>   Vendor:           Model: Scanner           Rev: 1.70
>   Type:   Scanner                            ANSI SCSI revision: 04
> (scsi0:0:3:0) Synchronous at 8.0 Mbyte/sec, offset 31.
>   Vendor: YAMAHA    Model: CRW4416S          Rev: 1.0e
>   Type:   CD-ROM                             ANSI SCSI revision: 02
> Detected scsi CD-ROM sr0 at scsi0, channel 0, id 3, lun 0
> sr0: scsi3-mmc drive: 16x/16x writer cd/rw xa/form2 cdda tray
> 
> ("Detected scsi generic..." line missing.)

[snipped trident problem report]

Interesting. 'cat /proc/scsi/scsi' should show the same
devices as 'cat /proc/scsi/sg/device_strs' [and 
'cat /proc/scsi/sg/devices']. If not, then the SCSI
mid-level is not calling sg_detect() [in sg.c] for
all new scsi devices detected by the mid-level.

The sg_detect() routine is silent for all devices that
are "owned" by other upper level drivers (i.e. disks,
cdroms and tapes) but outputs a line for any other
scsi type (e.g. scanners which are scsi type 6).
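
That "announce or stay silent" behaviour can be sketched roughly as
follows (Python, illustration only). The type codes are the standard
SCSI peripheral device types; the set of "owned" types is an
assumption taken from the sentence above, and the real sg_detect()
may treat a few more types the same way. The function name is made
up, and the message format is copied from Simon's test8 log:

```python
# Sketch only: mimic the "announce or stay silent" decision described in the
# text. Which types count as "owned" is an assumption (disks, cdroms, tapes);
# the real sg_detect() in sg.c may cover additional types.

TYPE_DISK, TYPE_TAPE, TYPE_ROM, TYPE_SCANNER = 0x00, 0x01, 0x05, 0x06
OWNED_BY_UPPER_LEVEL = {TYPE_DISK, TYPE_TAPE, TYPE_ROM}


def sg_detect_message(sg_index, host, channel, scsi_id, lun, dev_type):
    """Return the line sg would log for this device, or None if it is silent."""
    if dev_type in OWNED_BY_UPPER_LEVEL:
        return None
    return ("Detected scsi generic sg%d at scsi%d, channel %d, id %d, lun %d, type %d"
            % (sg_index, host, channel, scsi_id, lun, dev_type))


if __name__ == "__main__":
    print(sg_detect_message(0, 0, 0, 6, 0, TYPE_SCANNER))  # scanner: announced
    print(sg_detect_message(1, 0, 0, 3, 0, TYPE_ROM))      # cdrom: None (silent)
```

So a missing "Detected scsi generic" line for a scanner points at
sg_detect() never being called, not at sg choosing to stay quiet.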

It is not clear to me what "hacking" sg requires as
Torben Mathiasen suggested in his response. This seems
like a mid level problem. I'll check with my scsi
scanner this evening.


Other random scsi notes:
  - scsi modules were completely broken in 2.4.0-test9-pre4
but worked again in pre5 [Makefile hacks?]
  - the sd module's name has now reverted to its historic name
of "sd_mod.o"
  - the imm module (scsi over parallel port for ZIP drives)
works on a UP machine but locks up a SMP machine (until
the NMI notices)
  - the sg "stall" problem (plugged queues) has not been
addressed yet

Doug Gilbert

Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken

2000-09-21 Thread Douglas Gilbert

Torben Mathiasen wrote:
> 
> On Thu, Sep 21 2000, Douglas Gilbert wrote:
> 
> [deleted]
> 
> > It is not clear to me what "hacking" sg requires as
> > Torben Mathiasen suggested in his response. This seems
> > like a mid level problem. I'll check with my scsi
> > scanner this evening.
> >
> 
> Well first of all the sg driver needs to be updated the
> same way sd and sr was.

Well looking at sr in test9-pre5 the only changes are the 
addition of 'static' before the sr_template definition 
and various functions. Sg already has the corresponding
functions declared static and the sg_template definition
has been changed to 'static'.

So as far as I can see the mid level has failed to call
sg_detect() when it should have. Simon has now confirmed
with a printk that sg_detect() was not called for the
scanner which the mid level obviously knows about.

Doug Gilbert

Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken

2000-09-21 Thread Douglas Gilbert


Simon Kirby wrote:
> 
> On Thu, Sep 21, 2000 at 01:12:27PM -0400, Douglas Gilbert wrote:
> 
> > Interesting. 'cat /proc/scsi/scsi' should show the same
> > devices as 'cat /proc/scsi/sg/device_strs' [and
> > 'cat /proc/scsi/sg/devices']. If not, then the SCSI
> > mid-level is not calling sg_detect() [in sg.c] for
> > all new scsi devices detected by the mid-level.
> >
> > The sg_detect() routine is silent for all devices that
> > are "owned" by other upper level drivers (i.e. disks,
> > cdroms and tapes) but outputs a line for any other
> > scsi type (e.g. scanners which are scsi type 6).
> 
> I didn't fiddle with it too much, but I added a printk to sg_detect and
> verified it was not getting called at all.  I notice now, however, that I
> don't even have a /proc/scsi/sg.  Does that mean it's not getting
> initialized at all?  CONFIG_CHR_DEV_SG=y, assuming that's what needs to
> be set (config didn't change between kernel versions).

I do nearly all of my testing with sg as a module.
So this looks like (another recent) breakage.

It is beginning to look like the sg driver is not
(properly) initialized when it is built into the
kernel. Perhaps you could put a printk in
sg_init() and sg_attach() to see if they are called.

> At one point before I followed some of the debug/logging commands listed
> at the top of sg.c and got an Oops as well...

Seems as though I've got a lot of retesting to do.

Doug Gilbert

Re: [2.4.0-test9-pre5] SCSI still broken, trident/mixer still broken

2000-09-21 Thread Douglas Gilbert


Torben Mathiasen wrote:
> 
> Ok, small patch cooked up. Not tested, not compiled. Give
> it a try, and if it works please send it off to Linus.
> I really need to get some work done on a project...

Here is a very similar patch that has been tested
[with a USB zip drive using sg (builtin) to read it].
It worked and the /proc/scsi/sg directory was
properly populated.

Doug Gilbert

--- linux/drivers/scsi/sg.c Thu Sep 21 15:05:28 2000
+++ linux/drivers/scsi/sg.c3117 Thu Sep 21 15:22:08 2000
@@ -17,8 +17,8 @@
  * any later version.
  *
  */
- static char * sg_version_str = "Version: 3.1.16 (2716)";
- static int sg_version_num = 30116; /* 2 digits for each component */
+ static char * sg_version_str = "Version: 3.1.17 (2921)";
+ static int sg_version_num = 30117; /* 2 digits for each component */
 /*
  *  D. P. Gilbert ([EMAIL PROTECTED], [EMAIL PROTECTED]), notes:
  *  - scsi logging is available via SCSI_LOG_TIMEOUT macros. First
@@ -1298,18 +1298,20 @@
 }
 
 #ifdef MODULE
-
 MODULE_PARM(def_reserved_size, "i");
 MODULE_PARM_DESC(def_reserved_size, "size of buffer reserved for each fd");
+#endif /* MODULE */
 
-int init_module(void) {
+int __init init_sg(void) {
+#ifdef MODULE
 if (def_reserved_size >= 0)
sg_big_buff = def_reserved_size;
+#endif /* MODULE */
 sg_template.module = THIS_MODULE;
 return scsi_register_module(MODULE_SCSI_DEV, &sg_template);
 }
 
-void cleanup_module( void)
+void __exit exit_sg( void)
 {
 #ifdef CONFIG_PROC_FS
 sg_proc_cleanup();
@@ -1324,7 +1326,6 @@
 }
 sg_template.dev_max = 0;
 }
-#endif /* MODULE */
 
 
 #if 0
@@ -2782,3 +2783,7 @@
 return 1;
 }
 #endif  /* CONFIG_PROC_FS */
+
+
+module_init(init_sg);
+module_exit(exit_sg);

[PATCH] Fix sg in 2.4.0-test9-pre5 when builtin

2000-09-21 Thread Douglas Gilbert


Linus,
This patch has been generated in response to the thread: 
"[2.4.0-test9-pre5] SCSI still broken, trident/mixer 
still broken" on lkml today. Simon Kirby reported that
the SCSI generic (sg) wasn't working in the latest
pre-release when the driver was built into the kernel.

Torben Mathiasen wrote:
> 
> On Thu, Sep 21 2000, Douglas Gilbert wrote:
> > Torben Mathiasen wrote:
> > >
> > > Ok, small patch cooked up. Not tested, not compiled. Give
> > > it a try, and if it works please send it off to Linus.
> > > I really need to get some work done on a project...
> >
> > Here is a very similar patch that has been tested
> > [with a USB zip drive using sg (builtin) to read it].
> > It worked and the /proc/scsi/sg directory was
> > properly populated.
> >
> 
> Looks good, but you should make the init functions static.

Done. Also added conditionals to make it compile cleanly
when procfs is not present. Tested as builtin and module,
with and without procfs.

Doug Gilbert

--- linux/include/scsi/sg.h Sun Jul 16 18:38:11 2000
+++ linux/include/scsi/sg.h3117 Thu Sep 21 20:00:08 2000
@@ -11,9 +11,13 @@
 Version 2 and 3 extensions to driver:
 *   Copyright (C) 1998 - 2000 Douglas Gilbert
 
-Version: 3.1.16 (2716)
-This version is for 2.3/2.4 series kernels.
+Version: 3.1.17 (2921)
+This version is for 2.4 series kernels.
 
+Changes since 3.1.16 (2716)
+   - changes for new scsi subsystem initialization
+   - change Scsi_Cmnd usage to Scsi_Request
+   - cleanup for no procfs
 Changes since 3.1.15 (2528)
- further (scatter gather) buffer length changes
 Changes since 3.1.14 (2503)
--- linux/drivers/scsi/sg.c Wed Sep 20 22:06:26 2000
+++ linux/drivers/scsi/sg.c3117 Thu Sep 21 20:13:12 2000
@@ -17,8 +17,11 @@
  * any later version.
  *
  */
- static char * sg_version_str = "Version: 3.1.16 (2716)";
- static int sg_version_num = 30116; /* 2 digits for each component */
+#include 
+#ifdef CONFIG_PROC_FS
+ static char * sg_version_str = "Version: 3.1.17 (2921)";
+#endif
+ static int sg_version_num = 30117; /* 2 digits for each component */
 /*
  *  D. P. Gilbert ([EMAIL PROTECTED], [EMAIL PROTECTED]), notes:
  *  - scsi logging is available via SCSI_LOG_TIMEOUT macros. First
@@ -38,7 +41,6 @@
  *  # cat /proc/scsi/sg/debug
  *
  */
-#include 
 #include 
 
 #include 
@@ -235,10 +237,12 @@
 static int sg_ms_to_jif(unsigned int msecs);
 static unsigned sg_jif_to_ms(int jifs);
 static int sg_allow_access(unsigned char opcode, char dev_type);
-static int sg_last_dev(void);
 static int sg_build_dir(Sg_request * srp, Sg_fd * sfp, int dxfer_len);
 static void sg_unmap_and(Sg_scatter_hold * schp, int free_also);
 static Sg_device * sg_get_dev(int dev);
+#ifdef CONFIG_PROC_FS
+static int sg_last_dev(void);
+#endif
 
 static Sg_device ** sg_dev_arr = NULL;
 
@@ -1298,18 +1302,20 @@
 }
 
 #ifdef MODULE
-
 MODULE_PARM(def_reserved_size, "i");
 MODULE_PARM_DESC(def_reserved_size, "size of buffer reserved for each fd");
+#endif /* MODULE */
 
-int init_module(void) {
+static int __init init_sg(void) {
+#ifdef MODULE
 if (def_reserved_size >= 0)
sg_big_buff = def_reserved_size;
+#endif /* MODULE */
 sg_template.module = THIS_MODULE;
 return scsi_register_module(MODULE_SCSI_DEV, &sg_template);
 }
 
-void cleanup_module( void)
+static void __exit exit_sg( void)
 {
 #ifdef CONFIG_PROC_FS
 sg_proc_cleanup();
@@ -1324,7 +1330,6 @@
 }
 sg_template.dev_max = 0;
 }
-#endif /* MODULE */
 
 
 #if 0
@@ -1972,6 +1977,7 @@
 return resp;
 }
 
+#ifdef CONFIG_PROC_FS
 static Sg_request * sg_get_nth_request(Sg_fd * sfp, int nth)
 {
 Sg_request * resp;
@@ -1985,6 +1991,7 @@
 read_unlock_irqrestore(&sfp->rq_list_lock, iflags);
 return resp;
 }
+#endif
 
 /* always adds to end of list */
 static Sg_request * sg_add_request(Sg_fd * sfp)
@@ -2064,6 +2071,7 @@
 return res;
 }
 
+#ifdef CONFIG_PROC_FS
 static Sg_fd * sg_get_nth_sfp(Sg_device * sdp, int nth)
 {
 Sg_fd * resp;
@@ -2077,6 +2085,7 @@
 read_unlock_irqrestore(&sg_dev_arr_lock, iflags);
 return resp;
 }
+#endif
 
 static Sg_fd * sg_add_sfp(Sg_device * sdp, int dev)
 {
@@ -2410,6 +2419,7 @@
 }
 
 
+#ifdef CONFIG_PROC_FS
 static int sg_last_dev()
 {
 int k;
@@ -2421,6 +2431,7 @@
 read_unlock_irqrestore(&sg_dev_arr_lock, iflags);
 return k + 1;   /* origin 1 */
 }
+#endif
 
 static Sg_device * sg_get_dev(int dev)
 {
@@ -2782,3 +2793,7 @@
 return 1;
 }
 #endif  /* CONFIG_PROC_FS */
+
+
+module_init(init_sg);
+module_exit(exit_sg);

Re: 2.4.0-test12 unresolved symbols in ide-scsi.o

2000-12-13 Thread Douglas Gilbert


Tracy,
All scsi modules built with lk 2.4.0-test12 are broken due to
scsi_syms.o being moved in drivers/scsi/Makefile .

This patch against test12 from Bob Tracy worked for me.

Doug Gilbert


--- linux/drivers/scsi/Makefile Tue Dec 12 10:49:32 2000
+++ linux/drivers/scsi/Makefile.t12bt   Tue Dec 12 22:46:27 2000
@@ -30,7 +30,7 @@
 CFLAGS_gdth.o= # -DDEBUG_GDTH=2 -D__SERIAL__ -D__COM2__ -DGDTH_STATISTICS
 CFLAGS_seagate.o =   -DARBITRATE -DPARITY -DSEAGATE_USE_ASM
 
-obj-$(CONFIG_SCSI) += scsi_mod.o
+obj-$(CONFIG_SCSI) += scsi_mod.o scsi_syms.o
 
 obj-$(CONFIG_A4000T_SCSI)  += amiga7xx.o   53c7xx.o
 obj-$(CONFIG_A4091_SCSI)   += amiga7xx.o   53c7xx.o
@@ -122,8 +122,7 @@
 scsi_mod-objs  := scsi.o hosts.o scsi_ioctl.o constants.o \
scsicam.o scsi_proc.o scsi_error.o \
scsi_obsolete.o scsi_queue.o scsi_lib.o \
-   scsi_merge.o scsi_dma.o scsi_scan.o \
-   scsi_syms.o
+   scsi_merge.o scsi_dma.o scsi_scan.o
 
 sr_mod-objs:= sr.o sr_ioctl.o sr_vendor.o
 initio-objs:= ini9100u.o i91uscsi.o

Re: aic7xxx

2000-12-17 Thread Douglas Gilbert


Mohammad A. Haque wrote:
>
> Weird. The modules just give me unresolved symbol errors instead of the
> loop.
> 
> Mathias Wiklander wrote:
> > 
> > Sorry I've forgot that. It is 2.4.0-test12
> >

There was a SCSI Makefile bug in test12 that caused
those unresolved symbols. This patch from Bob Tracy
fixes it.

Doug Gilbert

--- linux/drivers/scsi/Makefile Tue Dec 12 10:49:32 2000
+++ linux/drivers/scsi/Makefile.t12bt   Tue Dec 12 22:46:27 2000
@@ -30,7 +30,7 @@
 CFLAGS_gdth.o= # -DDEBUG_GDTH=2 -D__SERIAL__ -D__COM2__ -DGDTH_STATISTICS
 CFLAGS_seagate.o =   -DARBITRATE -DPARITY -DSEAGATE_USE_ASM
 
-obj-$(CONFIG_SCSI) += scsi_mod.o
+obj-$(CONFIG_SCSI) += scsi_mod.o scsi_syms.o
 
 obj-$(CONFIG_A4000T_SCSI)  += amiga7xx.o   53c7xx.o
 obj-$(CONFIG_A4091_SCSI)   += amiga7xx.o   53c7xx.o
@@ -122,8 +122,7 @@
 scsi_mod-objs  := scsi.o hosts.o scsi_ioctl.o constants.o \
scsicam.o scsi_proc.o scsi_error.o \
scsi_obsolete.o scsi_queue.o scsi_lib.o \
-   scsi_merge.o scsi_dma.o scsi_scan.o \
-   scsi_syms.o
+   scsi_merge.o scsi_dma.o scsi_scan.o
 
 sr_mod-objs:= sr.o sr_ioctl.o sr_vendor.o
 initio-objs:= ini9100u.o i91uscsi.o


Re: Oops with 2.4.0-test13pre3 - swapoff

2000-12-20 Thread Douglas Gilbert

Zdenek Kabelac wrote:
> This is oops I've got when rebooting after some heavy disk activity on
> my SMP system:
> 
> Written by hand:
> 
> kernel BUG swap_state.c:78!
[snip]

Same here during a halt of a RH 6.2 based K6-2 500 MHz
UP machine running lk240t13p3. The machine had been on
for a while and had built a kernel amongst other things.

Lead up was:
$ halt
.
Sending all processes the KILL signal[OK]
Turning off swap VM: __lru_cache_del, found unknown page ?!
kernel BUG at swap_state.c:78

Doug Gilbert


Re: Laptop system clock slow after suspend to disk. (2.4.0-test9/hinote VP)

2000-12-20 Thread Douglas Gilbert

Ian Stirling <[EMAIL PROTECTED]> wrote:

> I've not noticed this on earlier kernel versions, is there something
> silly I'm missing that's making my DEC hinote VP (p100 laptop)s 
> system clock slow by a factor of five or so after resume?
> Not the CPU or cmos clock, only the system clock.
> Thoughts welcome.

I saw something like this on my thinkpad (RH6.2)
and it turned out to be connected to /etc/adjtime .
It was cured by changing the large numbers in
there to zeroes.

Could someone explain the mechanism?

Doug Gilbert

Re: SCSI Problems since upgrade from 2.2.16

2000-12-29 Thread Douglas Gilbert


"George R. Kasica" wrote:
> 
> Hello:
> 
> I'm running an HP DAT 4mm Autochanger here and since going to 2.2.17
> and 2.2.18 I'm seeing failures when it attempts to unload the tape and
> load a new one while backing up using BRU PE...utilizing the mt or mtx
> commands as follows:
> 
> mt -f $DEV rewoffl 2>&1 >/dev/null
> 
> OR
> 
> /usr/local/bin/mtx -f /dev/sg1 eject 2>&1 >/dev/null
> 
> If I use the MTX command set to "manually" change the tapes all is
> well... any thoughts on the cause or a fix? I don't think the
> hardware is broken, since it runs fine "manually" by doing
> the mtx -f /dev/sg1 next command to load the next tape
> 
> Pertinent info below:
> 
> Information about installed SCSI devices
> 
> Attached devices:
> Host: scsi0 Channel: 00 Id: 01 Lun: 00
>   Vendor: SEAGATE  Model: ST32550N         Rev: 0021
>   Type:   Direct-Access                    ANSI SCSI revision: 02
> Host: scsi0 Channel: 00 Id: 03 Lun: 00
>   Vendor: HP       Model: C1553A           Rev: NS01
>   Type:   Sequential-Access                ANSI SCSI revision: 02
> Host: scsi0 Channel: 00 Id: 03 Lun: 01
>   Vendor: HP       Model: C1553A           Rev: NS01
>   Type:   Medium Changer                   ANSI SCSI revision: 02

While I am not familiar with mtx and the process you
are having problems with, 'man 1 mtx' contains the 
following:
   The first argument, given following -f, is the SCSI
   generic device corresponding to your media changer.

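On the basis of the /proc/scsi/scsi output you have shown,
the media changer is the third device listed. Assuming sg
numbers devices in the order listed (sg0, sg1, sg2), the mtx
commands should read "mtx -f /dev/sg2 ..." rather than the
/dev/sg1 you have shown at the top.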

Given those 2 sg devices are closely coupled (just
differing by the lun) mtx probably can sort this out.
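
The numbering argument above can be sketched in Python. This is only an illustration: it assumes sg hands out /dev/sgN in the order in which devices appear in /proc/scsi/scsi, and the function name sg_names and the trimmed SAMPLE text are made up for the example.

```python
import re

# Matches the "Host:" line of each /proc/scsi/scsi entry.
HOST_LINE = re.compile(
    r"Host:\s*scsi(\d+)\s+Channel:\s*(\d+)\s+Id:\s*(\d+)\s+Lun:\s*(\d+)")


def sg_names(proc_scsi_text):
    """Map (host, channel, id, lun) to the sg node it would get,
    ASSUMING sg numbers devices in listing order."""
    mapping = {}
    for index, match in enumerate(HOST_LINE.finditer(proc_scsi_text)):
        address = tuple(int(field) for field in match.groups())
        mapping[address] = "/dev/sg%d" % index
    return mapping


# Trimmed copy of the listing quoted above (Host lines only).
SAMPLE = """Attached devices:
Host: scsi0 Channel: 00 Id: 01 Lun: 00
Host: scsi0 Channel: 00 Id: 03 Lun: 00
Host: scsi0 Channel: 00 Id: 03 Lun: 01
"""

if __name__ == "__main__":
    for address, node in sorted(sg_names(SAMPLE).items()):
        print(address, node)
```

Under that assumption the changer at id 3, lun 1 is the third entry and so maps to /dev/sg2, while the tape drive at id 3, lun 0 maps to /dev/sg1.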

Doug Gilbert

Re: devices.txt inconsistency

2001-01-02 Thread Douglas Gilbert


While on this subject, the description of raw devices
(char 162) in lk 2.4 is not consistent with current 
usage.

devices.txt contains this:
162 char        Raw block device interface
  0 = /dev/raw  Raw I/O control device
  1 = /dev/raw1 First raw I/O device
  2 = /dev/raw2 Second raw I/O device
...

but something like this would be more accurate:
162 char        Raw block device interface
  0 = /dev/rawctl   Raw I/O control device
  1 = /dev/raw/raw1 First raw I/O device
  2 = /dev/raw/raw2 Second raw I/O device
...

The raw(8) command supplied in RH 6.2 and 7.0 assumes the
latter structure. I have already alerted sct and this 
change may be coming through in one of his patches.
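
As a small illustration of the proposed layout, the mapping from minor number to device path could be written as below. The function name raw_device_path is hypothetical, and the 0 to 255 minor range is an assumption made for the sketch rather than something stated in devices.txt.

```python
RAW_MAJOR = 162  # char major for the raw block device interface


def raw_device_path(minor):
    """Return the node name for a raw device minor under the
    proposed layout (/dev/rawctl plus /dev/raw/rawN)."""
    if not 0 <= minor <= 255:  # assumed minor range for this sketch
        raise ValueError("minor out of range: %r" % (minor,))
    if minor == 0:
        return "/dev/rawctl"  # control device
    return "/dev/raw/raw%d" % minor


if __name__ == "__main__":
    for minor in range(3):
        print(RAW_MAJOR, minor, raw_device_path(minor))
```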

Doug Gilbert

Re: 2.4.0-test9 module sd.o didn't get installed

2000-10-04 Thread Douglas Gilbert

Lee Mitchell wrote:
> Compiled 2.4.0-test9 with an aic7xxx card (all scsi stuff compiled as
> modules).
> 
> make modules_install did not install the module sd.o as i found out when
> rebooting.

The scsi disk (sd) module has been named "sd_mod.o"
for some time but in July this year in the development
kernels it was accidentally renamed "sd.o" . It has now
reverted to its former name.

Doug Gilbert


Re: [PATCH] 2nd go for scsi upper layers + I2O

2000-10-06 Thread Douglas Gilbert

Torben Mathiasen wrote:

> Ok this patch should be diffed correctly. Same things apply:
>
>   apply patch
>   copy sd.c st.c sg.c sr.c sr_ioctl.c sr_vendor.c from
>   drivers/scsi to drivers/scsi/upper  
>
> The EXPORT_SYMBOL has been removed as Jeff suggested.
> 
> TLAN will hopefully follow soon.

[snipped most of patch, see lkml]

> +sg_mod.o: $(sg_mod-objs)
> +   $(LD) -r -o $@ $(sg_mod-objs)

Firstly, I just spent 2 months trying to get the sd module
name reverted to "sd_mod.o" as it is in lk 2.2 and 2.0. Now 
this patch seems to rename the sg module to "sg_mod.o"! 
Since the vast majority of distributions build sg as a module,
there could be a lot of irate SANE and cdrecord users out
there. Also devfsd and other module loading software would 
need to be changed. Hopefully this is an oversight.

Secondly, do we really need the scsi Makefile and directory
structure changed this close to the lk 2.4 release?
What does it gain us? Could changes of this dimension be
sent to Eric Youngdale or at least the linux-scsi list
rather than just sent to the linux-kernel list?

Doug Gilbert

Re: SCSI problems with v2.2.16 (as shipped with Redhat v7.0)

2000-10-09 Thread Douglas Gilbert

Graham Leggett wrote:

> > > all attempts to access the scanner, including running the xsane program,
> > > or even probing for attached scanners with "scanimage -L" cause the box
> > > to run extremely slowly. CTL-C the program accessing the scanner and the
> > > system responsiveness returns to normal.
> > 
> > What scsi controller card are you using, and is the kernel/box SMP ?
> 
> Oct  3 09:47:59 watchman kernel: sym53c416.c: Version 1.0.0
[snip]

Graham,
The sg driver used in RH 7.0 [sg version 2.1.38] breaks the
sym53c416 driver. Please try sg 2.1.39 (as found in lk 2.2.17)
which can be found at: http://www.torque.net/sg
and tell me if that fixes it.

BTW Read the note at the end of that page. Redhat have backported
the RT signal patch which changes the number of parameters to
the kill_fasync() call made by sg. It is safe to add a 3rd
argument of POLL_IN to that call; otherwise the compile breaks.

Doug Gilbert

Re: OOPS REPORT: Will someone _please_ look at this? (was Re: BUG & OOPS REPORT: /proc/scsi/ entries not properly cleaned up)

2000-10-10 Thread Douglas Gilbert

Matthew Dharm wrote:
> 
> Yet more followup with myself... I can reproduce this problem on
> 2.4.0-test10-pre1 every time.  I'm using the ide-scsi and usb-storage
> modules to trigger the bug -- loading and then unloading either one causes
> /proc/scsi to not be cleaned up properly.
> 
> As yet, nobody has indicated to me that they are looking into this problem.
> I've taken a few experimental pokes at it, and it seems that the SCSI layer
> is, in fact, calling the procfs function to remove the entries.  At least,
> I think it is -- I'm not sure if it's the right entry.
> 
> Will someone _please_ look at this?  I consider this a critical item for
> 2.4.0, and I hope others do too.

Matt,
Well I looked at it on the weekend and didn't see anything.
Unfortunately I don't have any USB devices or IDE cdroms
in the right place to replicate your configuration.

I guess the problem revolves around calling  
remove_proc_entry() with the appropriate
arguments bottom up for the subtree you wish to
delete from procfs.

One way that I have noticed that you can see
that remove_proc_entry() is working is to
place the cwd in a procfs directory belonging
to the driver you are about to rmmod. [For example:
'cd /proc/scsi/sg ; rmmod sg']. When the rmmod
is done the kernel spits out a message like:
   remove_proc_entry: scsi/sg busy, count=1

When I back out of that directory the kernel outputs:
   de_put: deferred delete of sg
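
The "bottom up" ordering can be illustrated with a user-space analogy in Python. This is not the kernel procfs API: it removes an ordinary directory tree with os.walk(topdown=False), and the scsi/sg/debug layout and the function names are invented for the example. The point is only that the parent cannot go until its children are gone.

```python
import os
import tempfile


def remove_tree_bottom_up(root):
    """Delete a directory tree leaf-first and return the order in
    which directories were removed (relative to root)."""
    removed = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            os.remove(os.path.join(dirpath, name))
        os.rmdir(dirpath)  # safe: all children are already gone
        removed.append(os.path.relpath(dirpath, root))
    return removed


def demo():
    base = tempfile.mkdtemp()
    root = os.path.join(base, "scsi")
    os.makedirs(os.path.join(root, "sg"))
    with open(os.path.join(root, "sg", "debug"), "w"):
        pass
    try:
        os.rmdir(root)  # parent first: fails, it still has children
        parent_first_failed = False
    except OSError:
        parent_first_failed = True
    order = remove_tree_bottom_up(root)
    os.rmdir(base)
    return parent_first_failed, order


if __name__ == "__main__":
    print(demo())
```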

Hope this helps.

Doug Gilbert


RH 7.0, devfs + lk 2.4

2000-10-13 Thread Douglas Gilbert


I have been fighting with RH 7.0 trying to make it
work with devfs and the lk 2.4 series. This is the
second time round the loop as I did the same with
RH 6.2 .

The /etc/securetty file no longer needs to be changed
but /etc/security/console.perms needs a different
patch to allow non-root users to start X:

--- /etc/security/console.perms_rh70   Tue Aug 22 21:19:33 2000
+++ /etc/security/console.perms Fri Oct 13 20:08:58 2000
@@ -15,7 +15,7 @@
 # man 5 console.perms
 
 # file classes -- these are regular expressions
-<console>=tty[0-9][0-9]* :[0-9]\.[0-9] :[0-9]
+<console>=tty[0-9][0-9]* vc/[0-9][0-9]* :[0-9]\.[0-9] :[0-9]
 <xconsole>=:[0-9]\.[0-9] :[0-9]
 
 # device classes -- these are shell-style globs


Hopefully this patch does not compromise security.
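
The effect of the added pattern can be checked with a short Python sketch. It assumes each space-separated entry in the class is matched as a regular expression against the whole console name, which is a guess at how the file is interpreted rather than something taken from the pam_console source. The list and function names are made up for the example.

```python
import re

# Console class before and after the patch, one regex per entry.
OLD_CONSOLE = [r"tty[0-9][0-9]*", r":[0-9]\.[0-9]", r":[0-9]"]
NEW_CONSOLE = [r"tty[0-9][0-9]*", r"vc/[0-9][0-9]*",
               r":[0-9]\.[0-9]", r":[0-9]"]


def in_class(name, patterns):
    """True if the name fully matches any pattern in the class
    (full-match anchoring is an assumption of this sketch)."""
    return any(re.fullmatch(pattern, name) for pattern in patterns)


if __name__ == "__main__":
    for name in ("tty1", "vc/1", ":0", ":0.0", "pts/3"):
        print(name, in_class(name, OLD_CONSOLE), in_class(name, NEW_CONSOLE))
```

With devfs the virtual consoles show up as vc/N, so only the patched class accepts them; names such as pts/3 are rejected by both.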

My page on devfs and scsi has been updated:
http://www.torque.net/devfs_scsi.html


Doug Gilbert

Re: failure to blank CDRWs (2.2.18pre15 smp ide-scsi hp7100i)

2000-10-16 Thread Douglas Gilbert

Mark Cooke wrote:
> On Mon, 16 Oct 2000, Andre Hedrick wrote:
>
> > Yes but there is a way to do this directly now, the question is can the
> > user-space apps change to go both ways.
> 
> Hi Andre,
>
> Is there any tool / test code that you know of to 'do this directly' -
> I'm wanting to try to avoid ide-scsi translation, and show the drive's
> still working okay for blanking.  One less variable in the soup to
> worry about then.

As far as I know, cdrecord interfaces to Linux either
via the sg or pg devices. No-one would be happier than
I if cdrecord bypassed the sg driver and spoke to the
cdrom driver directly. I know the CDROM_SEND_PACKET
ioctl() is in place for lk 2.4 but from which version
has it been functional in the lk 2.2 series?

Jens, do you know of some example code that shows the
CDROM_SEND_PACKET ioctl being exercised for non-trivial
work? Something that could be sent onto Joerg Schilling.

> Aside: Browsing through the cdrecord 10a4 source does flag a specific
> note in the mmc driver about ATIP not being supported on the 7100, but
> seems to suggest that a failure to read the ATIP data's non-fatal...

Sg has an ioctl called SG_SET_TRANSFORM which is only
relevant to the ide-scsi driver. As far as I know, no
applications use it. Still it is not clear why Mark's
system would work on a UP machine but fail on a SMP box.

Doug Gilbert

Re: aic7xxx of 2.4.2: 'cdrecord -scanbus' complains about DVD

2001-03-10 Thread Douglas Gilbert


Harald Dunkel wrote:
> When I run 'cdrecord -scanbus', then cdrecord complains about my
> DVD:
> 
> # cdrecord -scanbus
> Cdrecord 1.9 (i686-pc-linux-gnu) Copyright (C) 1995-2000 Jörg Schilling
> Linux sg driver version: 3.1.17
> Using libscg version 'schily-0.1'
> scsibus0:
> 0,0,0 0) *
> 0,1,0 1) *
> cdrecord: Warning: controller returns wrong size for CD capabilities page.
> 0,2,0 2) 'PIONEER ' 'DVD-ROM DVD-303 ' '1.09' Removable CD-ROM
> 0,3,0 3) *
> 0,4,0 4) 'PIONEER ' 'CD-ROM DR-U16S  ' '1.01' Removable CD-ROM
> 0,5,0 5) *
>
> Is this warning correct? Or is this a problem of cdrecord?

This is noise coming out of cdrecord and not an aic7xxx, sg
or linux issue. I believe Joerg Schilling is pointing out 
that the device in question is at variance with some standard 
or convention. The subject has been raised on the cdwrite list.

Doug Gilbert
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Linux 2.4.2ac20

2001-03-13 Thread Douglas Gilbert


> David Balazic wrote:
> > 
> > Nathan Walp ([EMAIL PROTECTED]) wrote :
> > 
> > > Also, sometime between ac7 and ac18 (spring break kept me from testing
> > > stuff inbetween), i assume during the new aic7xxx driver merge, the
> > > order of detection got changed, and now the ide-scsi virtual host is
> > > host0, and my 29160N is host1. Is this on purpose? It messed up a
> > > bunch of my stuff as far as /dev and such are concerned.
> > 
> > SCSI adapters are enumerated randomly(*) , relying on certain numbering
> > will get you into trouble, sooner or later.
> > There is no commonly accepted solution, AFAIK.
> > The same thing can happen to disk enumeration ( sdb becomes sdc )
> > or partition enumeration ( hda6 becomes hda5 ).
> > 
> > * - theoretically no, but practically yes ( most of the time )
> 
> SCSI adapters are given host numbers in a random order?  Even with no
> hardware changes?  Does this make less than sense to anyone else?  Every
> kernel EVER up till now has had the real scsi cards (in some particular
> order) then ide-scsi.  Have I just been lucky???

Built in scsi adapter drivers are probed in the order in
which they appear in drivers/scsi/Makefile (in the lk 2.4
series). Adapters can be assigned to host numbers using
the "scsihosts" kernel boot option (but this will not
differentiate between 2 adapters controlled by the same 
driver (e.g. 2 29160 cards)). Scsi buses are scanned for
devices in ascending order.
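
A simplified model of that numbering, written in Python, is sketched below. It is not kernel code: it assumes that names listed in "scsihosts" reserve host numbers 0, 1, ... in the order given, that only the first adapter of a named driver gets the reserved number, and that everything else takes the next free number in probe order. Empty entries in the option are not modelled, and the function name is made up.

```python
def assign_host_numbers(probe_order, scsihosts=""):
    """Return [(driver, host_number), ...] for adapters probed in
    the given order, under the simplified rules described above."""
    reserved = [name for name in scsihosts.split(":") if name]
    next_free = len(reserved)  # unreserved numbers start after these
    claimed = set()
    result = []
    for driver in probe_order:
        if driver in reserved and driver not in claimed:
            number = reserved.index(driver)
            claimed.add(driver)
        else:
            number = next_free
            next_free += 1
        result.append((driver, number))
    return result


if __name__ == "__main__":
    order = ["ide-scsi", "aic7xxx", "aic7xxx"]
    print(assign_host_numbers(order))
    print(assign_host_numbers(order, "aic7xxx"))
```

In the second call the first aic7xxx adapter gets host 0, but the second one simply takes the next free number, which is the limitation noted above for two adapters controlled by the same driver.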

If you have lots of SCSI devices then devfs is your friend.

Doug Gilbert

Re: Problems with SCSI on 2.4.X

2001-03-14 Thread Douglas Gilbert

[EMAIL PROTECTED] wrote:

> I'm having some problems using SCSI-generic (sg loaded as module) to
> access my scanner on linux 2.4 (using SANE).
>
> [snip output showing timeouts]

This is most likely caused by a bug in SANE 1.0.3 and 
1.0.4 which sets timeouts on commands to 10 seconds 
rather than 10 minutes. The SANE code detects the new 
sg driver in lk 2.4.x and mistakenly shortens the 
timeout. This has been fixed in SANE's CVS (and 
RedHat's 7.1 beta (fisher)). 

Fix for SANE 1.0.4 : in file
sane-backends-1.0.4/sanei/sanei_scsi.c change line 1893 
from:
  req->sgdata.sg3.hdr.timeout = 1;
to
  req->sgdata.sg3.hdr.timeout = 10 * 60 * 1000;
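
The arithmetic behind the replacement value, given that the timeout field of the sg version 3 header is expressed in milliseconds, can be checked with a few lines of Python. The helper name timeout_ms is made up for the illustration.

```python
MS_PER_SECOND = 1000


def timeout_ms(minutes=0, seconds=0):
    """Convert a timeout to milliseconds, the unit used by the
    sg version 3 header's timeout field."""
    return (minutes * 60 + seconds) * MS_PER_SECOND


if __name__ == "__main__":
    print(timeout_ms(minutes=10))  # the intended 10 minute timeout
    print(timeout_ms(seconds=10))  # the mistaken 10 second timeout
```

So the corrected expression 10 * 60 * 1000 is 600000 ms, sixty times longer than a 10 second (10000 ms) timeout.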

If you look at the FAQ on the sg web site 
( http://www.torque.net/sg ) under the SANE entry you will
find the same information ...

Doug Gilbert

Re: Advansys SCSI driver old verson?

2001-03-23 Thread Douglas Gilbert

icabod <[EMAIL PROTECTED]> wrote:
> I've noticed a small problem that hinders me 
> from updating my system to the new 2.4 kernels. 
> I'm using a PowerMac with an Advansys SCSI 3940UW 
> card in it running my drives. I've noticed that 
> the 2.4 kernel series has advansys driver 
> version 3.2M, while the driver version that works for 
> me and came with the 2.2.17+ kernels (at least it 
> works fine with my card) is 3.3D. I 
> was wondering if anyone had noticed this before, 
> or if there's a reason the older driver was used. 
> The reason I bring this up is that the driver in 
> the 2.4 kernel series does not drive my particular
> card. I hope that the readers of this list find 
> this helpful, if you have any questions please 
> feel free to reply. Thanks.

Andy Kellner (from ConnectCom Solutions formerly 
known as Advansys) and Bob Frey (former maintainer) 
working in concert have posted several "3.3x" versions 
of the advansys driver to the linux-scsi list. Despite
this, there seems no sign of this improved driver 
being included in the 2.4 series kernels. However 
the advansys driver was upgraded to version 3.3D in 
lk 2.2.18 . Have there been any adverse reports about 
the advansys driver in lk 2.2.18 ?

I am currently using advansys driver version 3.3G
without problems on lk 2.4.2 . The changelog at the
top of the advansys.c file shows an impressive number 
of improvements and fixes between versions 3.2M and 
3.3G (including powerpc support, see below).

This is not the first time that I've sent such a post 
trying to press for an update of this driver. I was 
told that the patch was too big to be contemplated for 
lk 2.4 . Well size didn't stop the complete replacement 
of the much used aic7xxx driver.

You do have some short term options. Since the 
advansys driver has the same source in the lk 2.2 
and lk 2.4 series, you can copy the version 3.3D
advansys.[hc] files from your lk 2.2.18 source to
your lk 2.4.2 source and it will build ok.
Alternatively you can get a recent version (version 
3.3F) from:
http://www.connectcom.net/support/evaluation.html

Doug Gilbert

Changelog from advansys.c file between versions
3.2M and 3.3G follows:

 3.2N (4/1/00):
 1. Add CONFIG_ISA ifdef code.
 2. Include advansys_interrupts_enabled name change patch.
 3. For >= v2.3.28 use new SCSI error handling with new function
    advansys_eh_bus_reset(). Don't include an abort function
    because of base library limitations.
 4. For >= v2.3.28 use per board lock instead of io_request_lock.
 5. For >= v2.3.28 eliminate advansys_command() and
    advansys_command_done().
 6. Add some changes for PowerPC (Big Endian) support, but it isn't
    working yet.
 7. Fix "nonexistent resource free" problem that occurred on a module
    unload for boards with an I/O space >= 255. The 'n_io_port' field
    is only one byte and can not be used to hold an ioport length more
    than 255.

 3.3A (4/4/00):
 1. Update to Adv Library 5.8.
 2. For wide cards add support for CDBs up to 16 bytes.
 3. Eliminate warnings when CONFIG_PROC_FS is not defined.

 3.3B (5/1/00):
 1. Support for PowerPC (Big Endian) wide cards. Narrow cards
    still need work.
 2. Change bitfields to shift and mask access for endian
    portability.

 3.3C (10/13/00):
 1. Update for latest 2.4 kernel.
 2. Test ABP-480 CardBus support in 2.4 kernel - works!
 3. Update to Asc Library S123.
 4. Update to Adv Library 5.12.

 3.3D (11/22/00):
 1. Update for latest 2.4 kernel.
 2. Create patches for 2.2 and 2.4 kernels.

 3.3E (1/9/01):
 1. Now that 2.4 is released remove ifdef code for kernel versions
    less than 2.2. The driver is now only supported in kernels 2.2,
    2.4, and greater.
 2. Add code to release and acquire the io_request_lock in
    the driver entrypoint functions: advansys_detect and
    advansys_queuecommand. In kernel 2.4 the SCSI mid-level driver
    still holds the io_request_lock on entry to SCSI low-level drivers.
    This was supposed to be removed before 2.4 was released but never
    happened. When the mid-level SCSI driver is changed all references
    to the io_request_lock should be removed from the driver.
 3. Simplify error handling by removing advansys_abort(),
    AscAbortSRB(), AscResetDevice(). SCSI bus reset requests are
    now handled by resetting the SCSI bus and fully re-initializing
    the chip. This simple method of error recovery has proven to work
    most reliably after attempts at different methods. Also now only
    support the "new" error handling method and remove the obsolete

Re: ext2 corruption in 2.4.2, scsi only system

2001-03-26 Thread Douglas Gilbert


Dale E Martin wrote:
> [snip]
> I had had good luck with 2.4.x on other boxes, so I put it 
> on this machine as well.  Several times now I've seen ext2 
> corruption with no other noteworthy logs.
> .
> The machine is a dual PPro, it has a Buslogic BT958 with a 
> single 9G scsi/wide drive in it.
> 

Dale,
Alan Cox has reported the following:

> 2.4.2-ac19
> ...
> o   Hopefully fix the buslogic corruptions  (me)

Alan's ac tree also contains a consolidated set of
patches from Eric Youngdale for the SCSI midlevel.
Alan's latest is ac25 and may be worth trying (ac24
has been working fine for me).

Doug Gilbert

Re: add-single-device won't work in 2.4.3

2001-03-31 Thread Douglas Gilbert


Armin,
It works for me:

$ uname -a
Linux frig 2.4.3 #1 Fri Mar 30 16:33:45 EST 2001 i586 unknown

$ cat /proc/scsi/scsi
Attached devices: 
Host: scsi1 Channel: 00 Id: 01 Lun: 00
  Vendor: IBM      Model: DNES-309170W     Rev: SA30
  Type:   Direct-Access                    ANSI SCSI revision: 03
Host: scsi2 Channel: 00 Id: 05 Lun: 00
  Vendor: UMAX     Model: Astra 1220S      Rev: V1.2
  Type:   Scanner                          ANSI SCSI revision: 02
Host: scsi2 Channel: 00 Id: 06 Lun: 00
  Vendor: YAMAHA   Model: CRW4416S         Rev: 1.0g
  Type:   CD-ROM                           ANSI SCSI revision: 02

$ echo "scsi remove-single-device 2 0 5 0" > /proc/scsi/scsi
$ cat /proc/scsi/scsi 
Attached devices: 
Host: scsi1 Channel: 00 Id: 01 Lun: 00
  Vendor: IBM      Model: DNES-309170W     Rev: SA30
  Type:   Direct-Access                    ANSI SCSI revision: 03
Host: scsi2 Channel: 00 Id: 06 Lun: 00
  Vendor: YAMAHA   Model: CRW4416S         Rev: 1.0g
  Type:   CD-ROM                           ANSI SCSI revision: 02

$ echo "scsi add-single-device 2 0 5 0" > /proc/scsi/scsi
$ cat /proc/scsi/scsi
Attached devices: 
Host: scsi1 Channel: 00 Id: 01 Lun: 00
  Vendor: IBM      Model: DNES-309170W     Rev: SA30
  Type:   Direct-Access                    ANSI SCSI revision: 03
Host: scsi2 Channel: 00 Id: 06 Lun: 00
  Vendor: YAMAHA   Model: CRW4416S         Rev: 1.0g
  Type:   CD-ROM                           ANSI SCSI revision: 02
Host: scsi2 Channel: 00 Id: 05 Lun: 00
  Vendor: UMAX     Model: Astra 1220S      Rev: V1.2
  Type:   Scanner                          ANSI SCSI revision: 02

$ sg_scan -i
/dev/sg0: scsi1 channel=0 id=1 lun=0  type=0
    IBM       DNES-309170W      SA30 [wide=1 sync=1 cmdq=1 sftre=0 pq=0x0]
/dev/sg1: scsi2 channel=0 id=6 lun=0  type=5
    YAMAHA    CRW4416S          1.0g [wide=0 sync=1 cmdq=0 sftre=0 pq=0x0]
/dev/sg2: scsi2 channel=0 id=5 lun=0  type=6
    UMAX      Astra 1220S       V1.2 [wide=0 sync=0 cmdq=0 sftre=0 pq=0x0]

This last command is from sg_utils and it sends actual
INQUIRY commands rather than relying on data held in
the midlayer. This demonstrates the devices are responding.
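
The command strings echoed into /proc/scsi/scsi above follow a fixed pattern of host, channel, id and lun. A small Python helper that builds them might look like the sketch below. The function names are hypothetical, and send() is shown only for completeness: it needs root and a kernel that still provides this proc interface, so the demo does not call it.

```python
def single_device_command(action, host, channel, target, lun):
    """Build the text written to /proc/scsi/scsi to add or remove
    one device, e.g. 'scsi remove-single-device 2 0 5 0'."""
    if action not in ("add", "remove"):
        raise ValueError("action must be 'add' or 'remove'")
    return "scsi %s-single-device %d %d %d %d" % (
        action, host, channel, target, lun)


def send(command, path="/proc/scsi/scsi"):
    """Write the command to the proc file (requires root)."""
    with open(path, "w") as proc_file:
        proc_file.write(command + "\n")


if __name__ == "__main__":
    # The scanner in the transcript: host 2, channel 0, id 5, lun 0.
    print(single_device_command("remove", 2, 0, 5, 0))
    print(single_device_command("add", 2, 0, 5, 0))
```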


Run on an AMD K6-2 500 MHz machine with 2 advansys adapters.
Could you retest? If it continues to fail then it may be
a problem with the new aic7xxx driver. You also have the
option of building with the "old" (i.e. former) aic7xxx
driver.

Doug Gilbert


Re: scsi bus numbering

2001-04-01 Thread Douglas Gilbert

Peter Daum <[EMAIL PROTECTED]> wrote:
> For some reason, the order of initializing the scsi drivers
> changed between 2.4.2 and 2.4.3: If both, ncr53c8xx and aic7xxx
> drivers are included in the kernel, up to version 2.4.2, the
> adaptec driver always came first (so the first disk on an adaptec
> controller ended up as /dev/sda) while in 2.4.3, the ncr driver
> initializes first and all the device names change - with
> potentially disastrous effects for unsuspecting users.
> 
> AFAIK, the numbering of scsi busses depends only on the order the
> low-level drivers are loaded. Not that I can think of any better
> way to do this, but it would be good if things were a little bit
> more predictable - in absence of any better idea, maybe by
> loading the drivers in alphabetical order or something like that ...

Looking at the drivers/scsi/Makefile file in lk 2.4.3
you can see that aic7xxx_old.o is about half way down
the list with ncr53c8xx.o towards the end. So this
dictates the old behaviour (in the lk 2.4 series).
However the new aic7xxx driver isn't in that list,
it has its own entry:
subdir-$(CONFIG_SCSI_AIC7XXX)   += aic7xxx
which seems to invoke drivers/scsi/aic7xxx/Makefile
_after_ all other built-in adapter drivers are built.
Maybe another "make" mechanism needs to be found to
restore the previous ordering information. In any case
building the aic7xxx driver last has already surprised
a lot of people.

> How is it possible, to influence that order at the moment (for
> example, to revert to the old order)? I personally couldn't
> figure out, where to change this.

>>>>>>>>>  scsihosts  <<<<<<<<<<<<<

As a boot time option try:
  scsihosts=aic7xxx:ncr53c8xx
or if you are using lilo, in /etc/lilo.conf add:
  append="scsihosts=aic7xxx:ncr53c8xx"

Actually just doing:
  scsihosts=aic7xxx
should do the trick for most people.

In the unlikely case that the SCSI mid level is a module,
then you can pass the scsihosts argument to the
module load (or add an option line to /etc/modules.conf):
  modprobe scsi_mod scsihosts=aic7xxx:ncr53c8xx

You could also read the SCSI-2.4-HOWTO at:
http://www.linuxdoc.org/HOWTO/SCSI-2.4-HOWTO/

BTW You can thank Richard Gooch and devfs for scsihosts.
Lucky he spotted the requirement some time back.

Doug Gilbert

Re: scsi bus numbering

2001-04-01 Thread Douglas Gilbert

Peter Daum wrote:
> 
> On Sun, 1 Apr 2001, Douglas Gilbert wrote:
> 
> [...]
> 
> > >>>>>>>>>  scsihosts  <<<<<<<<<<<<<
> >
> > As a boot time option try:
> >   scsihosts=aic7xxx:ncr53c8xx
> > or if you are using lilo, in /etc/lilo.conf add:
> >   append="scsihosts=aic7xxx:ncr53c8xx"
> 
> that does indeed change the bus numbering. Unfortunately, even
> with this option, the first disk on the ncr controller becomes
> "/dev/sda" ...

Peter,
This indicates that the method being used by the 
new aic7xxx driver for initialization is broken 
with respect to other scsi adapters.

The intent is that all built in HBA drivers are
initialized _before_ the built in upper level 
drivers (e.g. sd). To get the effect you describe
the driver init order seems to have been:
  register ncr53c8xx
  register sd
  register aic7xxx  # too late ...
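
The consequence of that ordering can be shown with a toy model in Python. It assumes sd hands out names (sda, sdb, ...) in the order disks become visible to it: disks on adapters registered before sd are picked up when sd registers, and disks on adapters registered afterwards get the following letters. The function name and the one-disk-per-adapter setup are invented for the example.

```python
import string


def name_disks(init_order, disks_per_hba):
    """Return {(driver, disk_index): '/dev/sdX'} for a given driver
    registration order, where 'sd' marks the sd driver itself."""
    names = {}
    waiting = []  # disks found before sd was registered
    letters = iter(string.ascii_lowercase)
    sd_registered = False

    for step in init_order:
        if step == "sd":
            sd_registered = True
            found, waiting = waiting, []
        else:
            found = [(step, i) for i in range(disks_per_hba.get(step, 0))]
            if not sd_registered:
                waiting.extend(found)
                found = []
        for disk in found:
            names[disk] = "/dev/sd" + next(letters)
    return names


if __name__ == "__main__":
    disks = {"aic7xxx": 1, "ncr53c8xx": 1}
    print(name_disks(["aic7xxx", "ncr53c8xx", "sd"], disks))  # intended
    print(name_disks(["ncr53c8xx", "sd", "aic7xxx"], disks))  # observed
```

In the second ordering the ncr disk is the only one visible when sd registers, so it becomes /dev/sda regardless of which host number scsihosts gave the aic7xxx adapter.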

Doug Gilbert
