[PATCH] EATA and u14-34f driver update for lk-2.4.4

2001-05-01 Thread Ballabio_Dario



Enclosed is release 6.05 of the EATA ISA/EISA/PCI and
Ultrastor 14F/34F SCSI drivers.

This patch applies to lk-2.4.4.

 *
 *   1 May 2001 Rev. 6.05 for linux 2.4.4
 *+ Clean up all pci related routines.
 *+ Fix data transfer direction for opcode SEND_CUE_SHEET (0x5d)
 *

Cheers,

**
Ph.D. Dario Ballabio
EMC Computer Systems Italia spa
Mobile phone +393487978851
Office phone +390244571315
Mobile fax   +393487951622

If a man is not willing to take risks for his own ideas,
either his ideas are worth nothing or he is worth nothing.
**


[Attachment: scsi_drivers_605.diff.gz (uuencoded, mode 644); encoded body omitted]
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]



RE: Linux Cluster using shared scsi

2001-05-01 Thread Roets, Chris

So, will Linux ever support the SCSI reservation mechanism as standard?
Isn't there a standard that says if you SCSI-reserve a disk, no one
else should be able to access this disk, or is this a "steeleye/Compaq"
standard?

Chris

-----Original Message-----
From: James Bottomley [mailto:[EMAIL PROTECTED]]
Sent: Friday, April 27, 2001 5:12 PM
To: Roets, Chris
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: Linux Cluster using shared scsi


I've copied linux SCSI and quoted the entire message below so they can
follow.

Your assertion that this works in 2.2.16 is incorrect; the patch to fix the
Linux reservation conflict handler has never been added to the official tree.
I suspect you actually don't have vanilla 2.2.16 but instead have a Red Hat or
other distribution-patched version.  Most distributions include the SteelEye
SCSI clustering patches, which correct reservation handling.

I've attached the complete patch, which fixes both the old and the new error
handlers in the 2.2 kernel; it applies against 2.2.18.

James Bottomley


> Problem:
> Install two Linux systems with a shared SCSI bus and storage on that shared
> bus.
> Suppose:
> system one : SCSI ID 7
> system two : SCSI ID 6
> shared disk : SCSI ID 4
>
> By default, you can mount the disk on both systems.  This is normal
> behavior, but may cause data corruption.
> To prevent this, you can SCSI-reserve a disk on one system.  If the other
> system then tries to access this device, it should get an I/O error due
> to the reservation.
> This is a common technique used in
> - Traditional Tru64 Unix ase clustering
> - Tru64 Unix V5 Clustering to accomplish I/O barriers
> - Windows-NT Clusters
> - Steel-eye clustering
> The reservation can be done using a standard tool like scu
> 
> scu -f /dev/sdb
> scu > reserve device
> 
> On Linux, this works fine under kernel version 2.2.16.
> Below is the code that accomplishes this, from
> /usr/src/linux/drivers/scsi/scsi_obsolete.c in routine scsi_old_done:
>     case RESERVATION_CONFLICT:
>         printk("scsi%d (%d,%d,%d) : RESERVATION CONFLICT\n",
>                SCpnt->host->host_no, SCpnt->channel,
>                SCpnt->device->id, SCpnt->device->lun);
>         status = CMD_FINISHED; /* returns I/O error */
>         break;
>     default:
> As of kernel version 2.2.18, this code has changed.  If a SCSI reserve
> error occurs, the device driver does a SCSI reset.  This way the SCSI
> reservation is gone, and the device can be accessed.
> /usr/src/linux/drivers/scsi/scsi_obsolete.c in routine scsi_old_done:
>     case RESERVATION_CONFLICT:
>         printk("scsi%d, channel %d : RESERVATION CONFLICT performing"
>                " reset.\n", SCpnt->host->host_no, SCpnt->channel);
>         scsi_reset(SCpnt, SCSI_RESET_SYNCHRONOUS);
>         status = REDO;
>         break;
>
> 
> Fix: delete the SCSI reset in the kernel code
>     case RESERVATION_CONFLICT:
>         /* Deleted Chris Roets
>         printk("scsi%d, channel %d : RESERVATION CONFLICT performing"
>                " reset.\n", SCpnt->host->host_no, SCpnt->channel);
>         scsi_reset(SCpnt, SCSI_RESET_SYNCHRONOUS);
>         status = REDO;
>         next four lines added */
>         printk("scsi%d (%d,%d,%d) : RESERVATION CONFLICT\n",
>                SCpnt->host->host_no, SCpnt->channel,
>                SCpnt->device->id, SCpnt->device->lun);
>         status = CMD_FINISHED; /* returns I/O error */
>         break;
>
> and rebuild the kernel.
>
> This should allow the customer to continue.
>
> Questions:
> - Why was this SCSI reset added as of kernel version 2.2.18?
> - As we are talking about an obsolete routine, how is this accomplished
>   in the new code, and how is it activated?
>



Re: race condition in scsi_unregister_host

2001-05-01 Thread Oliver Neukum

On Tuesday,  1. May 2001 02:23, Alan Cox wrote:
> > scsi_unregister_host first checks the use count and then marks the device
> > offline. The order is wrong. By the time the device goes offline, it
> > might have been opened again.
>
> That should be right in -ac but you might want to double check

In the ac series the big kernel lock is taken, but the order is not reversed.
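
The ordering problem can be illustrated with a small user-space model. This is only a sketch, not kernel code: `struct toy_host`, `toy_open()` and the two `toy_unregister_*()` functions are hypothetical names, the `hook` argument merely simulates an open that arrives between the two steps, and a real fix would still need proper locking around both steps.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a host; not the kernel's data structure. */
struct toy_host {
    int  use_count;   /* number of current openers */
    bool online;      /* opens are refused once this is false */
};

/* An open succeeds only while the host is still online. */
static bool toy_open(struct toy_host *h)
{
    if (!h->online)
        return false;
    h->use_count++;
    return true;
}

/* Simulates another task opening the device at an awkward moment. */
static void racing_open(struct toy_host *h)
{
    (void)toy_open(h);
}

/*
 * Order described as wrong in the thread: check the use count first,
 * then mark the host offline. An open can slip into the window.
 * Returns true if the host was torn down.
 */
static bool toy_unregister_racy(struct toy_host *h,
                                void (*hook)(struct toy_host *))
{
    if (h->use_count > 0)
        return false;
    if (hook)
        hook(h);            /* window: the open still succeeds here */
    h->online = false;
    return true;            /* torn down, possibly with an opener */
}

/*
 * Reversed order: mark the host offline first so new opens fail,
 * then check the use count and back out if the host is still busy.
 */
static bool toy_unregister_fixed(struct toy_host *h,
                                 void (*hook)(struct toy_host *))
{
    h->online = false;
    if (hook)
        hook(h);            /* the same open is now refused */
    if (h->use_count > 0) {
        h->online = true;   /* still in use: undo and report failure */
        return false;
    }
    return true;
}
```

With `racing_open` passed as the hook, the racy variant tears the host down while `use_count` is 1; the reversed variant ends with `use_count` still 0.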

Regards
Oliver




module use count corruption in scsi.c/scsi_register_host

2001-05-01 Thread Oliver Neukum

Hi,

scsi_register_host() can in the error case _decrement_ the module usage 
counter of the scsi core module and it has a race with module unload during
bus scan.

	if (tpnt->present) {
		if (pcount == next_scsi_host) {
			if (tpnt->present > 1) {
				printk(KERN_ERR "scsi: Failure to register low-level scsi driver");
				scsi_unregister_host(tpnt);
				return 1;

This code calls scsi_unregister_host(), which decrements the module use count,
without having incremented it beforehand.

In addition, it potentially allocates memory, starts and waits for the error
handling thread and does I/O to scan the bus without having incremented the
module usage counter.

Doing the increment earlier should fix it, and that is what the attached
patch does.
Unless I am wrong, it should go in quickly.

Regards
Oliver




--- drivers/scsi/scsi.c.alt	Wed May  2 06:39:16 2001
+++ drivers/scsi/scsi.c	Wed May  2 06:46:21 2001
@@ -1826,6 +1826,8 @@
 	   using the new scsi code. NOTE: the detect routine could
 	   redefine the value tpnt->use_new_eh_code. (DB, 13 May 1998) */
 
+	MOD_INC_USE_COUNT;
+
 	if (tpnt->use_new_eh_code) {
 		spin_lock_irqsave(&io_request_lock, flags);
 		tpnt->present = tpnt->detect(tpnt);
@@ -1947,8 +1949,6 @@
 	   (scsi_init_memory_start - scsi_memory_lower_value) / 1024,
 	   (scsi_memory_upper_value - scsi_init_memory_start) / 1024);
 #endif
-
-	MOD_INC_USE_COUNT;
 
 	if (out_of_space) {
 		scsi_unregister_host(tpnt);	/* easiest way to clean up?? */
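
The effect of moving the increment can be shown with a toy user-space model. This is a sketch under stated assumptions: `toy_use_count`, `toy_unregister()` and the two `toy_register_*()` functions are hypothetical names that only mimic the counting, not the real scsi.c code paths.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the module usage counter. */
static int toy_use_count;

/* Like the unregister path, this always drops one reference. */
static void toy_unregister(void)
{
    toy_use_count--;
}

/*
 * Order before the patch: the increment happens only at the end,
 * so the error path decrements a reference that was never taken.
 */
static int toy_register_late(bool detect_fails)
{
    if (detect_fails) {
        toy_unregister();   /* unbalanced decrement */
        return 1;
    }
    toy_use_count++;
    return 0;
}

/*
 * Order after the patch: the reference is taken before detection
 * and bus scanning, so the error path's decrement balances it.
 */
static int toy_register_early(bool detect_fails)
{
    toy_use_count++;
    if (detect_fails) {
        toy_unregister();   /* balances the increment above */
        return 1;
    }
    return 0;
}
```

Starting from a count of zero, a failing `toy_register_late()` leaves the counter at -1, while a failing `toy_register_early()` leaves it at 0.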



Re: Linux Cluster using shared scsi

2001-05-01 Thread Alan Cox

> reserved.  But if you did such a hot swap you would have "bigger
> fish to fry" in a HA application... I mean, none of your data would be
> there! 

You need to realise this has happened and do the right thing. Since
it could be an md raid array the hotswap is not fatal.

If it's fatal, you need to realise promptly, both before you damage
the contents of the disk inserted in error (if possible) and so the HA
system can take countermeasures.


> if the kernel (by this I mean the scsi midlayer) was maintaining
> reservations, that there would be some logic activated to "handle"
> this problem, whether it be re-reserving the device, or the ability to

Suppose the cluster nodes don't agree on the reservation table?

> Bus resets in the Linux drivers also tend to happen frequently when a
> disk is failing, which has tended to leave the system in a somewhat
> functional but often an unusable state, (but that's a different story...)

The new scsi EH code in 2.4 for the drivers that use it is a lot better. Real
problem.





Re: Linux Cluster using shared scsi

2001-05-01 Thread James Bottomley

[EMAIL PROTECTED] said:
> Does this package also tell the kernel to "re-establish" a reservation
> for all devices after a bus reset, or at least inform a user level
> program?  Finding out when there has been a bus reset has been a
> stumbling block for me. 

[EMAIL PROTECTED] said:
> You cannot rely on a bus reset. Imagine hot swap disks on an FC
> fabric. I  suspect the controller itself needs to call back for
> problem events 

Essentially, there are many conditions which cause a quiet loss of a SCSI-2
reservation.  Even in parallel SCSI, reservations can be silently lost because
of a LUN reset, a device reset or even simply powering off the device.

The way we maintain reservations for LifeKeeper is to have a user level daemon 
ping the device with a reservation command every few minutes.  If you get a 
RESERVATION_CONFLICT return you know that something else stole your 
reservation, otherwise you maintain it.  There is a window in this scheme 
where the device may be accessible by other initiators but that's the price 
you pay for using SCSI-2 reservations instead of the more cluster friendly 
SCSI-3 ones.  In a kernel scheme, you may get early notification of 
reservation loss by putting a hook into the processing of 
CHECK_CONDITION/UNIT_ATTENTION, but it won't close the window entirely.
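
A user-level ping of this kind could be sketched roughly as follows. This is not the LifeKeeper code; it is a minimal illustration that assumes the sg driver's SG_IO ioctl is available and that the caller has already opened the device node. The names `classify_ping()`, `ping_reservation()` and `enum ping_result` are made up for the example; the RESERVE(6) opcode (0x16) and the RESERVATION CONFLICT status (0x18) come from the SCSI-2 standard.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

#define RESERVE_6            0x16  /* SCSI-2 RESERVE(6) opcode */
#define STATUS_GOOD          0x00
#define STATUS_RES_CONFLICT  0x18  /* RESERVATION CONFLICT status */

/* Hypothetical result type for one ping. */
enum ping_result { PING_HELD, PING_LOST, PING_ERROR };

/*
 * Decide what one completed RESERVE means for us. Anything that is
 * neither GOOD nor RESERVATION CONFLICT (for example CHECK CONDITION
 * after a reset) is reported as an error so the caller can retry.
 */
static enum ping_result classify_ping(int ioctl_rc,
                                      unsigned char scsi_status,
                                      unsigned int host_status,
                                      unsigned int driver_status)
{
    if (ioctl_rc < 0)
        return PING_ERROR;
    if (scsi_status == STATUS_RES_CONFLICT)
        return PING_LOST;       /* another initiator holds the device */
    if (scsi_status == STATUS_GOOD && host_status == 0 && driver_status == 0)
        return PING_HELD;       /* reservation (re)established */
    return PING_ERROR;
}

/* Issue one RESERVE(6) on an already-open file descriptor. */
static enum ping_result ping_reservation(int sg_fd)
{
    unsigned char cdb[6] = { RESERVE_6, 0, 0, 0, 0, 0 };
    unsigned char sense[32];
    sg_io_hdr_t io;
    int rc;

    memset(&io, 0, sizeof(io));
    io.interface_id = 'S';
    io.cmd_len = sizeof(cdb);
    io.cmdp = cdb;
    io.dxfer_direction = SG_DXFER_NONE;   /* RESERVE moves no data */
    io.sbp = sense;
    io.mx_sb_len = sizeof(sense);
    io.timeout = 10000;                   /* milliseconds */

    rc = ioctl(sg_fd, SG_IO, &io);
    return classify_ping(rc, io.status, io.host_status, io.driver_status);
}
```

A daemon would call `ping_reservation()` every few minutes and treat `PING_LOST` as the signal that another initiator has taken the device; the window between pings remains, as described above.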

James







Re: Linux Cluster using shared scsi

2001-05-01 Thread James Bottomley

[EMAIL PROTECTED] said:
> So, will Linux ever support the scsi reservation mechanism as standard? 

That's not within my gift.  I can merely write the code that corrects the 
behaviour.  I can't force anyone else to accept it.

[EMAIL PROTECTED] said:
> Isn't there a standard that says if you scsi reserve a disk, no one
> else should be able to access this disk, or is this a "steeleye/
> Compaq" standard. 

Use of reservations is laid out in the SCSI-2 and SCSI-3 standards (which can
be downloaded from the T10 site www.t10.org), which are international in scope.
I think the implementation issues come because the reservations part is
really only relevant to a multi-initiator clustered environment, which isn't an
everyday configuration for most Linux users.  Obviously, as Linux moves into
the SAN arena this type of configuration will become a lot more common, at
which time the various problems associated with multiple initiators should
rise in prominence.

James





killing a process writing to a scsi drive freezes a system

2001-05-01 Thread hiren_mehta

Hi List,

I ran dd in the background against a bunch of SCSI drives.

e.g. dd if=/dev/zero of=/dev/sdb bs=1024 &
 dd if=/dev/zero of=/dev/sdc bs=1024 &
 dd if=/dev/zero of=/dev/sdd bs=1024 &

Then I tried to kill the first dd process writing to /dev/sdb,
and it froze the system. When I looked at the drives,
I could see that every second the lights on all the drives (not
just sdb) flash. This goes on for quite a long time. It looks like,
after I issued the kill, the system is trying to flush its buffers.
I am running this on an ia64 system which has 4GB of memory, and it
takes a lot of time to flush the buffers because of the 1-second delay
between bunches of writes.

Can anybody tell me what is going on here? Is this a bug in the block
device driver? If that is the case, is it fixed in later versions
of the kernel? I am running a 2.4.2 kernel.

I can switch between the VTs, but I cannot log in from other VTs
until this flushing is done. Also, the current VT (where I fired
off all the dd commands) and the other VTs where I had already logged in
are completely frozen, and I cannot do anything after I issue the
kill command.

Regards,
-hiren
(408)970-3062
[EMAIL PROTECTED]



Re: Linux Cluster using shared scsi

2001-05-01 Thread Eric Z. Ayers

Doug Ledford writes:
(James Bottomley commented about the need for SCSI reservation kernel patches)
 > 
 > I agree.  It's something that needs fixed in general, your software needs it
 > as well, and I've written (about 80% done at this point) some open source
 > software geared towards getting/holding reservations that also requires the
 > same kernel patches (plus one more to be fully functional, an ioctl to allow a
 > SCSI reservation to do a forced reboot of a machine).  I'll be releasing that
 > package in the short term (once I get back from my vacation anyway).
 > 

Hello Doug,

Does this package also tell the kernel to "re-establish" a
reservation for all devices after a bus reset, or at least inform a
user level program?  Finding out when there has been a bus reset has
been a stumbling block for me.

-Eric.
--
Eric Z. Ayers                 Lead Software Engineer
Phone:  +1 404-705-2864       Computer Generation, Incorporated
Fax:    +1 404-705-2805       an Intec Telecom Systems Company
Web:    http://www.intec-telecom-systems.com/
Email:  [EMAIL PROTECTED]
Postal: Bldg G 4th Floor, 5775 Peachtree-Dunwoody Rd, Atlanta, GA 30342 USA



Re: Linux Cluster using shared scsi

2001-05-01 Thread Doug Ledford

James Bottomley wrote:
> 
> [EMAIL PROTECTED] said:
> > So, will Linux ever support the scsi reservation mechanism as standard?
> 
> That's not within my gift.  I can merely write the code that corrects the
> behaviour.  I can't force anyone else to accept it.

I think it will be standard before too much longer.  (I hope so anyway; I'm
tired of carrying the patches forward all the time, so I'll lend my support to
getting it into the mainstream kernel ;-)

> [EMAIL PROTECTED] said:
> > Isn't there a standard that says if you scsi reserve a disk, no one
> > else should be able to access this disk, or is this a "steeleye/
> > Compaq" standard.
> 
> Use of reservations is laid out in the SCSI-2 and SCSI-3 standards (which can
> be downloaded from the T10 site www.t10.org) which are international in scope.
>  I think the implementation issues come because the reservations part is
> really only relevant to a multi-initiator clustered environment which isn't an
> every day configuration for most Linux users.  Obviously, as Linux moves into
> the SAN arena this type of configuration will become a lot more common, at
> which time the various problems associated with multiple initiators should
> rise in prominence.

I agree.  It's something that needs fixed in general, your software needs it
as well, and I've written (about 80% done at this point) some open source
software geared towards getting/holding reservations that also requires the
same kernel patches (plus one more to be fully functional, an ioctl to allow a
SCSI reservation to do a forced reboot of a machine).  I'll be releasing that
package in the short term (once I get back from my vacation anyway).

-- 

 Doug Ledford <[EMAIL PROTECTED]>  http://people.redhat.com/dledford
  Please check my web site for aic7xxx updates/answers before
  e-mailing me about problems



Re: Linux Cluster using shared scsi

2001-05-01 Thread Alan Cox

> Does this package also tell the kernel to "re-establish" a
> reservation for all devices after a bus reset, or at least inform a
> user level program?  Finding out when there has been a bus reset has
> been a stumbling block for me.

You cannot rely on a bus reset. Imagine hot swap disks on an FC fabric. I 
suspect the controller itself needs to call back for problem events




Re: Linux Cluster using shared scsi

2001-05-01 Thread Eric Z. Ayers

Alan Cox writes:
 > > Does this package also tell the kernel to "re-establish" a
 > > reservation for all devices after a bus reset, or at least inform a
 > > user level program?  Finding out when there has been a bus reset has
 > > been a stumbling block for me.
 > 
 > You cannot rely on a bus reset. Imagine hot swap disks on an FC fabric. I 
 > suspect the controller itself needs to call back for problem events
 > 

I'm not a SCSI expert by any stretch of the imagination.  I think
that what you are saying is that you cannot rely on a bus reset being
the only thing that will remove a reservation.  For example, if a
device is 'hot replaced', the device will (clearly) no longer be
reserved.  But if you did such a hot swap you would have "bigger
fish to fry" in a HA application... I mean, none of your data would be
there!

My understanding is that specifically, when a bus reset occurs,  all
SCSI reservations for devices on that bus are lost.  I was hoping that
if the kernel (by this I mean the scsi midlayer) was maintaining
reservations, that there would be some logic activated to "handle"
this problem, whether it be re-reserving the device, or the ability to
pass notification of a reset (or another problem event as you point
out) up to the application that's handling reservations. 

In my experience, the most common reason for a bus reset in parallel
SCSI is that a peer host on the bus is rebooting.  Since this happens
under normal operation and well in advance of any attempt to access the
device, it would be nice if there were some sort of asynchronous
notification instead of a polling process with an interval of 2-3
minutes, where it's conceivable that the peer system could have booted
and attempted to take over the disk out from under a running system.

Bus resets in the Linux drivers also tend to happen frequently when a
disk is failing, which has tended to leave the system in a somewhat
functional but often an unusable state, (but that's a different story...)

James Bottomley <[EMAIL PROTECTED]> writes:
 > Essentially, there are many conditions which cause a quiet loss of a SCSI-2 
 > reservation.  Even in parallel SCSI: Reservations can be silently lost because 
 > of LUN reset, device reset or even simple powering off the device.
...

James mentions that even handling a bus reset still leaves a window
where a peer could grab the reservation out from underneath an
unsuspecting host.  I agree that this could happen, and the old host
might perform writes to an 'unreserved' disk, but once the second
system succeeded in obtaining the reservation, any read/write commands
from the "old" host would return SCSI errors (this is my layman's
understanding - the commands would return a UNIT_RESERVED error), so
I believe you would have the desired behavior in this kind of cluster
- only one machine in the cluster can access the disk at a
time.  The data on the disk should be in a state where the second
system in the cluster could start a recovery task and begin to provide
the service hosted on the disk.

-Eric.
--
Eric Z. Ayers                 Lead Software Engineer
Phone:  +1 404-705-2864       Computer Generation, Incorporated
Fax:    +1 404-705-2805       an Intec Telecom Systems Company
Web:    http://www.intec-telecom-systems.com/
Email:  [EMAIL PROTECTED]
Postal: Bldg G 4th Floor, 5775 Peachtree-Dunwoody Rd, Atlanta, GA 30342 USA