[PATCH] EATA and u14-34f driver update for lk-2.4.4
Enclosed is release 6.05 of the EATA ISA/EISA/PCI and Ultrastor 14F/34F
SCSI drivers. This patch applies to lk-2.4.4.

 *
 *       1 May 2001 Rev. 6.05 for linux 2.4.4
 *        + Clean up all pci related routines.
 *        + Fix data transfer direction for opcode SEND_CUE_SHEET (0x5d)
 *

Cheers,

**
Ph.D. Dario Ballabio
EMC Computer Systems Italia spa
Mobile phone +393487978851
Office phone +390244571315
Mobile fax +393487951622

If a man is not willing to take risks for his ideas, either his ideas are
worth nothing or he is worth nothing.
**

[Attachment: scsi_drivers_605.diff.gz (uuencoded, mode 644); encoded data not reproduced]

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
RE: Linux Cluster using shared scsi
So, will Linux ever support the SCSI reservation mechanism as standard?
Isn't there a standard that says that if you SCSI-reserve a disk, no one
else should be able to access this disk, or is this a "steeleye/Compaq"
standard?

Chris

-----Original Message-----
From: James Bottomley [mailto:[EMAIL PROTECTED]]
Sent: Friday, April 27, 2001 5:12 PM
To: Roets, Chris
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: Linux Cluster using shared scsi

I've copied linux SCSI and quoted the entire message below so they can
follow.

Your assertion that this works in 2.2.16 is incorrect; the patch to fix the
linux reservation conflict handler has never been added to the official
tree. I suspect you actually don't have vanilla 2.2.16 but instead have a
redhat or other distribution patched version. Most distributions include
the Steeleye SCSI clustering patches which correct reservation handling.

I've attached the complete patch, which fixes both the old and the new
error handlers in the 2.2 kernel; it applies against 2.2.18.

James Bottomley

> Problem :
> install two Linux systems with a shared scsi-bus and storage on that
> shared bus.
> suppose :
> system one : SCSI ID 7
> system two : SCSI ID 6
> shared disk : SCSI ID 4
>
> By default, you can mount the disk on both systems. This is normal
> behavior, but may cause data corruption.
> To prevent this, you can SCSI-reserve a disk on one system. If the other
> system tries to access this device, it should get an i/o error due to
> the reservation.
> This is a common technique used in
> - Traditional Tru64 Unix ASE clustering
> - Tru64 Unix V5 Clustering to accomplish i/o barriers
> - Windows-NT Clusters
> - Steel-eye clustering
> The reservation can be done using a standard tool like scu
>
> scu -f /dev/sdb
> scu > reserve device
>
> On Linux, this works fine under Kernel version 2.2.16.
> Below is the code that accomplishes this
> /usr/src/linux/drivers/scsi/scsi_obsolete.c in routine scsi_old_done
>         case RESERVATION_CONFLICT:
>             printk("scsi%d (%d,%d,%d) : RESERVATION CONFLICT\n",
>                    SCpnt->host->host_no, SCpnt->channel,
>                    SCpnt->device->id, SCpnt->device->lun);
>             status = CMD_FINISHED; /* returns I/O error */
>             break;
>         default:
> As of kernel version 2.2.18, this code has changed. If a scsi reserve
> error occurs, the device driver does a scsi reset. This way the scsi
> reservation is gone, and the device can be accessed.
> /usr/src/linux/drivers/scsi/scsi_obsolete.c in routine scsi_old_done
>         case RESERVATION_CONFLICT:
>             printk("scsi%d, channel %d : RESERVATION CONFLICT performing"
>                    " reset.\n", SCpnt->host->host_no, SCpnt->channel);
>             scsi_reset(SCpnt, SCSI_RESET_SYNCHRONOUS);
>             status = REDO;
>             break;
>
> Fix : delete the scsi reset in the kernel code
>         case RESERVATION_CONFLICT:
> /* Deleted Chris Roets
>             printk("scsi%d, channel %d : RESERVATION CONFLICT performing"
>                    " reset.\n", SCpnt->host->host_no, SCpnt->channel);
>             scsi_reset(SCpnt, SCSI_RESET_SYNCHRONOUS);
>             status = REDO;
>    next four lines added */
>             printk("scsi%d (%d,%d,%d) : RESERVATION CONFLICT\n",
>                    SCpnt->host->host_no, SCpnt->channel,
>                    SCpnt->device->id, SCpnt->device->lun);
>             status = CMD_FINISHED; /* returns I/O error */
>             break;
>
> and rebuild the kernel.
>
> This should allow the customer to continue.
> Questions :
> - why is this scsi reset done/added as of kernel version 2.2.18
> - as we are talking about an obsolete routine, how is this accomplished
>   in the new code and how is it activated.
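The two behaviours contrasted in the message above can be sketched outside the kernel. This is only a user-space illustration, not the code from scsi_obsolete.c: the enum names, the policy switch and handle_status() are hypothetical, and only the raw status byte values (0x00, 0x02, 0x18) come from the SCSI-2 standard.

```c
#include <assert.h>

/* Raw SCSI status byte values as defined by the SCSI-2 standard. */
#define STATUS_GOOD                  0x00
#define STATUS_CHECK_CONDITION       0x02
#define STATUS_RESERVATION_CONFLICT  0x18

/* Hypothetical outcomes; the names only echo CMD_FINISHED and REDO. */
enum outcome {
    OUTCOME_DONE,         /* command completed normally */
    OUTCOME_IO_ERROR,     /* finish the command and report an I/O error */
    OUTCOME_RESET_RETRY,  /* reset the bus, then reissue the command */
    OUTCOME_NEEDS_SENSE   /* caller has to examine sense data */
};

/* Hypothetical switch between the two behaviours quoted in the thread. */
enum conflict_policy {
    POLICY_FAIL_IO,       /* the behaviour the proposed fix restores */
    POLICY_RESET_BUS      /* the behaviour quoted from 2.2.18 */
};

/* Decide what to do with a completed command, given its status byte. */
enum outcome handle_status(unsigned char status, enum conflict_policy policy)
{
    assert(policy == POLICY_FAIL_IO || policy == POLICY_RESET_BUS);

    switch (status) {
    case STATUS_GOOD:
        return OUTCOME_DONE;
    case STATUS_RESERVATION_CONFLICT:
        /* Failing the I/O leaves the other initiator's reservation
         * intact; resetting the bus clears it and defeats the barrier. */
        return policy == POLICY_FAIL_IO ? OUTCOME_IO_ERROR
                                        : OUTCOME_RESET_RETRY;
    default:
        return OUTCOME_NEEDS_SENSE;
    }
}
```

With POLICY_FAIL_IO the second node simply sees an I/O error, which is what a clustering setup relies on; with POLICY_RESET_BUS the reservation is wiped out and the retried command succeeds.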
Re: race condition in scsi_unregister_host
On Tuesday, 1. May 2001 02:23, Alan Cox wrote:
> > scsi_unregister_host first checks the use count and then marks the
> > device offline. The order is wrong. By the time the device goes
> > offline, it might have been opened again.
>
> That should be right in -ac but you might want to double check

In the -ac series the big kernel lock is taken, but the order is not
reversed.

Regards
Oliver
module use count corruption in scsi.c/scsi_register_host
Hi,

scsi_register_host() can, in the error case, _decrement_ the module usage
counter of the scsi core module, and it has a race with module unload
during bus scan.

    if (tpnt->present) {
        if (pcount == next_scsi_host) {
            if (tpnt->present > 1) {
                printk(KERN_ERR "scsi: Failure to register low-level scsi driver");
                scsi_unregister_host(tpnt);
                return 1;

This code calls scsi_unregister_host, which decrements the module use
count, without having incremented it beforehand. In addition, it
potentially allocates memory, starts and waits for the error handling
thread, and does io to scan the bus, all without having incremented the
module usage counter.

Doing the increment earlier should fix it. The attached patch does that.
Unless I am wrong, it should go in quickly.

Regards
Oliver

--- drivers/scsi/scsi.c.alt Wed May 2 06:39:16 2001
+++ drivers/scsi/scsi.c Wed May 2 06:46:21 2001
@@ -1826,6 +1826,8 @@
        using the new scsi code. NOTE: the detect routine could
        redefine the value tpnt->use_new_eh_code. (DB, 13 May 1998) */
 
+MOD_INC_USE_COUNT;
+
     if (tpnt->use_new_eh_code) {
         spin_lock_irqsave(&io_request_lock, flags);
         tpnt->present = tpnt->detect(tpnt);
@@ -1947,8 +1949,6 @@
            (scsi_init_memory_start - scsi_memory_lower_value) / 1024,
            (scsi_memory_upper_value - scsi_init_memory_start) / 1024);
 #endif
-
-    MOD_INC_USE_COUNT;
 
     if (out_of_space) {
         scsi_unregister_host(tpnt); /* easiest way to clean up?? */
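The ordering problem described above can be reproduced with a plain counter. The following is a minimal user-space sketch, not kernel code: struct module_sim and all three functions are hypothetical stand-ins for the module use count, scsi_unregister_host() and the two orderings of scsi_register_host().

```c
#include <assert.h>

/* Hypothetical stand-in for the scsi core module's use count. */
struct module_sim {
    int use_count;
};

/* Stand-in for scsi_unregister_host(): always drops one reference. */
static void unregister_host_sim(struct module_sim *m)
{
    m->use_count--;
}

/* Ordering before the patch: the increment happens only at the end,
 * so the error path drops a reference that was never taken. */
int register_host_late_inc(struct module_sim *m, int detect_fails)
{
    assert(m != 0);
    if (detect_fails) {
        unregister_host_sim(m);
        return 1;
    }
    m->use_count++;
    return 0;
}

/* Ordering after the patch: take the reference first, so the error
 * path's decrement is balanced and the count also covers the scan. */
int register_host_early_inc(struct module_sim *m, int detect_fails)
{
    assert(m != 0);
    m->use_count++;
    if (detect_fails) {
        unregister_host_sim(m);
        return 1;
    }
    return 0;
}
```

Starting from a count of zero, a failed registration leaves the late-increment variant at -1 and the early-increment variant at 0; a successful registration leaves both at 1.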
Re: Linux Cluster using shared scsi
> reserved. But if you did such a hot swap you would have "bigger
> fish to fry" in a HA application... I mean, none of your data would be
> there!

You need to realise this has happened and do the right thing. Since it
could be an md raid array the hotswap is not fatal. If it's fatal you need
to realise promptly, both before you damage the contents of the disk
inserted in error (if possible) and so the HA system can take
countermeasures.

> if the kernel (by this I mean the scsi midlayer) was maintaining
> reservations, that there would be some logic activated to "handle"
> this problem, whether it be re-reserving the device, or the ability to

Suppose the cluster nodes don't agree on the reservation table?

> Bus resets in the Linux drivers also tend to happen frequently when a
> disk is failing, which has tended to leave the system in a somewhat
> functional but often an unusable state, (but that's a different story...)

The new scsi EH code in 2.4 for the drivers that use it is a lot better.
Real problem.
Re: Linux Cluster using shared scsi
[EMAIL PROTECTED] said:
> Does this package also tell the kernel to "re-establish" a reservation
> for all devices after a bus reset, or at least inform a user level
> program? Finding out when there has been a bus reset has been a
> stumbling block for me.

[EMAIL PROTECTED] said:
> You cannot rely on a bus reset. Imagine hot swap disks on an FC
> fabric. I suspect the controller itself needs to call back for
> problem events

Essentially, there are many conditions which cause a quiet loss of a SCSI-2
reservation. Even in parallel SCSI: Reservations can be silently lost
because of LUN reset, device reset or even simple powering off the device.

The way we maintain reservations for LifeKeeper is to have a user level
daemon ping the device with a reservation command every few minutes. If you
get a RESERVATION_CONFLICT return you know that something else stole your
reservation, otherwise you maintain it. There is a window in this scheme
where the device may be accessible by other initiators but that's the price
you pay for using SCSI-2 reservations instead of the more cluster friendly
SCSI-3 ones.

In a kernel scheme, you may get early notification of reservation loss by
putting a hook into the processing of CHECK_CONDITION/UNIT_ATTENTION, but
it won't close the window entirely.

James
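One round of the user-level "ping" described above might look like the sketch below. It is not LifeKeeper code. It assumes the Linux SG_IO ioctl from <scsi/sg.h>, which is an assumption of this example and not something the thread mentions, and the helper names build_reserve6() and ping_reservation() are hypothetical. The RESERVE(6) opcode (0x16) and the RESERVATION CONFLICT status (0x18) are taken from the SCSI-2 standard.

```c
#include <assert.h>
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

#define RESERVE_6_OPCODE             0x16
#define STATUS_GOOD                  0x00
#define STATUS_RESERVATION_CONFLICT  0x18

/* Hypothetical result of one keepalive round. */
enum ping_result {
    PING_HELD,    /* RESERVE succeeded: we hold (or re-took) the reservation */
    PING_STOLEN,  /* RESERVATION CONFLICT: another initiator holds it */
    PING_FAILED   /* ioctl, transport or other error: state unknown */
};

/* Fill a 6-byte RESERVE(6) CDB: opcode 0x16, remaining bytes zero. */
void build_reserve6(unsigned char cdb[6])
{
    assert(cdb != 0);
    memset(cdb, 0, 6);
    cdb[0] = RESERVE_6_OPCODE;
}

/* Issue RESERVE(6) on an already opened SCSI device and classify the reply. */
enum ping_result ping_reservation(int fd)
{
    unsigned char cdb[6];
    unsigned char sense[32];
    sg_io_hdr_t hdr;

    build_reserve6(cdb);
    memset(&hdr, 0, sizeof(hdr));
    hdr.interface_id = 'S';              /* required by the sg interface */
    hdr.dxfer_direction = SG_DXFER_NONE; /* RESERVE moves no data */
    hdr.cmd_len = sizeof(cdb);
    hdr.cmdp = cdb;
    hdr.mx_sb_len = sizeof(sense);
    hdr.sbp = sense;
    hdr.timeout = 10000;                 /* milliseconds, arbitrary choice */

    if (ioctl(fd, SG_IO, &hdr) < 0)
        return PING_FAILED;
    if (hdr.status == STATUS_RESERVATION_CONFLICT)
        return PING_STOLEN;
    if (hdr.status == STATUS_GOOD &&
        hdr.host_status == 0 && hdr.driver_status == 0)
        return PING_HELD;
    return PING_FAILED;
}
```

A daemon would call ping_reservation() on a timer, every few minutes in the scheme described above, and start failover handling when it sees PING_STOLEN. Between two rounds the window mentioned in the message remains open: if the reservation was silently cleared, the next RESERVE quietly re-takes it and the daemon never learns it was lost.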
Re: Linux Cluster using shared scsi
[EMAIL PROTECTED] said:
> So, will Linux ever support the scsi reservation mechanism as standard?

That's not within my gift. I can merely write the code that corrects the
behaviour. I can't force anyone else to accept it.

[EMAIL PROTECTED] said:
> Isn't there a standard that says if you scsi reserve a disk, no one
> else should be able to access this disk, or is this a "steeleye/
> Compaq" standard.

Use of reservations is laid out in the SCSI-2 and SCSI-3 standards (which
can be downloaded from the T10 site www.t10.org) which are international in
scope. I think the implementation issues come because the reservations part
is really only relevant to a multi-initiator clustered environment which
isn't an every day configuration for most Linux users. Obviously, as Linux
moves into the SAN arena this type of configuration will become a lot more
common, at which time the various problems associated with multiple
initiators should rise in prominence.

James
killing a process writing to a scsi drive freezes a system
Hi List,

I tried to run dd to a bunch of scsi drives in the background, e.g.

dd if=/dev/zero of=/dev/sdb bs=1024 &
dd if=/dev/zero of=/dev/sdc bs=1024 &
dd if=/dev/zero of=/dev/sdd bs=1024 &

Then I tried to kill the first dd process, the one writing to /dev/sdb,
and it froze the system. When I looked at the drives, I could see that
every second the lights on all the drives (not just sdb) flash. This goes
on for quite a long time.

It looks like after I issued the kill, the system is trying to flush its
buffers. I am running this on an ia64 system which has 4GB of memory, and
it takes a lot of time to flush the buffers because of the 1 second delay
between each bunch of writes.

Can anybody tell me what is going on here? Is this a bug in the block
device driver? If that is the case, is it fixed in later versions of the
kernel? I am running a 2.4.2 kernel.

I can switch between the vt's, but I cannot log in from other vt's until
this flushing is done. Also, the current vt (where I fired all the dd
commands) and the other vt's where I had already logged in are completely
frozen, and I cannot do anything after I issue the kill command.

Regards,
-hiren
(408)970-3062
[EMAIL PROTECTED]
Re: Linux Cluster using shared scsi
Doug Ledford writes:
(James Bottomley commented about the need for SCSI reservation kernel
patches)
>
> I agree. It's something that needs fixed in general, your software needs
> it as well, and I've written (about 80% done at this point) some open
> source software geared towards getting/holding reservations that also
> requires the same kernel patches (plus one more to be fully functional,
> an ioctl to allow a SCSI reservation to do a forced reboot of a machine).
> I'll be releasing that package in the short term (once I get back from my
> vacation anyway).
>

Hello Doug,

Does this package also tell the kernel to "re-establish" a reservation for
all devices after a bus reset, or at least inform a user level program?
Finding out when there has been a bus reset has been a stumbling block for
me.

-Eric.
--
Eric Z. Ayers                   Lead Software Engineer
Phone: +1 404-705-2864          Computer Generation, Incorporated
Fax:   +1 404-705-2805          an Intec Telecom Systems Company
Web:   http://www.intec-telecom-systems.com/
Email: [EMAIL PROTECTED]
Postal: Bldg G 4th Floor, 5775 Peachtree-Dunwoody Rd, Atlanta, GA 30342 USA
Re: Linux Cluster using shared scsi
James Bottomley wrote:
>
> [EMAIL PROTECTED] said:
> > So, will Linux ever support the scsi reservation mechanism as standard?
>
> That's not within my gift. I can merely write the code that corrects the
> behaviour. I can't force anyone else to accept it.

I think it will be standard before not too much longer (I hope anyway, I'm
tired of carrying the patches forward all the time so I'll lend my support
to getting it into the mainstream kernel ;-)

> [EMAIL PROTECTED] said:
> > Isn't there a standard that says if you scsi reserve a disk, no one
> > else should be able to access this disk, or is this a "steeleye/
> > Compaq" standard.
>
> Use of reservations is laid out in the SCSI-2 and SCSI-3 standards (which
> can be downloaded from the T10 site www.t10.org) which are international
> in scope. I think the implementation issues come because the reservations
> part is really only relevant to a multi-initiator clustered environment
> which isn't an every day configuration for most Linux users. Obviously,
> as Linux moves into the SAN arena this type of configuration will become
> a lot more common, at which time the various problems associated with
> multiple initiators should rise in prominence.

I agree. It's something that needs fixed in general, your software needs it
as well, and I've written (about 80% done at this point) some open source
software geared towards getting/holding reservations that also requires the
same kernel patches (plus one more to be fully functional, an ioctl to
allow a SCSI reservation to do a forced reboot of a machine). I'll be
releasing that package in the short term (once I get back from my vacation
anyway).

--
Doug Ledford <[EMAIL PROTECTED]> http://people.redhat.com/dledford
Please check my web site for aic7xxx updates/answers before
e-mailing me about problems
Re: Linux Cluster using shared scsi
> Does this package also tell the kernel to "re-establish" a
> reservation for all devices after a bus reset, or at least inform a
> user level program? Finding out when there has been a bus reset has
> been a stumbling block for me.

You cannot rely on a bus reset. Imagine hot swap disks on an FC fabric. I
suspect the controller itself needs to call back for problem events
Re: Linux Cluster using shared scsi
Alan Cox writes:
> > Does this package also tell the kernel to "re-establish" a
> > reservation for all devices after a bus reset, or at least inform a
> > user level program? Finding out when there has been a bus reset has
> > been a stumbling block for me.
>
> You cannot rely on a bus reset. Imagine hot swap disks on an FC fabric. I
> suspect the controller itself needs to call back for problem events
>

I'm not a SCSI expert by any stretch of the imagination.

I think that what you are saying is that you cannot rely on a bus reset
being the only thing that will remove a reservation. For example, if a
device is 'hot replaced', the device will (clearly) no longer be reserved.
But if you did such a hot swap you would have "bigger fish to fry" in a HA
application... I mean, none of your data would be there!

My understanding is that specifically, when a bus reset occurs, all SCSI
reservations for devices on that bus are lost. I was hoping that if the
kernel (by this I mean the scsi midlayer) was maintaining reservations,
there would be some logic activated to "handle" this problem, whether it be
re-reserving the device, or the ability to pass notification of a reset (or
another problem event as you point out) up to the application that's
handling reservations.

In my experience, the most common reason for a bus reset in parallel SCSI
is that a peer host on the bus is rebooting. Since this happens under
normal operation and well in advance of any attempt to access the device,
it would be nice if there were some sort of asynchronous notification
instead of a polling process with an interval of 2-3 minutes, where it's
conceivable that the peer system could have booted and attempted to take
over the disk out from under a running system.

Bus resets in the Linux drivers also tend to happen frequently when a disk
is failing, which has tended to leave the system in a somewhat functional
but often an unusable state, (but that's a different story...)

James Bottomley <[EMAIL PROTECTED]> writes:
> Essentially, there are many conditions which cause a quiet loss of a
> SCSI-2 reservation. Even in parallel SCSI: Reservations can be silently
> lost because of LUN reset, device reset or even simple powering off the
> device.
...

James mentions that even handling a bus reset still leaves a window where a
peer could grab the reservation out from underneath an unsuspecting host.
I agree that this could happen, and the old host might perform writes to an
'unreserved' disk, but once the second system succeeded in obtaining the
reservation, any read/write commands from the "old" host would return SCSI
errors (this is my layman's understanding - the commands would return a
UNIT_RESERVED error), so I believe you would have the desired behavior in
this kind of cluster - only one machine in the cluster can access the disk
at the same time. The data on the disk should be in a state where the
second system in the cluster could start a recovery task and begin to
provide the service hosted on the disk.

-Eric.
--
Eric Z. Ayers                   Lead Software Engineer
Phone: +1 404-705-2864          Computer Generation, Incorporated
Fax:   +1 404-705-2805          an Intec Telecom Systems Company
Web:   http://www.intec-telecom-systems.com/
Email: [EMAIL PROTECTED]
Postal: Bldg G 4th Floor, 5775 Peachtree-Dunwoody Rd, Atlanta, GA 30342 USA