Re: Using pam_ssh with gdm
Volker Stolz wrote:
> On 13 Oct 2003 at 16:56 CEST, Joe Kelsey wrote:
> > first try, logging the following to syslog:
> >
> > Oct 13 07:24:30 zircon gdm[186]: Couldn't open session for joek
> >
> > Then, gdm resets and I reenter the password and passphrase. The
> > second time, I get in. Apparently, ssh-agent has now started, but
> > pam_ssh did not pass along any authentication information, so I have
> > to call ssh-add by hand to actually enter the key information. This
> > means that every time I log in, I have to type my password twice and
> > my passphrase three times.
>
> The first thing you're probably experiencing is this:
> http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/45669
>
> Description
>
> The pam_ssh module uses popen() to start an ssh-agent for the user
> during PAM authentication. However, pclose() causes the PAM module to
> return an error if somebody else has already called waitpid(-1,...),
> because pclose() then returns -1 and errno is set to ECHILD (observed
> with gdm, which uses a whole bunch of processes).

That fits exactly! I stumbled on a gdm error message in the logs about
ssh-agent and child processes. I run 4-STABLE; your PR relates to
5-CURRENT. Has anyone done anything about fixing this in 4-STABLE?

Also, switching to using only my ssh passphrase doesn't tickle the
ssh-agent child process bug.

Also, why doesn't pam_ssh export my identities into ssh-agent? I still
have to do a separate ssh-add to load the keys into ssh-agent. The
pam_ssh man page still says that it does this, but obviously it doesn't.

/Joe
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
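
The failure mode described in the PR text quoted above can be reproduced in miniature. The following is only a Python stand-in for the C-level behaviour, not pam_ssh's code and not the fix from the PR: a child process is reaped by a catch-all waitpid(-1, ...) before its real owner waits for it, so the owner's wait fails with ECHILD. The helper names (reap_elsewhere, wait_for_child, demo) are made up for illustration, and treating ECHILD as "already reaped" is one possible way of tolerating the race, assumed here rather than taken from the thread.

```python
import os


def reap_elsewhere(pid):
    """Stand-in for another part of the program calling waitpid(-1, ...)."""
    while True:
        reaped, _ = os.waitpid(-1, 0)
        if reaped == pid:
            return


def wait_for_child(pid):
    """Wait for pid, treating ECHILD as 'already reaped' rather than failure.

    Returns the raw wait status, or None if the child was already gone.
    """
    try:
        _, status = os.waitpid(pid, 0)
        return status
    except ChildProcessError:
        # errno is ECHILD: someone else collected the exit status first.
        return None


def demo():
    pid = os.fork()
    if pid == 0:
        os._exit(0)        # child: exit immediately, like a short-lived helper
    reap_elsewhere(pid)    # the catch-all waiter wins the race
    return wait_for_child(pid)


if __name__ == "__main__":
    print("status seen by the owner:", demo())
```

Without the except clause, the second os.waitpid() call would raise, which is the analogue of pclose() returning -1 with errno set to ECHILD.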
Re: Firewire on STABLE: Sane for drive-based backups? (SOLVED operator error)
On Mon, 13 Oct 2003 02:34:01 -0500
Stephen Hilton <[EMAIL PROTECTED]> wrote:

> On Sat, 11 Oct 2003 18:11:54 -0700 (PDT)
> Jason Fesler <[EMAIL PROTECTED]> wrote:
>
> > I did not have much luck on digging through the archives. Does anyone have
> > any success stories on using external firewire drives on the stable branch
> > of freebsd? Does hot swap work? Can I mount, dd or ufsdump or
> > newfs/rsync, then umount and unplug it cleanly?
> >
> > I'm considering my options for doing once-a-month backups, and tape just
> > totally blows the budget. I'm currently using a second drive to produce
> > snapshots, but that doesn't leave me with any off-site backups without
> > taking the system down to swap drives.
>
> I have been using a Buslink 1394 Firewire HD model #FW80 72E on 2
> 4-STABLE systems for backup. The Firewire ports are onboard on an
> ASUS P4PE and an ASUS P3B-1394 motherboard.
>
> Mounting and unmounting work fine, but I do have trouble with hot
> swapping.
>
> On the ASUS P3B-1394 system
>
> FreeBSD 4.9-RC cvsup'd and built/installed today
>
> Snips from my dmesg.boot
> ---
> fwohci0: mem
> 0xcb00-0xcb003fff,0xcb80-0xcb8007ff irq 11 at device 6.0 on pci0
> fwohci0: OHCI version 1.0 (ROM=1)
> fwohci0: No. of Isochronous channel is 4.
> fwohci0: EUI64 00:e0:18:00:00:00:16:bd
> fwohci0: Phy 1394a available S400, 3 ports.
> fwohci0: Link S400, max_rec 2048 bytes.
> firewire0: on fwohci0
> sbp0: on firewire0
> fwohci0: Initiate bus reset
> fwohci0: BUS reset
> fwohci0: node_id=0xc800ffc1, gen=1, CYCLEMASTER mode
> firewire0: 2 nodes, maxhop <= 1, cable IRM = 1 (me)
> firewire0: bus manager 1 (me)
>
> sym0: <895> port 0xb000-0xb0ff mem 0xc980-0xc9800fff,0xca00-0xcaff
> irq 11 at device 10.0 on pci0
> sym0: Tekram NVRAM, ID 7, Fast-40, LVD, parity checking
> firewire0: New S400 device ID:0030e001e000177c
> Mounting root from ufs:/dev/da0s2a
> da2 at sbp0 bus 0 target 0 lun 0
> da2: Fixed Simplified Direct Access SCSI-4 device
> da2: 50.000MB/s transfers, Tagged Queueing Enabled
> da2: 76319MB (156301488 512 byte sectors: 255H 63S/T 9729C)
> da0 at sym0 bus 0 target 0 lun 0
> da0: Fixed Direct Access SCSI-3 device
> da0: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
> da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
> da1 at sym0 bus 0 target 1 lun 0
> da1: Fixed Direct Access SCSI-2 device
> da1: 80.000MB/s transfers (40.000MHz, offset 15, 16bit), Tagged Queueing Enabled
> da1: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)
> ---
>
> SCSI stuff from my kernel config file
> ---
> # using SCSI-IDE atapicam emulation for DVD/CDRW access.
> device atapicam    # emulate ATAPI devices as SCSI ditto via CAM
>                    # needs CAM to be present (scbus & pass)
>
> # SCSI Controllers
> device sym0        # NCR/Symbios Logic (newer chipsets)
> device scbus0 at sym0
> device da0 at scbus0 target 0
> device da1 at scbus0 target 1
> options SYM_SETUP_SCSI_DIFF    #-HVD support for 825a, 875, 885
> options SCSI_DELAY=3000        #Delay (in ms) before probing SCSI
>                    # disabled:0 (default), enabled:1
> # SCSI peripherals
> device scbus       # SCSI bus (required)
> device da          # Direct Access (disks)
> device sa          # Sequential Access (tape etc)
> device cd          # CD
> device pass        # Passthrough device (direct SCSI access)
>
> # Firewire support
> device firewire    # Firewire bus code
> device sbp         # SCSI over Firewire (Requires scbus and da)
> # device fwe       # Ethernet over Firewire (non-standard!)
> ---
>
> On bootup with my firewire drive plugged in everything is fine.
>
> Unplugging the firewire cable gives this in my console:
> fwohci0: BUS reset
> fwohci0: node_id=0xc800ffc0, gen=2, CYCLEMASTER mode
> firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
> firewire0: bus manager 0 (me)
>
> plugging the firewire cable back in gives this:
> fwohci0: BUS reset
> fwohci0: node_id=0xc800ffc1, gen=3, CYCLEMASTER mode
> fwohci0: SID Error
>
> At this point I cannot mount the firewire drive anymore.
>
> I have tried using 'fwcontrol -r' but no luck.
>
> Any help or pointers appreciated.

This was my operator error. The correct sequence for hot swapping this
firewire hard drive on my motherboard is this (with the firewire and
sbp devices compiled into my kernel):

...
plug the firewire drive cable in
run 'fwcontrol -r'
mount the firewire drive
do backups
unmount the fi
Re: Firewire on STABLE: Sane for drive-based backups?
On Mon, 13 Oct 2003 02:34:01 -0500
Stephen Hilton <[EMAIL PROTECTED]> wrote:

> On Sat, 11 Oct 2003 18:11:54 -0700 (PDT)
> Jason Fesler <[EMAIL PROTECTED]> wrote:
>
> > I did not have much luck on digging through the archives. Does anyone have
> > any success stories on using external firewire drives on the stable branch
> > of freebsd? Does hot swap work? Can I mount, dd or ufsdump or
> > newfs/rsync, then umount and unplug it cleanly?
> >
> > I'm considering my options for doing once-a-month backups, and tape just
> > totally blows the budget. I'm currently using a second drive to produce
> > snapshots, but that doesn't leave me with any off-site backups without
> > taking the system down to swap drives.
>
> I have been using a Buslink 1394 Firewire HD model #FW80 72E on 2
> 4-STABLE systems for backup. The Firewire ports are onboard on an
> ASUS P4PE and an ASUS P3B-1394 motherboard.
>
> Mounting and unmounting work fine, but I do have trouble with hot
> swapping.
>
> On the ASUS P3B-1394 system
>
> FreeBSD 4.9-RC cvsup'd and built/installed today
>
> Snips from my dmesg.boot
> ---
> fwohci0: mem
> 0xcb00-0xcb003fff,0xcb80-0xcb8007ff irq 11 at device 6.0 on pci0
> fwohci0: OHCI version 1.0 (ROM=1)
> fwohci0: No. of Isochronous channel is 4.
> fwohci0: EUI64 00:e0:18:00:00:00:16:bd
> fwohci0: Phy 1394a available S400, 3 ports.
> fwohci0: Link S400, max_rec 2048 bytes.
> firewire0: on fwohci0
> sbp0: on firewire0
> fwohci0: Initiate bus reset
> fwohci0: BUS reset
> fwohci0: node_id=0xc800ffc1, gen=1, CYCLEMASTER mode
> firewire0: 2 nodes, maxhop <= 1, cable IRM = 1 (me)
> firewire0: bus manager 1 (me)
>
> sym0: <895> port 0xb000-0xb0ff mem 0xc980-0xc9800fff,0xca00-0xcaff
> irq 11 at device 10.0 on pci0
> sym0: Tekram NVRAM, ID 7, Fast-40, LVD, parity checking
> firewire0: New S400 device ID:0030e001e000177c
> Mounting root from ufs:/dev/da0s2a
> da2 at sbp0 bus 0 target 0 lun 0
> da2: Fixed Simplified Direct Access SCSI-4 device
> da2: 50.000MB/s transfers, Tagged Queueing Enabled
> da2: 76319MB (156301488 512 byte sectors: 255H 63S/T 9729C)
> da0 at sym0 bus 0 target 0 lun 0
> da0: Fixed Direct Access SCSI-3 device
> da0: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
> da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
> da1 at sym0 bus 0 target 1 lun 0
> da1: Fixed Direct Access SCSI-2 device
> da1: 80.000MB/s transfers (40.000MHz, offset 15, 16bit), Tagged Queueing Enabled
> da1: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)
> ---
>
> SCSI stuff from my kernel config file
> ---
> # using SCSI-IDE atapicam emulation for DVD/CDRW access.
> device atapicam    # emulate ATAPI devices as SCSI ditto via CAM
>                    # needs CAM to be present (scbus & pass)
>
> # SCSI Controllers
> device sym0        # NCR/Symbios Logic (newer chipsets)
> device scbus0 at sym0
> device da0 at scbus0 target 0
> device da1 at scbus0 target 1
> options SYM_SETUP_SCSI_DIFF    #-HVD support for 825a, 875, 885
> options SCSI_DELAY=3000        #Delay (in ms) before probing SCSI
>                    # disabled:0 (default), enabled:1
> # SCSI peripherals
> device scbus       # SCSI bus (required)
> device da          # Direct Access (disks)
> device sa          # Sequential Access (tape etc)
> device cd          # CD
> device pass        # Passthrough device (direct SCSI access)
>
> # Firewire support
> device firewire    # Firewire bus code
> device sbp         # SCSI over Firewire (Requires scbus and da)
> # device fwe       # Ethernet over Firewire (non-standard!)
> ---
>
> On bootup with my firewire drive plugged in everything is fine.
>
> Unplugging the firewire cable gives this in my console:
> fwohci0: BUS reset
> fwohci0: node_id=0xc800ffc0, gen=2, CYCLEMASTER mode
> firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
> firewire0: bus manager 0 (me)
>
> plugging the firewire cable back in gives this:
> fwohci0: BUS reset
> fwohci0: node_id=0xc800ffc1, gen=3, CYCLEMASTER mode
> fwohci0: SID Error
>
> At this point I cannot mount the firewire drive anymore.
>
> I have tried using 'fwcontrol -r' but no luck.

This was my operator error. The correct sequence for hot swapping this
firewire hard drive on my motherboard is this (with the firewire and
sbp devices compiled into my kernel):

...
plug the firewire drive cable in
run 'fwcontrol -r'
mount the firewire drive
do backups
unmount the firewire drive
unplug the firewire drive c
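
For anyone wanting to script the software part of that sequence, here is a rough Python sketch that only builds and prints the commands between plugging in and unplugging; it does not run them. It covers just the steps visible in the message above. The device node /dev/da2s1 and the mount point /mnt/backup are assumptions for illustration (dmesg shows the drive as da2, but the slice and mount point are not given in the thread), and the function names are hypothetical.

```python
def hot_swap_plan(device, mountpoint):
    """Return the commands for one backup cycle, in order, as argv lists.

    The cable is assumed to be plugged in before the first command and
    unplugged only after the last one.
    """
    return [
        ["fwcontrol", "-r"],            # bus reset after plugging the cable in
        ["mount", device, mountpoint],  # mount the firewire drive
        # ... the backup itself would run here ...
        ["umount", mountpoint],         # unmount before unplugging
    ]


def describe(plan):
    """Render each argv list as a single shell-style line."""
    return [" ".join(argv) for argv in plan]


if __name__ == "__main__":
    # Both arguments are hypothetical; adjust to the real slice and mount point.
    for line in describe(hot_swap_plan("/dev/da2s1", "/mnt/backup")):
        print(line)
```

Keeping the plan as data makes it easy to review the order before handing each entry to something like subprocess.run().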
Re: ATA failure with 4.6.2 & 250GB drive?
> Date: Tue, 14 Oct 2003 09:55:54 +0100
> From: Scott Mitchell <[EMAIL PROTECTED]>
> Sender: [EMAIL PROTECTED]
>
> On Mon, Oct 13, 2003 at 10:09:10AM +0100, Scott Mitchell wrote:
> > Hi all,
> >
> > Just installed a Maxtor 250GB PATA drive in one of our servers, to be used
> > as a backup staging area. This was actually a replacement for an identical
> > drive that appeared to have died after a month of service.
> >
> > Anyway, 2 days after this drive was installed I started seeing this in the
> > daily logs:
> >
> > > ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 (ad1s1 bn
> > > 850845887; cn 52962 tn 180 sn 17) trying PIO mode
> > > ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 (ad1s1 bn
> > > 850845887; cn 52962 tn 180 sn 17) status=59 error=40
> > > ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 (ad1s1 bn
> > > 850845887; cn 52962 tn 180 sn 17) status=59 error=40
> > > ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 (ad1s1 bn
> > > 850845887; cn 52962 tn 180 sn 17) status=59 error=40
> > ...
>
> OK, swapped out the cable (from an 80- to 40-wire one, as it happened,
> although that should make no difference on a UDMA33 controller). Same
> errors appeared again while the backups were running.
>
> Some more information on how this drive is being used - we're dumping two
> vinum RAID5 volumes onto it, one local and one remote, writing to the
> backup disk over NFS. Both dumps kick off at 0300, with the remote one
> finishing at 0305 last night. The first ATA error appeared in the logs at
> 0325, while the local backup was still running. The last error was logged
> at 0355, but the backup itself didn't finish until nearly 0500.
>
> Anyone have any more ideas on how to diagnose this? It does occur to me
> that the daily periodic run also kicks off at 0301 but that is usually all
> done before 0330.

It's a real drive problem, but possibly not a terminal one.
(I had the same issue on one of my drives a few months ago and it's fine
now.)

The problem is that the system is getting an error trying to read this
area of the disk. It's an unmapped bunch of bad blocks. The system gets
an unrecoverable error trying to read these blocks and that is what you
see reported. Since it can't read "good" data, it does not relocate the
bad data, but just leaves it there and reports errors every time it
tries to read the data.

First, any files containing data stored in these blocks are probably
toast. Or, at least garbled. Sorry.

The fix/workaround is to move the file(s) involved so that the damaged
blocks are marked free and relocated to spare space on the drive. You
can try to figure out just which file(s) use those blocks. There might
even be a reasonable way to do this...I just don't know what it is.

Another "fix" is to simply copy the drive onto another and then copy it
back. dd(1) will do the trick, as will dump/restore. (I'd suggest the
dump/restore to copy the data out and dd to copy it back if the disks
have identical geometries.) Once the data is restored to the original
disk, the bad blocks will have been re-directed by the drive and will no
longer trouble you.

Modern disks are pretty smart at error recovery, but some failures are
too sudden for the drive to be able to deal with them without losing
data.
--
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED] Phone: +1 510 486-8634
Re: ATA failure with 4.6.2 & 250GB drive?
Kevin Oberman wrote:

>> Date: Tue, 14 Oct 2003 09:55:54 +0100
>> From: Scott Mitchell <[EMAIL PROTECTED]>
>> Sender: [EMAIL PROTECTED]
>>
>> On Mon, Oct 13, 2003 at 10:09:10AM +0100, Scott Mitchell wrote:
>>
>>> Hi all,
>>>
>>> Just installed a Maxtor 250GB PATA drive in one of our servers, to
>>> be used as a backup staging area. This was actually a replacement
>>> for an identical drive that appeared to have died after a month of
>>> service.
>>>
>>> Anyway, 2 days after this drive was installed I started seeing this
>>> in the daily logs:
>>>
>>>> ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943
>>>> (ad1s1 bn 850845887; cn 52962 tn 180 sn 17) trying PIO mode
>>>> ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943
>>>> (ad1s1 bn 850845887; cn 52962 tn 180 sn 17) status=59 error=40
>>>> ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943
>>>> (ad1s1 bn 850845887; cn 52962 tn 180 sn 17) status=59 error=40
>>>> ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943
>>>> (ad1s1 bn 850845887; cn 52962 tn 180 sn 17) status=59 error=40
>>>
>>> ...
>>
>> OK, swapped out the cable (from an 80- to 40-wire one, as it happened,
>> although that should make no difference on a UDMA33 controller). Same
>> errors appeared again while the backups were running.
>>
>> Some more information on how this drive is being used - we're dumping
>> two vinum RAID5 volumes onto it, one local and one remote, writing to
>> the backup disk over NFS. Both dumps kick off at 0300, with the remote
>> one finishing at 0305 last night. The first ATA error appeared in the
>> logs at 0325, while the local backup was still running. The last error
>> was logged at 0355, but the backup itself didn't finish until nearly
>> 0500.
>>
>> Anyone have any more ideas on how to diagnose this? It does occur to me
>> that the daily periodic run also kicks off at 0301 but that is usually
>> all done before 0330.
>
> It's a real drive problem, but possibly not a terminal one. (I had the
> same issue on one of my drives a few months ago and it's fine now.)
> The problem is that the system is getting an error trying to read this
> area of the disk. It's an unmapped bunch of bad blocks. The system
> gets an unrecoverable error trying to read these blocks and that is
> what you see reported. Since it can't read "good" data, it does not
> relocate the bad data, but just leaves it there and reports errors
> every time it tries to read the data.
>
> First, any files containing data stored in these blocks are probably
> toast. Or, at least garbled. Sorry.
>
> The fix/workaround is to move the file(s) involved so that the damaged
> blocks are marked free and relocated to spare space on the drive. You
> can try to figure out just which file(s) use those blocks. There
> might even be a reasonable way to do this...I just don't know what it
> is.
>
> Another "fix" is to simply copy the drive onto another and then copy it
> back. dd(1) will do the trick, as will dump/restore. (I'd suggest the
> dump/restore to copy the data out and dd to copy it back if the disks
> have identical geometries.) Once the data is restored to the original
> disk, the bad blocks will have been re-directed by the drive and will
> no longer trouble you.
>
> Modern disks are pretty smart at error recovery, but some failures are
> too sudden for the drive to be able to deal with them without losing
> data.

Regarding a fix:

I had similar read error messages not long ago when dumping to tape, and
wondered what they could mean. So I went to the hard drive
manufacturer's website and downloaded a DOS tool to scan/repair the
hard drive.

Just to note an issue: I had one bootdisk to check the hard drive in my
laptop, which is a Hitachi (HGST) drive, and one for the Western
Digital, which was the drive of concern. For some reason I used the
software utility from Hitachi on the WD, which was a good thing, because
it reported bad blocks and wouldn't fix them because it recognized that
it wasn't their drive.
Then I used the bootdisk I had created for the WD utility and ran it
(this is why it was a good thing). It did the scan and reported NO
issues, so I checked its logs to see if it had logged fixing any
problems. The utility's logs said the drive had no issues. Just to
double check, I re-ran the HGST tool and it didn't find any bad blocks.

Hmm. Those knuckle-heads at Western Digital made the utility fix the
bad blocks silently. I find this underhanded because you might not find
out a disk is going bad until it is totally failing. Hmm, I wonder if
this helps get it past the warranty before the drive completely fails.
[You know that when 1: bad blocks show up, 2: you clean 'em up, 3:
return to step 1, the drive will die the death in the near future.]

So you can usually get utilities from the manufacturer {at least WD,
HGST, and Seagate} to do some subset of: turn S.M.A.R.T. on and off,
exercise the hard drive, scan for errors, repair errors, low-level
format, if you don't mind booting from a DOS bootdisk to run the tool.
WD is a little confusing about which one to grab. But note, as I found
out, some manufacturers might silently repair cert
Re: IPNAT/Slow TCP/Pings fine/4.8-REL
Larry Rosenman wrote:

--On Monday, October 13, 2003 14:03:59 -0700 Chris Pressey <[EMAIL PROTECTED]> wrote:

On Mon, 13 Oct 2003 00:19:54 -0500 Larry Rosenman <[EMAIL PROTECTED]> wrote:

I was trying(!) to help a friend out, and built a 4.8-REL box to play
Router/NAT and it's ALMOST working. I can't seem to telnet/surf from
NAT'd addresses, but PING works fine.

[...]

What am I missing? What else do you/I need?

This was with the ipfilter ipnat. I tried ipfw, and had the IPDIVERT and
the same symptoms. What's got me is the fact that I can PING, and
apparently do DNS lookups, but TCP just doesn't. :-(

LER

Thanks for any QUICK replies!

"options IPDIVERT" in your kernel config...?

-Chris

If you would post this to freebsd-questions you would probably get
better service, since it is most likely a configuration issue.

And yes, it is my understanding that IPDIVERT is not needed for IPFILTER
and ipnat. Anyone?

The rc.conf gateway_enable option and setting the sysctl forwarding
option do the same thing; someone more knowledgeable can confirm that
one. Oh, I just checked: it sets forwarding but not fastforwarding. So
you need only one of the two methods; using both is redundant.

You are not very descriptive. You can ping? Is that

    ping [ip.num.for.localhost]
or  ping [ip.num.for.externalhost]
or  ping [host.domain.tld]

You can apparently do name lookups? Are you getting good results from
"nslookup www.abcnews.com" or such?

I think there is a top-like command-line option for ipfilter you can use
to see what ipfilter is doing, but I am not sure if it is helpful with
ipnat.

Posting to questions instead is, I think, appropriate.

Have a good day,
David
Re: Spamassasin
On Tue, 2003-10-14 at 12:59, Philip Reynolds wrote:
> Chris Stenton <[EMAIL PROTECTED]> 25 lines of wisdom included:
> > I would second this but I use mailscanner which does the same job.
>
> Mailscanner seems like a very poorly designed piece of software, at
> least from my experience with Postfix.
>
> It directly manipulates the Postfix queue which can cause message
> corruption. This has been raised on the Postfix list recently and
> since then Wietse & co. have been advising _against_ using it with
> Postfix.
>
> Just an FYI.

Interesting. I still use sendmail (with check_local) and it seems OK
with that, but it is set up to use an in and out queue.

Chris
Re: ATA failure with 4.6.2 & 250GB drive?
If you want to see the SMART information from the hard drives, and you
are running a recent 5-CURRENT (one that includes ATAng), you can also
test out the smartmontools package from
http://smartmontools.sourceforge.net

I just finished up some work on porting the code to FreeBSD, and if you
check out the latest CVS version (or the soon-to-be-released 5.21
release), you can help me test it out and see what the drive itself is
reporting.

The smartmontools package should also work with SCSI drives (utilizing
CAM), and that portion should work under a 4-STABLE release, although I
have to admit I haven't tested it.

I also plan on submitting a port for it in the very near future (just
waiting for the 5.21 release).

Ed

On Tue, 2003-10-14 at 14:12, DavidB wrote:
> Kevin Oberman wrote:
>
> >> Date: Tue, 14 Oct 2003 09:55:54 +0100
> >> From: Scott Mitchell <[EMAIL PROTECTED]>
> >> Sender: [EMAIL PROTECTED]
> >>
> >> On Mon, Oct 13, 2003 at 10:09:10AM +0100, Scott Mitchell wrote:
> >>
> >>> Hi all,
> >>>
> >>> Just installed a Maxtor 250GB PATA drive in one of our servers, to
> >>> be used as a backup staging area. This was actually a replacement
> >>> for an identical drive that appeared to have died after a month of
> >>> service.
> >>>
> >>> Anyway, 2 days after this drive was installed I started seeing this
> >>> in the daily logs:
> >>>
> >>>> ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943
> >>>> (ad1s1 bn 850845887; cn 52962 tn 180 sn 17) trying PIO mode
> >>>> ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943
> >>>> (ad1s1 bn 850845887; cn 52962 tn 180 sn 17) status=59 error=40
> >>>> ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943
> >>>> (ad1s1 bn 850845887; cn 52962 tn 180 sn 17) status=59 error=40
> >>>> ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943
> >>>> (ad1s1 bn 850845887; cn 52962 tn 180 sn 17) status=59 error=40
> >>>
> >>> ...
> >>
> >> OK, swapped out the cable (from an 80- to 40-wire one, as it happened,
> >> although that should make no difference on a UDMA33 controller). Same
> >> errors appeared again while the backups were running.
> >>
> >> Some more information on how this drive is being used - we're dumping
> >> two vinum RAID5 volumes onto it, one local and one remote, writing to
> >> the backup disk over NFS. Both dumps kick off at 0300, with the remote
> >> one finishing at 0305 last night. The first ATA error appeared in the
> >> logs at 0325, while the local backup was still running. The last error
> >> was logged at 0355, but the backup itself didn't finish until nearly
> >> 0500.
> >>
> >> Anyone have any more ideas on how to diagnose this? It does occur to me
> >> that the daily periodic run also kicks off at 0301 but that is
> >> usually all done before 0330.
> >
> > It's a real drive problem, but possibly not a terminal one. (I had the
> > same issue on one of my drives a few months ago and it's fine now.)
> >
> > The problem is that the system is getting an error trying to read this
> > area of the disk. It's an unmapped bunch of bad blocks. The system
> > gets an unrecoverable error trying to read these blocks and that is
> > what you see reported. Since it can't read "good" data, it does not
> > relocate the bad data, but just leaves it there and reports errors
> > every time it tries to read the data.
> >
> > First, any files containing data stored in these blocks are probably
> > toast. Or, at least garbled. Sorry.
> >
> > The fix/workaround is to move the file(s) involved so that the damaged
> > blocks are marked free and relocated to spare space on the drive. You
> > can try to figure out just which file(s) use those blocks. There
> > might even be a reasonable way to do this...I just don't know what it
> > is.
> >
> > Another "fix" is to simply copy the drive onto another and then copy it
> > back.
> > dd(1) will do the trick, as will dump/restore. (I'd suggest the
> > dump/restore to copy the data out and dd to copy it back if the disks
> > have identical geometries.) Once the data is restored to the original
> > disk, the bad blocks will have been re-directed by the drive and will
> > no longer trouble you.
> >
> > Modern disks are pretty smart at error recovery, but some failures are
> > too sudden for the drive to be able to deal with them without losing
> > data.
>
> Regarding a fix:
>
> I had similar read error messages not long ago when dumping to tape, and
> wondered what they could mean. So I went to the hard drive
> manufacturer's website and downloaded a DOS tool to scan/repair the
> hard drive.
>
> Just to note an issue: I had one bootdisk to check the hard drive in my
> laptop, which is a Hitachi (HGST) drive, and one for the Western
> Digital, which was the drive of concern. For some reason I used the
> software utility from Hitachi on the WD, which was a good thing, because
> it reported bad blocks and wouldn't fi
Re: Spamassasin
On Tue, 14 Oct 2003, Chris Stenton wrote:

> On Tue, 2003-10-14 at 12:59, Philip Reynolds wrote:
> >
> > Mailscanner seems like a very poorly designed piece of software, at
> > least from my experience with Postfix.
> >
> > It directly manipulates the Postfix queue which can cause message
> > corruption. This has been raised on the Postfix list recently and
> > since then Wietse & co. have been advising _against_ using it with
> > Postfix.
>
> Interesting. I still use sendmail (with check_local) and it seems OK
> with that, but it is set up to use an in and out queue.

MailScanner was originally written to interface with sendmail, using an
in-queue and an out-queue mechanism. Sendmail's queue files are well
documented.

Exim support came shortly after, Exim being architecturally very similar
to sendmail. Again, there is good documentation for the queue format,
and its queue handling is robust. Exim/sendmail+MailScanner combinations
are used extensively in the UK academic community to good effect. Those
close to Exim's author use Exim+MailScanner, and advice on one way of
integrating Exim+MailScanner was written by Exim's author.

MailScanner's design is pretty simple: it expects the SMTP daemon to
place incoming mail in a queue, and do nothing more with it. MS will
process messages in that queue, and when done either launch an outgoing
mail process to deliver them or place them in an outgoing queue for an
MTA to read. It handles locking and suchlike, and makes sure that
messages are not finally removed from the incoming queue until they have
been fully processed and submitted for onward delivery.

Postfix support was added only fairly recently, after repeated requests.
I've heard that the queue format is less clearly documented (I don't
know; I don't use postfix). I also understand that the Postfix
developers prefer other programs not to mess around with the Postfix
queue directly. Whether MS does so robustly or not I couldn't say: best
ask someone who runs Postfix+MS.
If Postfix's developers are unhappy with the way that MS does so, then I
guess it isn't surprising that they would advise against using MS. MS'
developers strive for robustness, so if the information is readily
available on how to safely access the Postfix queue they will probably
have taken it into account.

Speaking personally, MS has saved us time and time again from
email-borne threats over the past couple of years, and allowed us to
implement a fine-grained mail security policy that is customisable on a
per-user basis if necessary. No other AV solution offers even half the
features and configurability that MS does, and MS now scans and protects
huge amounts of mail in many, many installations. Our site was protected
from Sobig.whatever before the thing was even released, without needing
to wait for AV definitions to be updated.

Just felt that a little defence of MS was necessary.

Jethro.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Jethro R Binks
Computing Officer, IT Services
University Of Strathclyde, Glasgow, UK
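
The in-queue/out-queue design described in this message can be sketched in a few lines of Python. This is an illustration of the general pattern only, not MailScanner's actual code: the queue directories are plain folders, scan() is a hypothetical stand-in for the real virus/spam checks, and the locking that the message mentions is left out. The point it shows is the ordering: the incoming copy is deleted only after the processed copy is safely in the outgoing queue.

```python
import os
import shutil
import tempfile


def scan(message_text):
    """Hypothetical stand-in for the real scanning step: tag the message."""
    return "X-Scanned: yes\n" + message_text


def process_queue(incoming_dir, outgoing_dir):
    """Process every message in incoming_dir and hand it to outgoing_dir.

    The original is removed from the incoming queue only after the
    processed copy has been completely written to the outgoing queue, so
    an interruption part-way leaves the original in place for a retry.
    """
    handled = []
    for name in sorted(os.listdir(incoming_dir)):
        src = os.path.join(incoming_dir, name)
        with open(src, "r") as fh:
            processed = scan(fh.read())
        # Write under a temporary name, then rename, so the final file
        # name never refers to a half-written message.
        tmp = os.path.join(outgoing_dir, name + ".tmp")
        with open(tmp, "w") as fh:
            fh.write(processed)
        os.rename(tmp, os.path.join(outgoing_dir, name))
        os.remove(src)  # last step: only now drop the incoming copy
        handled.append(name)
    return handled


def demo():
    """Run one message through a throwaway pair of queue directories."""
    base = tempfile.mkdtemp()
    incoming = os.path.join(base, "incoming")
    outgoing = os.path.join(base, "outgoing")
    os.mkdir(incoming)
    os.mkdir(outgoing)
    with open(os.path.join(incoming, "msg1"), "w") as fh:
        fh.write("Subject: test\n\nhello\n")
    handled = process_queue(incoming, outgoing)
    remaining = os.listdir(incoming)
    with open(os.path.join(outgoing, "msg1")) as fh:
        delivered = fh.read()
    shutil.rmtree(base)
    return handled, remaining, delivered


if __name__ == "__main__":
    print(demo())
```

A scanner built this way depends on the MTA leaving queued files alone and on the queue file format being stable, which is where the Postfix concerns raised earlier in the thread come in.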
Re: ATA failure with 4.6.2 & 250GB drive?
Hi Scott,

> OK, swapped out the cable (from an 80- to 40-wire one, as it happened,
> although that should make no difference on a UDMA33 controller). Same
> errors appeared again while the backups were running.

FWIW, I have had _major_ problems (i.e. drive failure) when trying to use
an 80 wire cable on a UDMA33 board with an ATA66 drive - I would recommend
never using 80 wire cables on controllers that don't support UDMA66 or
better. The reverse, using 80 wire cables on UDMA66 (or better) controllers
with ATA33 drives, is ok, however, and I've done that successfully.

Once bitten, twice shy :)

regards,
-- joel
--
AusCERT, ITS, Uni of Qld, Australia -- hotline: [+61] [07] 33654417
my opinions in this email are not endorsed by AusCERT or Uni of Qld
this message may not be onforwarded without my expressed permission
Re: ATA failure with 4.6.2 & 250GB drive?
> FWIW, I have had _major_ problems (ie drive failure) when trying to use
> an 80 wire cable on a UDMA33 board with an ATA66 drive - I would recommend
> never using 80 wire cables on controllers that don't support UDMA66 or
> better. The reverse, using 80 wire cables on UDMA66 (or better) controllers
> with ATA33 drives is ok, however, and I've done that successfully.

The only differences between 40 and 80 wire ATA cables are the extra 40
wires, which are all connected to ground, and pin 34 in the host
connector, which is not connected to the conductor in an 80 wire cable.
The ANSI ATA/ATAPI-5 standard says that "80-conductor cable assemblies
may be used in place of 40-conductor cable assemblies to improve signal
quality ..." (section 4.2.2.1).

Unless someone can suggest a specific mechanism for the drive failures,
I would expect that your correlation between drive failures and using 80
conductor cables with UDMA33 controllers was just coincidence.

Dan Strick
[EMAIL PROTECTED]