date:20031014

Re: Using pam_ssh with gdm

2003-10-14 Thread Joe Kelsey

Volker Stolz wrote:
On 13 Oct 2003 at 16:56 CEST, Joe Kelsey wrote:

first try, logging the following to syslog:
Oct 13 07:24:30 zircon gdm[186]: Couldn't open session for joek
Then, gdm resets and I reenter the password and passphrase.  The second 
time, I get in.  Apparently, now ssh-agent has started, but pam_ssh did 
not pass along any authentication information, so I have to call ssh-add 
by hand to actually enter the key information.  This means that every 
time I log in, I have to type my password twice and my passphrase three 
times.


The first thing you're probably experiencing is this:
http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/45669
Description
The pam_ssh module uses popen() to start an ssh-agent for the user during PAM
authentication. However, pclose() causes the pam-module to return an error if
somebody else already called waitpid(-1,...) because now pclose returns -1
and errno is set to ECHILD (observed with gdm who uses a whole bunch of processes).
That fits exactly!  I stumbled on a gdm error message in the logs about 
ssh-agent and child processes.  I run 4-STABLE, but your PR relates to 
5-CURRENT.  Has anyone done anything about fixing this in 4-STABLE? 
Also, switching to using only my ssh passphrase doesn't tickle the 
ssh-agent child process bug.
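
The failure mode described in the PR can be reproduced outside of gdm. The following is a minimal sketch in Python rather than the C of pam_ssh, and it uses fork() and waitpid() directly instead of popen()/pclose(); the function names are made up for illustration. It shows a child being reaped by a catch-all waitpid(-1, ...) call elsewhere, after which waiting on that specific child fails with ECHILD, and one way a caller could tolerate that instead of treating it as an error.

```python
import os


def reap_tolerating_echild(pid):
    """Wait for `pid`; return its raw status, or None if it was already reaped."""
    try:
        _, status = os.waitpid(pid, 0)
    except ChildProcessError:
        # errno is ECHILD: some other waitpid() call already collected this
        # child, so there is no status left for us to read.
        return None
    return status


def demo():
    """Fork a child, let a catch-all waitpid(-1) reap it, then wait again."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately
    # Stand-in for the unrelated waitpid(-1, ...) call mentioned in the PR.
    os.waitpid(-1, 0)
    return reap_tolerating_echild(pid)
```

Here demo() returns None because the second wait finds nothing left to collect. Whether ignoring ECHILD is the right fix for pam_ssh itself is a separate question; the sketch only shows the mechanism.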

Also, why doesn't pam_ssh export my identities into ssh-agent?  I still 
have to do a separate ssh-add to load the keys into ssh-agent.  The 
pam_ssh man page still says that it does this, but obviously it doesn't.

/Joe

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"

Re: Firewire on STABLE: Sane for drive-based backups? (SOLVED operator error)

2003-10-14 Thread Stephen Hilton

On Mon, 13 Oct 2003 02:34:01 -0500
Stephen Hilton <[EMAIL PROTECTED]> wrote:

> On Sat, 11 Oct 2003 18:11:54 -0700 (PDT)
> Jason Fesler <[EMAIL PROTECTED]> wrote:
> 
> > 
> > I did not have much luck on digging through the archives. Does anyone have
> > any success stories on using external firewire drives on the stable branch
> > of freebsd?  Does hot swap work?  Can I mount, dd or ufsdump or
> > newfs/rsync, then umount and unplug it cleanly?
> > 
> > I'm considering my options for doing once-a-month backups, and tape just
> > totally blows the budget.  I'm currently using a second drive to produce
> > snapshots, but that doesn't leave me with any off-site backups without
> > taking the system down to swap drives.
> 
> 
> I have been using a Buslink 1394 Firewire HD model #FW80 72E on 2  
> 4-STABLE systems for backup. The Firewire ports are onboard on an 
> ASUS P4PE and an ASUS P3B-1394 motherboard.
> 
> Mounting and unmounting work fine, but I do have trouble with hot 
> swapping.
> 
> 
> On the ASUS P3B-1394 system
> 
> FreeBSD 4.9-RC cvsup'd and built/installed today
> 
> Snips from my dmesg.boot
> ---
> fwohci0:  mem 
> 0xcb00-0xcb003fff,0xcb80-0xcb8007ff irq 11 at device 6.0 on pci0
> fwohci0: OHCI version 1.0 (ROM=1)
> fwohci0: No. of Isochronous channel is 4.
> fwohci0: EUI64 00:e0:18:00:00:00:16:bd
> fwohci0: Phy 1394a available S400, 3 ports.
> fwohci0: Link S400, max_rec 2048 bytes.
> firewire0:  on fwohci0
> sbp0:  on firewire0
> fwohci0: Initiate bus reset
> fwohci0: BUS reset
> fwohci0: node_id=0xc800ffc1, gen=1, CYCLEMASTER mode
> firewire0: 2 nodes, maxhop <= 1, cable IRM = 1 (me)
> firewire0: bus manager 1 (me)
> 
> sym0: <895> port 0xb000-0xb0ff mem 0xc980-0xc9800fff,0xca00-0xcaff 
> irq 11 at device 10.0 on pci0
> sym0: Tekram NVRAM, ID 7, Fast-40, LVD, parity checking
> firewire0: New S400 device ID:0030e001e000177c
> Mounting root from ufs:/dev/da0s2a
> da2 at sbp0 bus 0 target 0 lun 0
> da2:  Fixed Simplified Direct Access SCSI-4 device
> da2: 50.000MB/s transfers, Tagged Queueing Enabled
> da2: 76319MB (156301488 512 byte sectors: 255H 63S/T 9729C)
> da0 at sym0 bus 0 target 0 lun 0
> da0:  Fixed Direct Access SCSI-3 device
> da0: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
> da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
> da1 at sym0 bus 0 target 1 lun 0
> da1:  Fixed Direct Access SCSI-2 device
> da1: 80.000MB/s transfers (40.000MHz, offset 15, 16bit), Tagged Queueing Enabled
> da1: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)
> ---
> 
> 
> SCSI stuff from my kernel config file
> ---
> # using SCSI-IDE atapicam emulation for DVD/CDRW access.
> device  atapicam        # emulate ATAPI devices as SCSI ditto via CAM
> # needs CAM to be present (scbus & pass)
>  
> # SCSI Controllers
> device  sym0            # NCR/Symbios Logic (newer chipsets)
> device  scbus0 at sym0
> device  da0 at scbus0 target 0
> device  da1 at scbus0 target 1
> options SYM_SETUP_SCSI_DIFF     #-HVD support for 825a, 875, 885
> options SCSI_DELAY=3000         #Delay (in ms) before probing SCSI
> # disabled:0 (default), enabled:1
> # SCSI peripherals
> device  scbus           # SCSI bus (required)
> device  da              # Direct Access (disks)
> device  sa              # Sequential Access (tape etc)
> device  cd              # CD
> device  pass            # Passthrough device (direct SCSI access)
>  
> # Firewire support
> device  firewire        # Firewire bus code
> device  sbp             # SCSI over Firewire (Requires scbus and da)
> # device  fwe           # Ethernet over Firewire (non-standard!)
> ---
> 
> On bootup with my firewire drive plugged in everything is fine.
> 
> Unplugging the firewire cable gives this in my console:
> fwohci0: BUS reset
> fwohci0: node_id=0xc800ffc0, gen=2, CYCLEMASTER mode
> firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
> firewire0: bus manager 0 (me)
> 
> plugging the firewire cable back in gives this:
> fwohci0: BUS reset
> fwohci0: node_id=0xc800ffc1, gen=3, CYCLEMASTER mode
> fwohci0: SID Error
> 
> At this point I cannot mount the firewire drive anymore.
> 
> I have tried using 'fwcontrol -r' but no luck.
> 
> Any help or pointers appreciated.


This was my operator error. The correct sequence for hot swapping this 
firewire hard drive on my motherboard is as follows
(this is with the firewire and sbp devices compiled into my kernel):

...
plug the firewire drive cable in
run 'fwcontrol -r'
mount the firewire drive
do backups
unmount the fi

Re: Firewire on STABLE: Sane for drive-based backups?

2003-10-14 Thread Stephen Hilton

On Mon, 13 Oct 2003 02:34:01 -0500
Stephen Hilton <[EMAIL PROTECTED]> wrote:

> On Sat, 11 Oct 2003 18:11:54 -0700 (PDT)
> Jason Fesler <[EMAIL PROTECTED]> wrote:
> 
> > 
> > I did not have much luck on digging through the archives. Does anyone have
> > any success stories on using external firewire drives on the stable branch
> > of freebsd?  Does hot swap work?  Can I mount, dd or ufsdump or
> > newfs/rsync, then umount and unplug it cleanly?
> > 
> > I'm considering my options for doing once-a-month backups, and tape just
> > totally blows the budget.  I'm currently using a second drive to produce
> > snapshots, but that doesn't leave me with any off-site backups without
> > taking the system down to swap drives.
> 
> 
> I have been using a Buslink 1394 Firewire HD model #FW80 72E on 2  
> 4-STABLE systems for backup. The Firewire ports are onboard on an 
> ASUS P4PE and an ASUS P3B-1394 motherboard.
> 
> Mounting and unmounting work fine, but I do have trouble with hot 
> swapping.
> 
> 
> On the ASUS P3B-1394 system
> 
> FreeBSD 4.9-RC cvsup'd and built/installed today
> 
> Snips from my dmesg.boot
> ---
> fwohci0:  mem 
> 0xcb00-0xcb003fff,0xcb80-0xcb8007ff irq 11 at device 6.0 on pci0
> fwohci0: OHCI version 1.0 (ROM=1)
> fwohci0: No. of Isochronous channel is 4.
> fwohci0: EUI64 00:e0:18:00:00:00:16:bd
> fwohci0: Phy 1394a available S400, 3 ports.
> fwohci0: Link S400, max_rec 2048 bytes.
> firewire0:  on fwohci0
> sbp0:  on firewire0
> fwohci0: Initiate bus reset
> fwohci0: BUS reset
> fwohci0: node_id=0xc800ffc1, gen=1, CYCLEMASTER mode
> firewire0: 2 nodes, maxhop <= 1, cable IRM = 1 (me)
> firewire0: bus manager 1 (me)
> 
> sym0: <895> port 0xb000-0xb0ff mem 0xc980-0xc9800fff,0xca00-0xcaff 
> irq 11 at device 10.0 on pci0
> sym0: Tekram NVRAM, ID 7, Fast-40, LVD, parity checking
> firewire0: New S400 device ID:0030e001e000177c
> Mounting root from ufs:/dev/da0s2a
> da2 at sbp0 bus 0 target 0 lun 0
> da2:  Fixed Simplified Direct Access SCSI-4 device
> da2: 50.000MB/s transfers, Tagged Queueing Enabled
> da2: 76319MB (156301488 512 byte sectors: 255H 63S/T 9729C)
> da0 at sym0 bus 0 target 0 lun 0
> da0:  Fixed Direct Access SCSI-3 device
> da0: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
> da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
> da1 at sym0 bus 0 target 1 lun 0
> da1:  Fixed Direct Access SCSI-2 device
> da1: 80.000MB/s transfers (40.000MHz, offset 15, 16bit), Tagged Queueing Enabled
> da1: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)
> ---
> 
> 
> SCSI stuff from my kernel config file
> ---
> # using SCSI-IDE atapicam emulation for DVD/CDRW access.
> device  atapicam        # emulate ATAPI devices as SCSI ditto via CAM
> # needs CAM to be present (scbus & pass)
>  
> # SCSI Controllers
> device  sym0            # NCR/Symbios Logic (newer chipsets)
> device  scbus0 at sym0
> device  da0 at scbus0 target 0
> device  da1 at scbus0 target 1
> options SYM_SETUP_SCSI_DIFF     #-HVD support for 825a, 875, 885
> options SCSI_DELAY=3000         #Delay (in ms) before probing SCSI
> # disabled:0 (default), enabled:1
> # SCSI peripherals
> device  scbus           # SCSI bus (required)
> device  da              # Direct Access (disks)
> device  sa              # Sequential Access (tape etc)
> device  cd              # CD
> device  pass            # Passthrough device (direct SCSI access)
>  
> # Firewire support
> device  firewire        # Firewire bus code
> device  sbp             # SCSI over Firewire (Requires scbus and da)
> # device  fwe           # Ethernet over Firewire (non-standard!)
> ---
> 
> On bootup with my firewire drive plugged in everything is fine.
> 
> Unplugging the firewire cable gives this in my console:
> fwohci0: BUS reset
> fwohci0: node_id=0xc800ffc0, gen=2, CYCLEMASTER mode
> firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
> firewire0: bus manager 0 (me)
> 
> plugging the firewire cable back in gives this:
> fwohci0: BUS reset
> fwohci0: node_id=0xc800ffc1, gen=3, CYCLEMASTER mode
> fwohci0: SID Error
> 
> At this point I cannot mount the firewire drive anymore.
> 
> I have tried using 'fwcontrol -r' but no luck.

This was my operator error. The correct sequence for hot swapping this 
firewire hard drive on my motherboard is as follows
(this is with the firewire and sbp devices compiled into my kernel):

...
plug the firewire drive cable in
run 'fwcontrol -r'
mount the firewire drive
do backups
unmount the firewire drive
unplug the firewire drive c
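
The visible part of the sequence above could be scripted roughly as follows. This is only a sketch: the device node /dev/da2s1 and the mount point /mnt/fwbackup are assumptions (the thread only shows the drive attaching as da2), the backup step is left as a caller-supplied placeholder, and the steps elided by "..." in the message are not covered. Plugging and unplugging the cable remain manual.

```python
import subprocess


def hot_swap_commands(device="/dev/da2s1", mount_point="/mnt/fwbackup"):
    """Return the commands to run after plugging in and before unplugging.

    `device` and `mount_point` are hypothetical defaults; adjust them to
    whatever the drive attaches as on your system.
    """
    before_backup = [
        ["fwcontrol", "-r"],             # bus reset, as in the message
        ["mount", device, mount_point],  # mount the firewire drive
    ]
    after_backup = [
        ["umount", mount_point],         # unmount before unplugging
    ]
    return before_backup, after_backup


def run_backup_cycle(do_backup, run=subprocess.check_call, **kwargs):
    """Run reset + mount, then `do_backup()`, then always try to unmount."""
    before, after = hot_swap_commands(**kwargs)
    for cmd in before:
        run(cmd)
    try:
        do_backup()
    finally:
        # Unmount even if the backup step raised, so the drive can be removed.
        for cmd in after:
            run(cmd)
```

Passing a different `run` callable (for example one that only prints the command) gives a dry run without touching the bus or the disk.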

Re: ATA failure with 4.6.2 & 250GB drive?

2003-10-14 Thread Kevin Oberman

> Date: Tue, 14 Oct 2003 09:55:54 +0100
> From: Scott Mitchell <[EMAIL PROTECTED]>
> Sender: [EMAIL PROTECTED]
> 
> On Mon, Oct 13, 2003 at 10:09:10AM +0100, Scott Mitchell wrote:
> > Hi all,
> > 
> > Just installed a Maxtor 250GB PATA drive in one of our servers, to be used
> > as a backup staging area.  This was actually a replacement for an identical
> > drive that appeared to have died after a month of service.
> > 
> > Anyway, 2 days after this drive was installed I start seeing this in the
> > daily logs:
> > 
> > > ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 (ad1s1 bn 
> > > 850845887; cn 52962 tn 180 sn 17) trying PIO mode
> > > ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 (ad1s1 bn 
> > > 850845887; cn 52962 tn 180 sn 17) status=59 error=40
> > > ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 (ad1s1 bn 
> > > 850845887; cn 52962 tn 180 sn 17) status=59 error=40
> > > ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 (ad1s1 bn 
> > > 850845887; cn 52962 tn 180 sn 17) status=59 error=40
> > ...
> 
> OK, swapped out the cable (from an 80- to 40-wire one, as it happened,
> although that should make no difference on a UDMA33 controller).  Same
> errors appeared again while the backups were running.
> 
> Some more information on how this drive is being used - we're dumping two
> vinum RAID5 volumes onto it, one local and one remote, writing to the
> backup disk over NFS.  Both dumps kick off at 0300, with the remote one
> finishing at 0305 last night.  The first ATA error appeared in the logs at
> 0325, while the local backup was still running.  The last error was logged
> at 0355, but the backup itself didn't finish until nearly 0500.
> 
> Anyone have any more ideas on how to diagnose this?  It does occur to me
> that the daily periodic run also kicks off at 0301 but that is usually all
> done before 0330.

It's a real drive problem, but possibly not a terminal one. (I had the
same issue on one of my drives a few months ago and it's fine now.)

The problem is that the system is getting an error trying to read this
area of the disk. It's an unmapped bunch of bad blocks. The system
gets an unrecoverable error trying to read these blocks and that is
what you see reported. Since it can't read "good" data, it does not
relocate the bad data, but just leaves it there and reports errors
every time it tries to read the data.
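
The status and error values in the logged messages are consistent with this reading. The sketch below decodes them using the usual ATA register bit assignments, assuming the driver prints both values in hexadecimal. The tables and the function name are illustrative only, and some bit names differ between revisions of the ATA standard.

```python
# ATA status register bits.
STATUS_BITS = {
    0x80: "BSY",
    0x40: "DRDY",
    0x20: "DF",
    0x10: "DSC",
    0x08: "DRQ",
    0x04: "CORR",
    0x02: "IDX",
    0x01: "ERR",
}

# ATA error register bits; 0x40 (UNC) means uncorrectable data.
ERROR_BITS = {
    0x80: "ICRC",
    0x40: "UNC",
    0x20: "MC",
    0x10: "IDNF",
    0x08: "MCR",
    0x04: "ABRT",
    0x02: "NM",
    0x01: "AMNF",
}


def decode(value, table):
    """Return the names of the bits set in `value`, highest bit first."""
    return [name for bit, name in table.items() if value & bit]


status = decode(0x59, STATUS_BITS)  # "status=59" from the log
error = decode(0x40, ERROR_BITS)    # "error=40" from the log
```

Under that assumption, status 59 has the ERR bit set and error 40 is UNC alone, i.e. the drive itself is reporting data it could not correct.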

First, any files containing data stored in these blocks are probably
toast. Or, at least garbled. Sorry.

The fix/workaround is to move the file(s) involved so that the damaged
blocks are marked free and relocated to spare space on the drive. You
can try to figure out just which file(s) use those blocks. There
might even be a reasonable way to do this...I just don't know what it
is.
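
As a starting point for locating the affected data, the numbers in one of the logged lines can at least be cross-checked. The sketch below parses a message (joined onto one line) and converts the reported cylinder/track/sector to a linear block number. The geometry of 255 heads and 63 sectors per track is an assumption, since the thread does not give it for ad1; with it, the result matches the reported bn. Mapping that block to a file is a separate step not shown here.

```python
import re

LINE = ("ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 "
        "(ad1s1 bn 850845887; cn 52962 tn 180 sn 17) status=59 error=40")

PATTERN = re.compile(
    r"(?P<part>\w+): hard error reading fsbn (?P<fsbn>\d+) "
    r"of (?P<first>\d+)-(?P<last>\d+) "
    r"\((?P<slice>\w+) bn (?P<bn>\d+); "
    r"cn (?P<cn>\d+) tn (?P<tn>\d+) sn (?P<sn>\d+)\)"
)


def parse_hard_error(line):
    """Return the fields of a 'hard error reading' line, or None if no match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Device names stay strings; everything else is a number.
    return {k: (v if k in ("part", "slice") else int(v))
            for k, v in fields.items()}


def chs_to_block(cn, tn, sn, heads=255, sectors_per_track=63):
    """Convert cylinder/track/sector to a linear block number.

    The sector number is treated as zero-based here; that is the
    interpretation under which the logged numbers agree.
    """
    return (cn * heads + tn) * sectors_per_track + sn


info = parse_hard_error(LINE)
computed = chs_to_block(info["cn"], info["tn"], info["sn"])
```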

Another "fix" is to simply copy the drive onto another and then copy it
back. dd(1) will do the trick as will dump/restore. (I'd suggest the
dump/restore to copy the data out and dd to copy it back if the disks
have identical geometries.) Once the data is restored to the original
disk, the bad blocks will have been re-directed by the drive and will
no longer trouble you.

Modern disks are pretty smart at error recovery, but some failures are
too sudden for the drive to be able to deal with them without losing
data. 
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]   Phone: +1 510 486-8634

Re: ATA failure with 4.6.2 & 250GB drive?

2003-10-14 Thread DavidB

Kevin Oberman wrote:

>> Date: Tue, 14 Oct 2003 09:55:54 +0100
>> From: Scott Mitchell <[EMAIL PROTECTED]>
>> Sender: [EMAIL PROTECTED]
>>
>> On Mon, Oct 13, 2003 at 10:09:10AM +0100, Scott Mitchell wrote:
>>
>>> Hi all,
>>>
>>> Just installed a Maxtor 250GB PATA drive in one of our servers, to be used
>>> as a backup staging area.  This was actually a replacement for an identical
>>> drive that appeared to have died after a month of service.
>>>
>>> Anyway, 2 days after this drive was installed I start seeing this in the
>>> daily logs:
>>>
>>>> ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 
>>>> (ad1s1 bn 850845887; cn 52962 tn 180 sn 17) trying PIO mode
>>>> ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 
>>>> (ad1s1 bn 850845887; cn 52962 tn 180 sn 17) status=59 error=40
>>>> ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 
>>>> (ad1s1 bn 850845887; cn 52962 tn 180 sn 17) status=59 error=40
>>>> ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 
>>>> (ad1s1 bn 850845887; cn 52962 tn 180 sn 17) status=59 error=40
>>>
>>> ...
>>
>> OK, swapped out the cable (from an 80- to 40-wire one, as it happened,
>> although that should make no difference on a UDMA33 controller).  Same
>> errors appeared again while the backups were running.
>>
>> Some more information on how this drive is being used - we're dumping two
>> vinum RAID5 volumes onto it, one local and one remote, writing to the
>> backup disk over NFS.  Both dumps kick off at 0300, with the remote one
>> finishing at 0305 last night.  The first ATA error appeared in the logs at
>> 0325, while the local backup was still running.  The last error was logged
>> at 0355, but the backup itself didn't finish until nearly 0500.
>>
>> Anyone have any more ideas on how to diagnose this?  It does occur to me
>> that the daily periodic run also kicks off at 0301 but that is usually all
>> done before 0330.
>
> It's a real drive problem, but possibly not a terminal one. (I had the
> same issue on one of my drives a few months ago and it's fine now.)
>
> The problem is that the system is getting an error trying to read this
> area of the disk. It's an unmapped bunch of bad blocks. The system
> gets an unrecoverable error trying to read these blocks and that is
> what you see reported. Since it can't read "good" data, it does not
> relocate the bad data, but just leaves it there and reports errors
> every time it tries to read the data.
>
> First, any files containing data stored in these blocks are probably
> toast. Or, at least garbled. Sorry.
>
> The fix/workaround is to move the file(s) involved so that the damaged
> blocks are marked free and relocated to spare space on the drive. You
> can try to figure out just which file(s) use those blocks. There
> might even be a reasonable way to do this...I just don't know what it
> is.
>
> Another "fix" is to simply copy the drive onto another and then copy it
> back. dd(1) will do the trick as will dump/restore. (I'd suggest the
> dump/restore to copy the data out and dd to copy it back if the disks
> have identical geometries.) Once the data is restored to the original
> disk, the bad blocks will have been re-directed by the drive and will
> no longer trouble you.
>
> Modern disks are pretty smart at error recovery, but some failures are
> too sudden for the drive to be able to deal with them without losing
> data. 

Regarding a fix:

I had a similar read error message not long ago when dumping to tape and 
wondered what it could mean. So I went to the hard drive 
manufacturer's website and downloaded a DOS tool to scan/repair the 
hard drive.

Just to note an issue:  I had one boot disk to check the hard drive 
in my laptop, which is a Hitachi (HGST) drive, and one for the Western 
Digital, which was the drive of concern. For some reason I used the 
software utility from Hitachi on the WD, which was a good thing, because 
it reported bad blocks and wouldn't fix them because it recognized that 
it wasn't their drive. Then I ran the WD utility from the boot disk I 
had created for it (this is why it was a good thing). It did the scan 
and reported NO issues, so I checked its logs to see if it had logged 
fixing any problems. The utility's logs said the drive had no issues.
Just to double check I re-ran the HGST tool and it didn't find any bad 
blocks.  Hmm. Those knuckle-heads at Western Digital made the utility 
fix the bad blocks silently.  I find this underhanded because you might 
not find out that a disk is going bad until it is totally failing. Hmm. 
I wonder if this helps get it past the warranty before the drive 
completely fails.
[You know that when 1: bad blocks show up, 2: you clean 'em up, 3: you 
return to step 1, the drive will die in the near future.]

So you can usually get utilities from the manufacturer (at least WD, 
HGST, and Seagate) to do some subset of: turn S.M.A.R.T. on and off, 
exercise the hard drive, scan for errors, repair errors, low-level 
format, if you don't mind booting from a DOS boot disk to run the tool.  
WD is a little confusing about which one to grab.  But note, as I found 
out, some manufacturers might silently repair cert

Re: IPNAT/Slow TCP/Pings fine/4.8-REL

2003-10-14 Thread DavidB

Larry Rosenman wrote:


--On Monday, October 13, 2003 14:03:59 -0700 Chris Pressey 
<[EMAIL PROTECTED]> wrote:

On Mon, 13 Oct 2003 00:19:54 -0500
Larry Rosenman <[EMAIL PROTECTED]> wrote:
I was trying(!) to help a friend out, and built a 4.8-REL box
to play Router/NAT and it's ALMOST working.  I can't seem to telnet/surf
from NAT'd addresses, but PING works fine.
[...]
What am I missing?  What else do you/I need?
This was with the ipfilter ipnat.  I tried ipfw, and had the IPDIVERT
and the same symptoms.
What's got me is the fact that I can PING, and apparently do DNS 
lookups, but TCP just doesn't. :-(

LER

THanks for any QUICK replies!


"options IPDIVERT" in your kernel config...?

-Chris



If you posted this to freebsd-questions you would probably get 
better service, since it is most likely a configuration issue.

And yes, it is my understanding that IPDIVERT is not needed for IPFILTER 
and ipnat. Anyone?

The rc.conf gateway_enable option and setting the forwarding sysctl 
do the same thing; someone more knowledgeable can speak to that. 
Oh, I just checked: gateway_enable sets forwarding but not fastforwarding.
So you need only one of the two methods; using both is redundant.

You are not very descriptive. What can you ping?  ping [ip.num.for.localhost], 
ping [ip.num.for.externalhost], or ping [host.domain.tld]?

"Apparently do name lookups"?  Are you getting good results from
nslookup www.abcnews.com or such?
I think there is a top-like command line option for ipfilter you can use 
to see what ipfilter is doing, but I am not sure if it is helpful with 
ipnat.
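
The checks suggested above (by address versus by name, and name lookup versus TCP) can be separated with a small script run from a NAT'd client. This is a rough sketch: the function names are invented here, and the host and port you test against are up to you. It only distinguishes a failing name lookup from a failing TCP connection; it does not inspect ipfilter or ipnat state.

```python
import socket


def can_resolve(hostname):
    """Return the resolved IPv4 address, or None if the lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) completes."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If can_resolve() succeeds for an outside name but can_connect() to that host's address on port 80 fails, the symptom matches what was reported: lookups work while TCP does not.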

I think posting to freebsd-questions instead is appropriate.

Have a good day,
David





Re: Spamassasin

2003-10-14 Thread Chris Stenton

On Tue, 2003-10-14 at 12:59, Philip Reynolds wrote:
> Chris Stenton <[EMAIL PROTECTED]> 25 lines of wisdom included:
> > I would second this but I use mailscanner which does the same job.
> 
> Mailscanner seems like a very poorly designed piece of software, at
> least from my experience with Postfix.
> 
> It directly manipulates the Postfix queue which can cause message
> corruption. This has been raised on the Postfix list recently and
> since Wietse & co. have been advising _against_ using it with
> Postfix.
> 
> Just an FYI.

Interesting. I still use sendmail (with check_local) and it seems OK
with that, but it is set up to use an in and out queue.

Chris


Re: ATA failure with 4.6.2 & 250GB drive?

2003-10-14 Thread Eduard Martinescu


If you want to see the SMART information from the hard drives, and you
are running a recent 5-CURRENT (one that includes ATAng),
you can also test out the smartmontools package from
http://smartmontools.sourceforge.net  I just finished up some work on 
porting the code to FreeBSD, and if you check out the latest CVS version
(or the soon-to-be-released 5.21), you can help
me test it out and see what the drive itself is reporting.


The smartmontools package should also work with SCSI drives (utilizing
CAM), and that portion should work under a 4-STABLE release, although I
have to admit I haven't tested it.

I also plan on submitting a port for it in the very near future  (just
waiting for the 5.21 release)

Ed

On Tue, 2003-10-14 at 14:12, DavidB wrote:

> Kevin Oberman wrote:
> 
> >> Date: Tue, 14 Oct 2003 09:55:54 +0100
> >> From: Scott Mitchell <[EMAIL PROTECTED]>
> >> Sender: [EMAIL PROTECTED]
> >>
> >> On Mon, Oct 13, 2003 at 10:09:10AM +0100, Scott Mitchell wrote:
> >>
> >>> Hi all,
> >>>
> >>> Just installed a Maxtor 250GB PATA drive in one of our servers, to 
> >>> be used
> >>> as a backup staging area.  This was actually a replacement for an 
> >>> identical
> >>> drive that appeared to have died after a month of service.
> >>>
> >>> Anyway, 2 days after this drive was installed I start seeing this in 
> >>> the
> >>> daily logs:
> >>>
> >>>
> >>>> ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 
> >>>> (ad1s1 bn 850845887; cn 52962 tn 180 sn 17) trying PIO mode
> >>>> ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 
> >>>> (ad1s1 bn 850845887; cn 52962 tn 180 sn 17) status=59 error=40
> >>>> ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 
> >>>> (ad1s1 bn 850845887; cn 52962 tn 180 sn 17) status=59 error=40
> >>>> ad1s1e: hard error reading fsbn 850845887 of 425422912-425422943 
> >>>> (ad1s1 bn 850845887; cn 52962 tn 180 sn 17) status=59 error=40
> >>>
> >>>
> >>> ...
> >>
> >>
> >> OK, swapped out the cable (from an 80- to 40-wire one, as it happened,
> >> although that should make no difference on a UDMA33 controller).  Same
> >> errors appeared again while the backups were running.
> >>
> >> Some more information on how this drive is being used - we're dumping 
> >> two
> >> vinum RAID5 volumes onto it, one local and one remote, writing to the
> >> backup disk over NFS.  Both dumps kick off at 0300, with the remote one
> >> finishing at 0305 last night.  The first ATA error appeared in the 
> >> logs at
> >> 0325, while the local backup was still running.  The last error was 
> >> logged
> >> at 0355, but the backup itself didn't finish until nearly 0500.
> >>
> >> Anyone have any more ideas on how to diagnose this?  It does occur to me
> >> that the daily periodic run also kicks off at 0301 but that is 
> >> usually all
> >> done before 0330.
> >
> >
> >
> > It's a real drive problem, but possibly not a terminal one. (I had the
> > same issue on one of my drives a few months ago and it's fine now.)
> >
> > The problem is that the system is getting an error trying to read this
> > area of the disk. It's an unmapped bunch of bad blocks. The system
> > gets an unrecoverable error trying to read these blocks and that is
> > what you see reported. Since it can't read "good" data, it does not
> > relocate the bad data, but just leaves it there and reports errors
> > every time it tries to read the data.
> >
> > First, any files containing data stored in these blocks are probably
> > toast. Or, at least garbled. Sorry.
> >
> > The fix/workaround is to move the file(s) involved so that the damaged
> > blocks are marked free and relocated to spare space on the drive. You
> > can try to figure out just which file(s) use those blocks. There
> > might even be a reasonable way to do this...I just don't know what it
> > is.
> >
> > Another "fix" is to simply copy the drive onto another and then copy it
> > back. dd(1) will do the trick as will dump/restore. (I'd suggest the
> > dump/restore to copy the data out and dd to copy it back if the disks
> > have identical geometries.) Once the data is restored to the original
> > disk, the bad blocks will have been re-directed by the drive and will
> > no longer trouble you.
> >
> > Modern disks are pretty smart at error recovery, but some failures are
> > too sudden for the drive to be able to deal with them without losing
> > data. 
> 
> 
> Regarding a fix:
> 
> I had a similar read error message not long ago when dumping to tape and 
> wondered what it could mean. So I went to the hard drive 
> manufacturer's website and downloaded a DOS tool to scan/repair the 
> hard drive.
> 
> Just to note an issue:  I had one boot disk to check the hard drive 
> in my laptop, which is a Hitachi (HGST) drive, and one for the Western 
> Digital, which was the drive of concern. For some reason I used the 
> software utility from Hitachi on the WD, which was a good thing, because 
> it reported bad blocks and wouldn't fi

Re: Spamassasin

2003-10-14 Thread Jethro R Binks

On Tue, 14 Oct 2003, Chris Stenton wrote:

> On Tue, 2003-10-14 at 12:59, Philip Reynolds wrote:
> >
> > Mailscanner seems like a very poorly designed piece of software, at
> > least from my experience with Postfix.
> >
> > It directly manipulates the Postfix queue which can cause message
> > corruption. This has been raised on the Postfix list recently and
> > since Wietse & co. have been advising _against_ using it with
> > Postfix.
>
> Interesting. I still use sendmail (with check_local) and it seems OK
> with that, but it is set up to use an in and out queue.

MailScanner was originally written to interface with sendmail, using an
in-queue and an out-queue mechanism.  Sendmail's queue files are well
documented.  Exim support came shortly after, Exim being architecturally
very similar to sendmail.  Again, there is good documentation for the
queue format, and its queue handling is robust.

Exim/sendmail+MailScanner combinations are used extensively in the UK
academic community to good effect.  Those close to Exim's author use
Exim+MailScanner, and advice on one way of integrating Exim+MailScanner
was written by Exim's author.

MailScanner's design is pretty simple: it expects the SMTP daemon to place
incoming mail in a queue, and do nothing more with it.  MS will process
messages in that queue, and when done launch an outgoing mail process to
deliver it or place messages in an outgoing queue for an MTA to read.  It
handles locking and such like, and makes sure that messages are not
finally removed from the incoming queue until they have been fully
processed and submitted for onward delivery.
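
To illustrate the in-queue/out-queue pattern described above, here is a toy sketch. It is not MailScanner's code and omits the locking the real program performs; the directory layout, the function name, and the scan callback are all invented for the example. The point it shows is the ordering: a message is removed from the incoming queue only after its processed copy is fully in place in the outgoing queue.

```python
import os


def process_incoming(incoming_dir, outgoing_dir, scan):
    """Run `scan` over every message file and hand the result to the out queue.

    `scan` is a hypothetical callback taking and returning the message bytes.
    Returns the list of message names that were handled.
    """
    handled = []
    for name in sorted(os.listdir(incoming_dir)):
        src = os.path.join(incoming_dir, name)
        with open(src, "rb") as f:
            result = scan(f.read())

        # Write under a temporary name first, then rename, so a reader of
        # the outgoing queue never picks up a half-written message.
        tmp = os.path.join(outgoing_dir, "." + name + ".tmp")
        with open(tmp, "wb") as f:
            f.write(result)
        os.rename(tmp, os.path.join(outgoing_dir, name))

        # Only now is it safe to drop the original from the incoming queue.
        os.remove(src)
        handled.append(name)
    return handled
```

If the process dies partway through, the original is still in the incoming queue and can be picked up again on the next pass.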

Postfix support was only fairly recently added after repeated requests.
I've heard that the queue format is less clearly documented (I don't know;
I don't use postfix).  I also understand that the Postfix developers
prefer other programs not to mess around with the Postfix queue directly.
Whether MS does so robustly or not I couldn't say: best ask someone who
runs Postfix+MS.  If Postfix's developers are unhappy with the way that MS
does so, then I guess it isn't surprising that they would advise against
using MS.  MS' developers strive for robustness, so if the information is
readily available on how to safely access the Postfix queue they will
probably have taken it into account.

Speaking personally, MS has saved us time and time again from email-borne
threats over the past couple of years, and allowed us to implement a
fine-grained mail security policy that is customisable on a per-user basis
if necessary.  No other AV solution offers even half the features and
configurability that MS does, and MS now scans and protects huge amounts
of mail in many many installations.  Our site was protected from
Sobig.whatever before the thing was even released, without needing to wait
for AV definitions to be updated.

Just felt that a little defence of MS was necessary.

Jethro.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Jethro R Binks
Computing Officer, IT Services
University Of Strathclyde, Glasgow, UK

Re: ATA failure with 4.6.2 & 250GB drive?

2003-10-14 Thread freebsd-stable

Hi Scott,

> OK, swapped out the cable (from an 80- to 40-wire one, as it happened,
> although that should make no difference on a UDMA33 controller).  Same
> errors appeared again while the backups were running.

FWIW, I have had _major_ problems (i.e. drive failure) when trying to use
an 80 wire cable on a UDMA33 board with an ATA66 drive - I would recommend
never using 80 wire cables on controllers that don't support UDMA66 or
better. The reverse, using 80 wire cables on UDMA66 (or better) controllers
with ATA33 drives is ok, however, and I've done that successfully.

Once bitten, twice shy :)

regards,
-- joel --
AusCERT, ITS, Uni of Qld, Australia -- hotline: [+61] [07] 33654417
my opinions in this email are not endorsed by AusCERT or Uni of Qld
this message may not be onforwarded without my expressed permission


Re: ATA failure with 4.6.2 & 250GB drive?

2003-10-14 Thread Dan Strick

> FWIW, I have had _major_ problems (i.e. drive failure) when trying to use
> an 80 wire cable on a UDMA33 board with an ATA66 drive - I would recommend
> never using 80 wire cables on controllers that don't support UDMA66 or
> better. The reverse, using 80 wire cables on UDMA66 (or better) controllers
> with ATA33 drives is ok, however, and I've done that successfully.

The only differences between 40- and 80-wire ATA cables are the extra 40
wires, which are all connected to ground, and pin 34 in the host connector,
which is not connected to the conductor in an 80-wire cable.  The ANSI
ATA/ATAPI-5 standard says that "80-conductor cable assemblies may be used
in place of 40-conductor cable assemblies to improve signal quality ..."
(section 4.2.2.1).

Unless someone can suggest a specific mechanism for the drive failures,
I would expect that your correlation between drive failures and using
80 conductor cables with UDMA33 controllers was just coincidence.

Dan Strick
[EMAIL PROTECTED]
