Mount error: "Specified device does not match mounted device"

2003-10-02 Thread jjf
Hello to all.  Sorry if this is a little terse, but a production machine is down
and I'm very tired.

Our mailserver (running 4.5-STABLE) died today, due to massive read errors
from /dev/sd0s1a -- the root partition.

I figured we'd just replace the primary drive and all would be well.

I put the spare drive into a 4.9 box -- all I had available -- and
created my slice and partitions.  I did not match the partition sizes
exactly with the previous drive, but in each case they were larger, so I
knew I'd have plenty of space.

So having made my slice and partitioned it, I pulled my backups off of
tape, transferred them to the 4.9 machine, and used 'tar xpzf' to write
them to the shiny new partitions.  All seemed well.

We put the new drive into the dead server and fired it up.  Boot was halted
due to an inability to mount a vinum device.  But set that vinum issue
aside for now.

'/' was mounted read-only, and I attempted to force it into read-write
with 'mount -f /', which has always worked for me.  But not this time.
I got: 

  mount: /dev/ad0s1a on /: specified device does not match
  mounted device.

First thing I did was to verify that I had the proper entry for '/' 
in /etc/fstab, and indeed I did.  

I was, however, able to successfully mount the other partitions I'd
created on the replacement disk.

I ran 'mount' after mounting the other partitions, and I saw a strange
thing.  All but one of the partitions were labeled with their full device
name (e.g. "/dev/ad0s1f on /usr (ufs, local)"), but it was different
for ad0s1a (the root partition).  It said "ad0s1a on / (read-only)".
What is the significance of the root partition being only
partially labeled like that?
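
One way to check an observation like this is to compare the device names
that mount reports against the device fields in /etc/fstab.  The sketch
below is illustrative only: it assumes the error comes from a plain string
mismatch between the two names (for example "ad0s1a" versus "/dev/ad0s1a"),
which this thread does not confirm.  The function names and the sample
fstab text are invented for the example.

```python
import re

# Matches lines such as "/dev/ad0s1f on /usr (ufs, local)".
MOUNT_LINE = re.compile(r"^(?P<dev>\S+) on (?P<mnt>\S+) \((?P<opts>[^)]*)\)$")


def parse_mount(output):
    """Return {mount_point: device} from mount-style output lines."""
    table = {}
    for line in output.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if match:
            table[match.group("mnt")] = match.group("dev")
    return table


def parse_fstab(text):
    """Return {mount_point: device} from fstab-style text, skipping comments."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 2:
            table[fields[1]] = fields[0]
    return table


def mismatches(fstab, mounted):
    """Return {mount_point: (fstab_device, mounted_device)} where names differ."""
    return {
        mnt: (fstab[mnt], dev)
        for mnt, dev in mounted.items()
        if mnt in fstab and fstab[mnt] != dev
    }


if __name__ == "__main__":
    # Hypothetical fstab contents; the mount lines mirror those quoted above.
    sample_fstab = "/dev/ad0s1a / ufs rw 1 1\n/dev/ad0s1f /usr ufs rw 2 2\n"
    sample_mount = "ad0s1a on / (read-only)\n/dev/ad0s1f on /usr (ufs, local)\n"
    print(mismatches(parse_fstab(sample_fstab), parse_mount(sample_mount)))
```

With the sample input only "/" is reported, because its mounted name lacks
the "/dev/" prefix that the fstab entry carries.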

Note: this drive was ad0 on the mail server, but was ad1 on the 4.9
machine.  Could this be the cause of the problem?  Do the drives
somehow cache their most recent system designation?

I was also wondering if maybe the problem is related to disk partitioning
on a 4.9 machine and sticking the disk into a 4.5 machine.

I have seen in the archives other people with this general problem, but
it seems that in those cases the problem was incorrect entries in /etc/fstab,
so we're very confused here.

Any help would be appreciated,

John
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Why is PCE not set in CR4?

2003-10-02 Thread Grumble
> > I have read the perfmon documentation and source code. For several
> > reasons, I do not think it is totally adequate in my situation.
>
> This is an extension to the i386_vm86() syscall which will let you turn
> PCE on and off if you're the superuser.
>
> Now that I think on this a bit more, a sysctl might be a better place to
> put this, but it seemed to belong with the i386_vm86() bits, rather than
> polluting initcpu.c right away.

Is vm86 related to virtual-8086 mode? Probably not... What does vm86 
stand for? Virtual machine?

> Mind you, if you're going to hack perfmon, perhaps putting this in initcpu
> isn't such a bad idea after all, with a loader tunable instead. That way
> perfmon can pick up on the tunable when attached by nexus during boot.

I am tempted to remove perfmon from the kernel, and write a kernel 
module for Athlon and another one for NetBurst.

Can a kernel module catch #UD (Invalid Opcode) and #GP (General 
Protection) exceptions generated from within the kernel module 
itself? Can I use sigaction(2)?

Can a kernel module catch a specific #GP exception generated from 
user land? Can I register a signal handler with sigaction(2)?

BTW, are performance-monitoring counters saved and restored on a 
context switch?

Shill



Re: Why is PCE not set in CR4?

2003-10-02 Thread Terry Lambert
Bruce M Simpson wrote:
> Now that I think on this a bit more, a sysctl might be a better place to
> put this, but it seemed to belong with the i386_vm86() bits, rather than
> polluting initcpu.c right away.

The important thing is to allow the kernel to intermediate and
control allocation of counters to applications, so where you put
it is less important than that it be a procedural interface.  A
sysctl can be a procedural interface, but it's kind of ugly.

-- Terry


Re: CAM suspend

2003-10-02 Thread Jia-Shiun Li
Since memory contents are kept across suspension,
I guess there is no need for da to take special care,
just as ad does not.
The actual suspend/resume methods belong to ata-pci.
For hardware devices to return to their previous state,
the correct place may be the SCSI HBA driver, such as ahc or sym.
Jia-Shiun.

Walter C. Pelissero wrote:
> Having noticed that there is not a big interest in it among the
> fellow FreeBSDers, I was about to set off and hack up the scsi
> subsystem to implement spindown on suspend and spinup on resume of the
> da devices, when I realized that there seems to be no hook in the SCSI
> code for these events.
> I'm not a device driver expert, so I'm looking for clues.
>
> What I mean is that the ata-pci driver, for instance, specifies hooks
> via the device_method_t structure, which is not available in the
> scsi_all or scsi_da modules.  I understand that they are simply
> different kinds of beasts (sitting on different layers of the kernel
> code), but I was wondering if there might be a similar mechanism to do
> what I want.
> So, what is the recommended way (if there is one) to hook a function
> of the SCSI subsystem to an event like suspend/resume?
> I would most appreciate it if anyone could point me to a suitable
> document or even anything related to FreeBSD kernel hacking.
> Cheers,



pam_opieaccess.so and opiepasswd -d

2003-10-02 Thread Eugene M. Kim
Greetings,

pam_opieaccess.so is documented to allow cleartext passwords (by
returning PAM_SUCCESS) when OPIE is disabled for the user.

However, on both -current and 4-stable, pam_opieaccess.so decides whether
OPIE is enabled only by checking for the user's record in
/etc/opiekeys.  Since a valid /etc/opiekeys record can also indicate
that the OPIE access is disabled (i.e. one runs opiepasswd -d to set the 
value field to `'), I guess the module should check this 
as well.

Currently this check is not performed, so with the pam_opie.so plus
pam_opieaccess.so combination, users with an explicitly disabled OPIE
record and a cleartext password won't be able to log in even when
/etc/opieaccess allows cleartext password logins.

Is the current behavior an intended feature, or should it be fixed (the 
patch would be trivial)?

Eugene



4.6.2-p23 and [tcp bad cksum]

2003-10-02 Thread Will Froning
After upgrading my 2 Sendmail servers to 4.6.2-p23 last week I ran into
an amazing issue sending mail to hotmail.com and msn.com addresses.

I noticed it because of the large queue for those domains building up
daily.  In the process of troubleshooting I noticed massive [bad tcp
cksum] messages in the tcpdump of a hotmail queue run.
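
For reference, tcpdump marks a segment this way when the checksum it
recomputes does not agree with the one carried in the TCP header.  The
following is a minimal sketch of that calculation, the ones-complement
Internet checksum described in RFC 1071.  For TCP the input would be the
pseudo-header (source address, destination address, protocol, TCP length)
followed by the whole segment; building that input from a capture is left
out here, and the function names are only illustrative.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones-complement Internet checksum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def checksum_ok(data_with_checksum: bytes) -> bool:
    """True if data that already contains its checksum field verifies."""
    return internet_checksum(data_with_checksum) == 0


if __name__ == "__main__":
    # The eight example bytes used in RFC 1071.
    payload = bytes.fromhex("0001f203f4f5f6f7")
    csum = internet_checksum(payload)
    print(hex(csum))
    print(checksum_ok(payload + struct.pack("!H", csum)))
```

A receiver runs the same sum over the data including the transmitted
checksum; a result of zero means the segment verifies.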

At first I thought it was a networking problem maybe on our side, maybe
theirs.  To save disk space I moved it to a 4.9-PRE box and ran the
queue for the heck of it.  Amazingly all messages went through.  I was
still experiencing [bad tcp cksum] messages, but at a much lower rate
(enough to process all the messages).

For the heck of it I upgraded the lower volume box to 4.8 (was on my
schedule for this weekend anyhow) and messages started being sent out to
hotmail.com and msn.com addresses.  Once that worked I upgraded the other
4.6.2 box and mail is going smoothly.

So, this problem didn't happen until -p23 (arp patch).  If anyone needs
more information please send me a message.

If needed I have a partial tcpdump of the working 4.9-PRE box with cksum
errors, but at a greatly reduced quantity.

Thanks,
Will

P.S. the lower volume box was running the patched sendmail that came w/ 4.6.2
(8.12.3p2 I think?). The other is running 8.12.10.

-- 
Will Froning
Unix Sys. Admin.
(209)946-7470
(209)662-4725
[EMAIL PROTECTED]


Re: 4.6.2-p23 and [tcp bad cksum]

2003-10-02 Thread Paul Saab
What NIC?


Re: 4.6.2-p23 and [tcp bad cksum]

2003-10-02 Thread Will Froning
> Which is which? 

Of the 2 boxes that didn't work, one had a Fiber EM and the other had a
copper BGE running at 100 full (Dell 1650 and 2550 respectively).

The 4.9-pre is a Fiber EM (dell 2600).

> Which one was causing large amount of checksum errors?

Both of the 4.6.2 boxes.

> Which one fixed it? etc etc etc etc.

Upgrading the 4.6.2 boxes to 4.8 fixed the problem and allowed the messages
to pass through.  Although I didn't record a tcpdump when it started
working, as it scrolled by I noticed a reduction in the number of
errors.

Shoot me an e-mail if you need more info.

Thanks,
Will

> Will Froning ([EMAIL PROTECTED]) wrote:
> > one was copper BGE the other a Fiber EM.
> > 
> > Will
> > 
> > On Thu, 2 Oct 2003 13:48:21 -0700
> > Paul Saab <[EMAIL PROTECTED]> wrote:
> > 
> > > What NIC?
> > 
> > 
> > -- 
> > Will Froning
> > Unix Sys. Admin.
> > (209)946-7470
> > (209)662-4725
> > [EMAIL PROTECTED]
> > 
> 
> -- 
> -ps


-- 
Will Froning
Unix Sys. Admin.
(209)946-7470
(209)662-4725
[EMAIL PROTECTED]


Re: tty layer and lbolt sleeps

2003-10-02 Thread Mike Durian
On Tuesday 16 September 2003 04:47 pm, Mike Durian wrote:
> I'm trying to implement a serial protocol that is timing sensitive.
> I'm noticing things like drains and reads blocking until the
> next kernel tick.  I believe this is due to the lbolt sleeps
> in the tty.c code.

Following up on my own post in case anyone was interested.

My assumption about the lbolt sleep was incorrect.  The delay
I'm seeing is not in the tty layer, it is in the sio driver.

If I change the tick count for the siobusycheck timeout
from (hz / 100) to just 1 and bump up HZ to 5000, I can get
some reasonable responsiveness with write and drain.
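
The arithmetic behind that change can be sketched as follows.  This assumes
a stock HZ of 100, which the post does not state, and the helper names are
made up for illustration.  It shows why raising HZ alone would not help:
a timeout of hz / 100 ticks is about 10 ms at any HZ, so the tick count
has to drop as well.

```python
def ticks_to_ms(ticks: int, hz: int) -> float:
    """Length of a timeout of `ticks` clock ticks, in milliseconds."""
    return 1000.0 * ticks / hz


def default_busycheck_ticks(hz: int) -> int:
    """The original timeout expression, hz / 100, with C integer division."""
    return hz // 100


if __name__ == "__main__":
    for hz in (100, 5000):  # 100 is an assumed default; 5000 is from the post
        original = ticks_to_ms(default_busycheck_ticks(hz), hz)
        single = ticks_to_ms(1, hz)
        print(f"HZ={hz}: hz/100 ticks = {original:.1f} ms, 1 tick = {single:.1f} ms")
```

At HZ=5000 a single tick is 0.2 ms, against roughly 10 ms for the original
expression at either setting.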

To get good responsiveness in the read direction, I need to force
the RX FIFO trigger level down to FIFO_RX_LOW.

After doing both those things, I can achieve the control I need.
However, I don't really like cranking up HZ just to get decent
sio(4) latencies.  I'm assuming the use of siobusycheck in a polled
manner is just an artifact from old crufty serial devices.  I
suppose uart(4) will clear this up when it is stable.

Adding an ioctl to set UART RX trigger levels would be something
I would find useful.  Perhaps others too.

I disagree with the following comment in the sio.c source:

 * Use a fifo trigger level low enough so that the input
 * latency from the fifo is less than about 16 msec and
 * the total latency is less than about 30 msec.  These
 * latencies are reasonable for humans.  Serial comms
 * protocols shouldn't expect anything better since modem
 * latencies are larger.

It makes the tacit assumption that all serial protocols go
through a modem and thus latency isn't important.  I suspect
I'm not the only person out there using a serial port that
isn't connected to a modem or a terminal.
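
To put rough numbers on the trigger-level tradeoff, here is a small sketch
of how long a receive FIFO takes to reach its trigger level.  It assumes
10 bits per character (8N1 framing) and the 1, 4, 8 and 14 byte trigger
levels of a 16550A-style UART; neither figure appears in the post, and the
function names are hypothetical.

```python
BITS_PER_CHAR = 10              # assumption: start bit + 8 data bits + stop bit
TRIGGER_LEVELS = (1, 4, 8, 14)  # assumption: 16550A receive trigger levels


def fifo_fill_ms(trigger_bytes, baud, bits_per_char=BITS_PER_CHAR):
    """Time in ms for `trigger_bytes` back-to-back characters to arrive."""
    return 1000.0 * trigger_bytes * bits_per_char / baud


def highest_trigger_within(budget_ms, baud):
    """Largest trigger level whose fill time fits the budget, else None."""
    fitting = [t for t in TRIGGER_LEVELS if fifo_fill_ms(t, baud) <= budget_ms]
    return max(fitting) if fitting else None


if __name__ == "__main__":
    for baud in (9600, 115200):
        for level in TRIGGER_LEVELS:
            print(f"{baud} baud, trigger {level:2d}: {fifo_fill_ms(level, baud):.3f} ms")
```

Under these assumptions a 14-byte trigger at 9600 baud fits inside the
16 msec figure from the sio.c comment, while a protocol with a 1 ms budget
at 115200 baud would need a trigger level of 8 bytes or lower.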

mike




Re: Why is PCE not set in CR4?

2003-10-02 Thread Bruce M Simpson
On Thu, Oct 02, 2003 at 02:57:03PM +0200, Grumble wrote:
> Is vm86 related to virtual-8086 mode? Probably not... What does vm86 
> stand for? Virtual machine?

vm86 is something of a catchall for vm86-related functions. One of the
things it implements is a means of getting in and out of Virtual 8086
mode from a userland process. doscmd(1) uses this, as does my s3switch
port for getting into an S3 card's video BIOS to execute the functions
required to enable the video-out port.

A few other knobs exist in there for dealing with i386-specific things,
such as permitting access to an IO port range for a user process (by
changing the appropriate state in the TSS).

> I am tempted to remove perfmon from the kernel, and write a kernel 
> module for Athlon and another one for NetBurst.

I would ask you to please consider patching perfmon to do what you
need it to do.

> Can a kernel module catch #UD (Invalid Opcode) and #GP (General 
> Protection) exceptions generated from within the kernel module 
> itself? Can I use sigaction(2)?
> Can a kernel module catch a specific #GP exception generated from 
> user land? Can I register a signal handler with sigaction(2)?

What'll happen is that DDB will most likely catch the exception, unless
you specifically patch trap.c to catch those exceptions. You should also
look at the special exception handlers in identcpu.c.

> BTW, are performance-monitoring counters saved and restored on a 
> context switch?

Look at i386's definition of cpu_switch(). You'll need to find some
unused space in the TSS and do it yourself.

BMS


Re: Why is PCE not set in CR4?

2003-10-02 Thread Terry Lambert
Bruce M Simpson wrote:
> On Wed, Oct 01, 2003 at 11:39:36AM +0200, Grumble wrote:
> > >>However, I am not allowed to use the RDPMC instruction from ring 3
> > >>because the PCE (Performance-monitoring Counters Enable) bit is not set.
> > >
> > >You can do it with /dev/perfmon. man 4 perfmon.
> >
> > I have read the perfmon documentation and source code. For several
> > reasons, I do not think it is totally adequate in my situation.
[ ... ]
> 
> This is an extension to the i386_vm86() syscall which will let you turn
> PCE on and off if you're the superuser.

I like this a lot better.

To answer the inevitable question of "why": PCE counters are a
scarce resource, and the kernel needs to run interference on
their allocation and deallocation by user space applications, to
avoid collisions between applications; this is the same reason
we have AGP and sound card device drivers in the kernel.
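
The kind of mediation meant here can be pictured with a toy model.  The
sketch below is not kernel code and not any existing FreeBSD interface; the
class name, its methods and the counter count are all hypothetical.  It
only illustrates the idea of a single owner handing out a fixed set of
counters so that two applications never program the same one.

```python
class CounterPool:
    """Hand out a fixed set of counter indices, one owner per counter."""

    def __init__(self, num_counters):
        # None marks a free counter; otherwise the slot holds the owner id.
        self._owner = [None] * num_counters

    def allocate(self, owner):
        """Return a free counter index for owner, or None if all are busy."""
        for idx, current in enumerate(self._owner):
            if current is None:
                self._owner[idx] = owner
                return idx
        return None

    def release(self, idx, owner):
        """Free counter idx; refuse if owner does not currently hold it."""
        if self._owner[idx] != owner:
            raise PermissionError(f"counter {idx} is not held by {owner!r}")
        self._owner[idx] = None


if __name__ == "__main__":
    pool = CounterPool(2)  # illustrative size only
    print(pool.allocate("app-a"), pool.allocate("app-b"), pool.allocate("app-c"))
```

A third request is refused until one of the first two owners releases its
counter, which is the collision the paragraph above is concerned with.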

I'm not sure if restricting this to root users is exactly
necessary, but it can't hurt, given that there is a performance
denial of service possible otherwise.

-- Terry