from:"Kevin Day"

Re: Unkillable processes

1999-07-24 Thread Kevin Day


> I've got myself two processes which can't be gotten rid of by SIGKILL:
> 
> kkenn  92724 32.0  0.8  5736  356  ??  RN    6:25PM 136:52.96 kvt -T Terminal -
> kkenn   1103  0.0  0.0  5740  388  ??  TWN   -        0:00.00 (kvt)
> 
> (kvt is the KDE 1.1.1 xterm)
> 
> I am able to trigger this by attempting to paste the contents of a large
> buffer from xemacs (v21.1 from ports) into the pico editor from pine4.
> 
> Any ideas before I recompile kvt with -g and try and track down what it's
> doing?
> 
> Kris
> 
> 

For one, do another 'ps' with the 'l' option, so you can see what it's stuck
on.

The second process is a zombie, which isn't killable until the parent tells
it to go away. (Which could very possibly be the first kvt)


Kevin


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message

Re: Unkillable processes

1999-07-24 Thread Kevin Day

> On Sat, 24 Jul 1999, Kevin Day wrote:
> 
> > For one, do another 'ps' with the 'l' option, so you can see what it's stuck
> > on.
> 
>   UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT  TT       TIME COMMAND
>  1000  1103  1086  29  75 20  5740  384 -      TWN   ??    0:00.00 (kvt)
>  1000  1109  1103   0   4  0  1504    0 ttywri IWs+  p1    0:00.00 (tcsh)
> 
>  1000 92724  1086 279 105 20  5736  356 -      RN    ??  139:40.13 kvt -T Termi
>  1000 92743 92724   2  18  0  1576    0 pause  IWs   p8    0:00.00 (tcsh)
> 
> > The second process is a zombie, which isn't killable until the parent tells
> > it to go away. (Which could very possibly be the first kvt)
> 
> Both still present empty terminal windows on my desktop and were spawned 
> from the KDE panel. The second one was running a copy of pine and was in
> the same state as the other initially, until I kill -KILL'ed the pine
> process, at which point it changed to what it is now.
> 
> Kris

Well, since the CPU time in the active process (92724) went up since your
last e-mail, and it's in the RUN state (a - in the WCHAN and a R in the
STAT), it looks like the process is just spinning, eating CPU.

The tcsh listed below that is a zombie of the running kvt. If you can
somehow kill that kvt, the tcsh will go away.

The top kvt (1103) is also a zombie, waiting for its parent to reap it.
Whatever process 1086 is, it decided not to clean it up; you may want to see
what it's doing.

Will process 92724 die if you kill -9 it?

This seems to be more of a kvt bug than a freebsd bug. :)

Kevin


Re: Unkillable processes

1999-07-24 Thread Kevin Day


> On Saturday, 24 July 1999 at 20:51:37 -0500, Kevin Day wrote:
> >> On Sat, 24 Jul 1999, Kevin Day wrote:
> >>
> >>> For one, do another 'ps' with the 'l' option, so you can see what it's stuck
> >>> on.
> >>
> >>   UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT  TT       TIME COMMAND
> >>  1000  1103  1086  29  75 20  5740  384 -      TWN   ??    0:00.00 (kvt)
> >>  1000  1109  1103   0   4  0  1504    0 ttywri IWs+  p1    0:00.00 (tcsh)
> >>
> >>  1000 92724  1086 279 105 20  5736  356 -      RN    ??  139:40.13 kvt -T Termi
> >>  1000 92743 92724   2  18  0  1576    0 pause  IWs   p8    0:00.00 (tcsh)
> >>
> > Well, since the CPU time in the active process (92724) went up since your
> > last e-mail, and it's in the RUN state (a - in the WCHAN and a R in the
> > STAT), it looks like the process is just spinning, eating CPU.
> 
> Right.
> 
> > The tcsh listed below that is a zombie of the running kvt. 
> 
> There aren't any zombies here.  
> 
> It's a child of the kvt.  It's not a zombie.  Take a look at the STAT
> field (and ps(1)): process 

Good point, I didn't notice that; I saw the ()'s from his first message,

> Process 92724 is runnable, nice and running (no WCHAN).  I really
> don't understand why you can't stop this one.

The only time I've seen this is when my console is getting flooded with
'vm_fault: pager error' messages for that process. Otherwise, there's no
reason why a running process can't be killed, correct?
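
Since this thread turns on reading the STAT column, here is a minimal sketch
of decoding it. The letter meanings are the ones documented in BSD ps(1) and
are worth checking against the local manual page; the helper names
(decode_stat, is_zombie) are made up for illustration.

```python
# Hypothetical helper for reading the STAT column of BSD ps(1).
# The first letter is the run state; the remaining letters are flags.
PRIMARY = {
    "D": "uninterruptible (disk) wait",
    "I": "idle (sleeping for a long time)",
    "R": "runnable",
    "S": "sleeping (short sleep)",
    "T": "stopped",
    "Z": "zombie",
}

FLAGS = {
    "W": "swapped out",
    "N": "niced (reduced priority)",
    "<": "raised priority",
    "s": "session leader",
    "+": "in the foreground process group of its terminal",
}


def decode_stat(stat):
    """Return (state, [flags]) for a STAT string such as 'TWN' or 'IWs+'."""
    state = PRIMARY.get(stat[0], "unknown state " + stat[0])
    flags = [FLAGS.get(c, "unknown flag " + c) for c in stat[1:]]
    return state, flags


def is_zombie(stat):
    """A process is a zombie only if its state letter is Z."""
    return stat.startswith("Z")


if __name__ == "__main__":
    for stat in ("TWN", "IWs+", "RN", "IWs"):
        print(stat, decode_stat(stat), "zombie" if is_zombie(stat) else "not a zombie")
```

Run against the four STAT values in this thread, none of them decodes as a
zombie: 1103 is stopped, 92724 is runnable, and both tcsh processes are idle.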

Kevin



Re: mountpoint locking with fbsd-nfs

1999-08-01 Thread Kevin Day


> Well, theoretically there is nothing wrong going on since you can mount
> things on top of an NFS directory.  Mount only complains about 
> duplicate normal partition mounts because it can't open the buffered
> block device the second time.  NFS doesn't care how many times a 
> directory is imported or exported.
> 
>   -Matt
>   Matthew Dillon 
>   <[EMAIL PROTECTED]>
> 
> 

Are you sure you can export a directory multiple times? I can't even
export two directories under the same filesystem.

su-2.03# mount
/dev/wd0s1a on / (NFS exported, local, noatime, soft-updates, writes: sync 3945 async 1317317)
procfs on /proc (local)
su-2.03# cat /etc/exports

/var home
/var/tmp home
su-2.03# mountd
Aug  1 22:43:01 celery mountd[46042]: can't change attributes for /var/tmp
Aug  1 22:43:01 celery mountd[46042]: bad exports list line /var/tmp home 



It actually exported /, which may not have been what I wanted. :)

Or did I misunderstand you?

Kevin



Re: mountpoint locking with fbsd-nfs

1999-08-01 Thread Kevin Day


> 
> To export a single filesystem multiple times, *all* of the attributes must
> be the same.  If they aren't the only person you are fooling is yourself, 
> since once a filesystem is NFS exported, it is open to the world.
> 
> anyway the syntax for what you want is:
> 
> /var /var/mail some.machine
> 

Ahh.. That was a bad example I gave anyway... I wanted to have say... /a
exported to a few machines, and /b exported to only one machine... Couldn't
do it, which was kinda annoying. :)

Kevin



Re: mountpoint locking with fbsd-nfs

1999-08-01 Thread Kevin Day


> 
> You misunderstood me.  The problem you have is the fact that NFS exports
> are usually limited to the physical mount point of the filesystem being
> exported.   Thus it thinks that /var above is the same as /, or that
> /var/tmp is the same as /var if both happen to be in the same partition.
> Mount gets confused by that when you specify what it believes to be the
> same partition several times in the exports list.
> 
> You can use the '-alldirs' flag in the exports list to export a partition
> and allow any subdirectory within that partition to be mounted instead of
> the partition itself.  There may be a way to export several specific
> subdirectories in the same partition but I'm not sure.
> 
> I was talking about things like:
> 
>   mount apollo:/usr   m1
>   mount apollo:/usr   m2
>   mount apollo:/usr   m3
>   mount apollo:/usr   m3
>   mount apollo:/usr   m3
> 
> I can import a filesystem as many times as I want, and even overlay mount
> points.
> 

Yeah, I know about -alldirs... The problem was that we had customers who
wanted us to export their home directories, and unless I gave them their own
filesystem, I couldn't restrict it in the manner I wanted. :)

Just checking to see that I wasn't missing a way to do this. :)
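
For anyone hitting the same mountd errors, here is a rough sketch of a check
for the rule that seems to be biting here: as exports(5) describes it, a host
may be named only once per local filesystem. The function name, the
(path, hosts) tuples, and the fs_of lookup are all hypothetical; on a real
system fs_of could be something like lambda p: os.stat(p).st_dev.

```python
from collections import defaultdict


def duplicate_hosts(exports, fs_of):
    """Find hosts listed more than once for the same local filesystem.

    exports: list of (path, [hosts]) pairs, one per exports line.
    fs_of:   callable mapping a path to an identifier for its filesystem.
    Returns {(filesystem, host): [paths...]} for every clash.
    """
    seen = defaultdict(lambda: defaultdict(list))  # fs -> host -> paths
    for path, hosts in exports:
        for host in hosts:
            seen[fs_of(path)][host].append(path)
    return {
        (fs, host): paths
        for fs, by_host in seen.items()
        for host, paths in by_host.items()
        if len(paths) > 1
    }


if __name__ == "__main__":
    # Both directories live on the root filesystem in the example above.
    layout = {"/var": "/dev/wd0s1a", "/var/tmp": "/dev/wd0s1a"}
    exports = [("/var", ["home"]), ("/var/tmp", ["home"])]
    print(duplicate_hosts(exports, layout.get))
```

With the two-line exports file from earlier in this thread, the check reports
host "home" twice on /dev/wd0s1a, which lines up with mountd rejecting the
second line.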

Kevin



Re: mountpoint locking with fbsd-nfs

1999-08-01 Thread Kevin Day


> 
> :Yeah, I know about -alldirs... The problem was that we had customers who
> :wanted us to export their home directories, and unless I gave them their own
> :filesystem, I couldn't restrict it in the manner i wanted. :)
> :
> :Just checking to see that I wasn't missing a way to do this. :)
> :
> :Kevin
> 
> I've never in my life tried this - it probably won't work, but ...
> use the null device maybe to create a mount point for each home
> dir and then export that? 
> 

I think it sees through this.

su-2.03# cat /etc/exports
/var home
/mnt home
su-2.03# mount
/dev/wd0s1a on / (NFS exported, local, noatime, soft-updates, writes: sync 3970 async 1321097)
procfs on /proc (local)
nfs:/home on /usr/home (noatime)
nfs:/var/mail on /var/mail (noatime)
/var/tmp on /mnt (local)
su-2.03# mountd
Aug  1 23:17:48 celery mountd[89177]: can't change attributes for /mnt

That was a very good idea though, I'd never have thought of it. :)

I'll have to play with this more. :)

Kevin



Re: "The Matrix" screensaver, v.0.2

1999-08-21 Thread Kevin Day


At 10:26 AM 8/21/99 +0100, Nik Clayton wrote:
>On Fri, Aug 20, 1999 at 07:34:31PM +0200, Andrzej Bialecki wrote:
> > Both versions are available at:
> >
> >   http://www.freebsd.org/~abial/matrix_3.2.tgz
> >   http://www.freebsd.org/~abial/matrix_4.0.tgz
>FWIW, there are at least two other 'matrix' implementations out there.
>One is part of xscreensaver, and is quite nice -- it's even better if you
>halve the size of the image it's using first.  This has the advantage that
>the characters actually look like the ones in the film (reversed numbers
>and Japanese katana (sp?) characters).  That one's (obviously) X only.
>
>The other is 'cmatrix'.  A web search should turn it up.  As the name
>implies, this is a console version.

For those of you using Windows or Mac OS:

http://www.whatisthematrix.com/cmp/screensaver_index.html

That's the 'official' screen saver. (The Windows version uses some kind of
Shockwave runtime and eats nearly 100% CPU, but it looks authentic.)

Kevin




Re: "The Matrix" screensaver, v.0.2

1999-08-21 Thread Kevin Day


> > Anyway, this module was meant more as a joke, but if you guys like it so
> > much you could vote for putting it in the tree...
>
>What do you mean "vote"? I was waiting for it to show up on my tree
>after a cvsup!

I hate to keep bringing things like this up, or start a legal war, but this 
screensaver is more than likely a copyright and/or trademark violation, and 
bringing it into the source tree may not be a good idea. Yes, lots of 
people may be making things like this, but it would probably be best to 
distance FreeBSD itself from such a thing.

Kevin

(speaking as an employee of a company whose products are frequently
infringed on, and who has been through this exact situation before, except
from the other side)




Re: "The Matrix" screensaver, v.0.2

1999-08-22 Thread Kevin Day

>On Sun, 22 Aug 1999, Andrzej Bialecki wrote:
>
> > On Sat, 21 Aug 1999, Kevin Day wrote:
> >
> > >
> > > I hate to keep bringing things like this up, or start a legal war, but this
> > > screensaver is more than likely a copyright and/or trademark violation, and
> > > bringing it into the source tree may not be a good idea. Yes, lots of
> > > people may be making things like this, but it would probably be best to
> > > distance FreeBSD itself from such a thing.
> >
> > You can trademark the title "The Matrix", but you can't trademark a common
> > word "matrix". That's the only word I use for the name of the module. As
> > Daniel mentioned, they even can't claim that it's their idea.
> >
> > So I think I can pretty safely import it.
> >
>
>If we wanted to be legally paranoid we would call it the "letter" saver
>and add as the comment the words

It's not just the name "Matrix" though. Make a screen saver of the Superman 
'S' logo, and see how quickly a certain comic book company comes after you. 
:) Making a derivative work based on something that was in a movie probably 
is a copyright violation. Warner Brothers could easily say that you've 
copied an element from their movie (even if it's not the entire movie), and 
even go so far as to get a judge to get any CD-ROM distributors of FreeBSD 
to recall all unsold CDs and destroy them.

As for the trademark issue, it doesn't have to be a name to be trademarked. 
Logos, effects, and even sounds can be trademarked.

I'm really not trying to be annoying about things like this, but I already 
had to fight for the ability to be able to use FreeBSD at work, after they 
discovered other copyright/trademark violations in the source tree.. (Trek, 
etc). "If they'll steal things here, how do we know the entire kernel isn't 
stolen from somewhere else?" Yes, it's silly logic, but they do sort of 
have a point. We're selling a product with FreeBSD embedded in it. Should 
some copyright/patent holder come up proving that the VM system is his, and 
FreeBSD stole it from him, they could legally force us to recall every 
machine we've sold, and replace it with non-infringing materials. Obviously 
we're not shipping 'trek' on our system, and wouldn't include the matrix 
saver anyway, but I (for completely selfish reasons) would like to keep 
FreeBSD distanced from anything that could possibly be infringing on 
anything, and let you download it from somewhere else if you want it. :)

Kevin


Re: "The Matrix" screensaver, v.0.2

1999-08-22 Thread Kevin Day

At 06:44 PM 8/22/99 +0200, Andrzej Bialecki wrote:
>On Sun, 22 Aug 1999, Kevin Day wrote:
>
>[trademark violation warning]
>
>Ok, maybe you have a point - I dont know, I'm not a lawyer. But with this
>line of reasoning they could claim that anything using falling letters
>effect on your screen partly violates they trademarked special effect,
>which is silly.
>
>What can we do, then? Why don't we ask them politely if it's ok?
>
>Andrzej Bialecki

While I can't speak for how Warner Brothers' lawyers are, as a general 
rule.. "It's much easier to say No, than it is to say Yes, and regret it 
later." If you ask, you'll probably get a No.

It's not that you made a falling letter effect, it's that you made a 
falling letter effect to copy the effect in the movie, and it does look 
very much like what's in the movie. That could be called willful 
infringement. While I doubt they'd stop a fan from making something like 
this, (I don't know this for a fact though, see what Paramount did with 
Trek sites before, or Mattel with Barbie) they may be more inclined to take
action against a product being sold that contains it. (FreeBSD being sold 
by Walnut Creek and others).

If this is distributed as a fan-based thing, the worst they'd likely do is
say "Take it down." If this is on thousands of FreeBSD CDs, it could
become a financial problem if they want to take it far enough.

This is just my opinion though, and not to be used as legal advice for 
anyone. I just don't want FreeBSD to become a ball of intellectual property 
infringements. :)

Kevin


Re: 2 hours to compile mysql?

2000-01-03 Thread Kevin Day


> > Is amount of ram available (portably) to configure?
> > So configure could decide to use --low-memory by itself? Allowing
> > overrides, naturally.
> > 
> > Leif
> > 
> 
> There is actually a method to portably guess how much RAM your have available
> from configure -- just write a small C program that will keep malloc()-ing until
> it gets an error, but I do not think it is worth the effort.
> 
> -- 
> Sasha Pachev


How much RAM you have and how much RAM+swap you have before you hit your
limit are quite different. :)

# sysctl hw.physmem
hw.physmem: 400883712

This will return the amount of RAM you have minus your kernel size, though.
Perhaps helpful if you really want to do this. :)
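
If a configure-style script did want to use that number, turning the sysctl
output into something comparable is straightforward. This is only a sketch:
the function names are made up, and it parses a line of text in the format
shown above rather than calling sysctl itself.

```python
def parse_physmem(line):
    """Parse a line such as 'hw.physmem: 400883712' into a byte count."""
    name, sep, value = line.partition(":")
    if not sep or name.strip() != "hw.physmem":
        raise ValueError("not a hw.physmem line: %r" % line)
    return int(value.strip())


def to_mib(nbytes):
    """Convert bytes to mebibytes."""
    return nbytes / (1024.0 * 1024.0)


if __name__ == "__main__":
    physmem = parse_physmem("hw.physmem: 400883712")
    print("%d bytes is about %.1f MiB" % (physmem, to_mib(physmem)))
```

Whatever threshold a build would compare this against (for deciding on
something like --low-memory) is a separate judgment call.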




Re: ascii art in hosts.allow

2000-01-24 Thread Kevin Day


> 
> On Tue, Jan 25, 2000 at 03:03:32PM +1100, Andy Farkas wrote:
> > Here is a patch:  (please notice the spelling correction)
> 
> Where?  I just ran ispell on src/etc/hosts.allow and it didn't catch
> anything.

A more direct patch would have been:

-# NOTE: The hosts.deny file is not longer used.  Instead, put both 'allow'
+# NOTE: The hosts.deny file is no longer used.  Instead, put both 'allow'


:)


Kevin



Re: XFree86 3.9.18

2000-02-23 Thread Kevin Day


> 
> That's odd... I just built it tonight, and I havn't had anything but this:
> 
> XFree86-Bigfont extension: shmat() failed, size = 4096, errno = 24
> XFree86-Bigfont extension: shmat() failed, size = 4096, errno = 24
> XFree86-Bigfont extension: shmat() failed, size = 4096, errno = 24
> XFree86-Bigfont extension: shmat() failed, size = 4096, errno = 24
> XFree86-Bigfont extension: shmat() failed, size = 4096, errno = 24
> XFree86-Bigfont extension: shmat() failed, size = 4096, errno = 24
> XFree86-Bigfont extension: shmat() failed, size = 4096, errno = 24
> 
> Which doesn't cause the server to crash, in fact, it seems pretty stable.
> 

 24 EMFILE Too many open files.  Getdtablesize(2) will obtain the
 current limit.



You can probably bump up your ulimit, and make this go away, too.
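
A quick way to look up an errno value and see the current descriptor limit,
as a sketch using only the Python standard library (the function names are
made up). One caveat worth checking: the FreeBSD shmat(2) manual page
documents EMFILE as the per-process shared memory segment limit
(kern.ipc.shmseg) being reached, so the descriptor ulimit may not be the
limit that matters for this particular message.

```python
import errno
import os
import resource


def describe_errno(code):
    """Return 'NAME: message' for a numeric errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return "%s: %s" % (name, os.strerror(code))


def descriptor_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print(describe_errno(24))
    print("open file limits (soft, hard):", descriptor_limits())
```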

Kevin



lo0 tcp connections in TIME_WAIT/LAST_ACK/FIN_WAIT?

2000-02-28 Thread Kevin Day




After upgrading from 3.4 to RC2, I'm noticing something that I never saw
before:


Active Internet connections (including servers)
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
tcp        0      0  127.0.0.1.4954         127.0.0.1.4242         SYN_SENT
tcp        0      0  127.0.0.1.4953         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4952         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4951         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4950         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4949         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4948         127.0.0.1.4242         LAST_ACK
tcp        0      0  127.0.0.1.4947         127.0.0.1.4242         CLOSE_WAIT
tcp        0      0  127.0.0.1.4945         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4944         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4942         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4940         127.0.0.1.4242         FIN_WAIT_1
tcp        0      0  127.0.0.1.4938         127.0.0.1.4242         FIN_WAIT_1
tcp        0      0  127.0.0.1.4937         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4936         127.0.0.1.4242         TIME_WAIT


Are tcp connections going through lo0 ever supposed to end up like this? I
thought everything that went through lo0 was supposed to be.. well..
instant and mostly lossless.  Any ideas?
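
To keep an eye on whether these pile up, a small tally of states per netstat
run is handy. A minimal sketch, assuming the state is the last
whitespace-separated field on each tcp line; the function name is made up.

```python
from collections import Counter


def tally_states(netstat_output):
    """Count TCP connection states in the text output of netstat."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a tcp socket line.
        if len(fields) >= 2 and fields[0].startswith("tcp"):
            counts[fields[-1]] += 1
    return counts


if __name__ == "__main__":
    sample = """Active Internet connections (including servers)
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
tcp        0      0  127.0.0.1.4954         127.0.0.1.4242         SYN_SENT
tcp        0      0  127.0.0.1.4953         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4948         127.0.0.1.4242         LAST_ACK
tcp        0      0  127.0.0.1.4940         127.0.0.1.4242         FIN_WAIT_1
"""
    for state, count in sorted(tally_states(sample).items()):
        print(state, count)
```

Run over the full listing above, this gives 10 TIME_WAIT, 2 FIN_WAIT_1, and
one each of SYN_SENT, LAST_ACK and CLOSE_WAIT.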


Kevin





Re: current.freebsd.org (FTP)

2000-02-29 Thread Kevin Day

> 
> 
> > 
> > Forrest Aldrich wrote:
> > 
> > > Is not allowing anonymous ftp logins.
> > > 
> > > To Unsubscribe: send mail to [EMAIL PROTECTED]
> > > with "unsubscribe freebsd-current" in the body of the message
> > 
> > I noticed this too...  Maybe there's too many users, and is refusing
> > connections?  Hmmm...
> 
>   ???  a little more information would be very welcome.
> 
> jmb

Here's what I see:

# ftp current.freebsd.org
Connected to usw2.freebsd.org.
220 usw2.freebsd.org FTP server (Version wu-2.6.0(1) Tue Jan 25 00:05:38 CST 2000) ready.
Name (current.freebsd.org:toasty): ftp
331 Guest login ok, send your complete e-mail address as password.
Password:
530 Login incorrect.
ftp: Login failed.

Kevin


Re: lo0 tcp connections in TIME_WAIT/LAST_ACK/FIN_WAIT?

2000-03-02 Thread Kevin Day


> 
> > After upgrading from 3.4 to RC2, i'm noticing something that I never saw
> > before:
> > 
> > Active Internet connections (including servers)
> > Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
> > tcp        0      0  127.0.0.1.4954         127.0.0.1.4242         SYN_SENT
> > tcp        0      0  127.0.0.1.4953         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4952         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4951         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4950         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4949         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4948         127.0.0.1.4242         LAST_ACK
> > tcp        0      0  127.0.0.1.4947         127.0.0.1.4242         CLOSE_WAIT
> > tcp        0      0  127.0.0.1.4945         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4944         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4942         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4940         127.0.0.1.4242         FIN_WAIT_1
> > tcp        0      0  127.0.0.1.4938         127.0.0.1.4242         FIN_WAIT_1
> > tcp        0      0  127.0.0.1.4937         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4936         127.0.0.1.4242         TIME_WAIT
> > 
> > 
> > Are tcp connections going through lo0 ever supposed to end up like this? I
> > thought everything that went through lo0 was supposed to be.. well..
> > instant and mostly lossless.  Any ideas?
> > 
> > Kevin
> 
> Hi,
> does that happen for any apps?
> Could you please give me info about what is the apps which use
> the port 4242?
> 
> Thanks,
> Yoshinobu Inoue
> 

Right now, it only seems to be happening for bbd (part of
/usr/ports/net/bb), when local connections are talking to bbd. (I moved bbd
to port 4242; its default is port 1984.)

Doing an ifconfig lo0 down ; ifconfig lo0 up seems to have cleared them,
too.

Kevin



Re: Recent kernel hangs during boot with pnp sio.

1999-10-05 Thread Kevin Day


> 
> > Afaik all 3C509B's are PnP. At least here in the UK there is not
> > shortage of those cards.
> 
> If I can get a difinitive statement to this effect then I'll grab a
> 3c509B.  There was some question as to them actually being PnP though.
> 

Yes, the 3c509B can have PnP turned on or off through a DOS utility. Either
you set an IO/IRQ setting, or you set it to PnP and let the system do it. (I
believe they come with PnP enabled now, but before the default was off)

Kevin



Re: make install trick

1999-10-06 Thread Kevin Day


> 
> On Wed, Oct 06, 1999 at 02:57:23PM +1000, a little birdie told me
> that Peter Jeremy remarked
> > 
> > I guess we disagree on this.  My feeling is that write activity on
> > root should be minimised to minimise the risk that root will be
> > inconsistent following a crash.
> 
> Indeed.
> Thus:
> /dev/da0s1a on / (local, synchronous, writes: sync 32 async 15100)
>  ^^^
> 
> Though I'm still waiting for an explanation of WHY exactly I have async
> writes on a sync partition.   Nobody yet has said anything but 'that's
> interesting...'.  A direction to look would be helpful.
> 

My understanding was that that was just an indication of writes that were
able to be done asynchronously without any risk, so they were done async.

(sync isn't purely sync, only synchronous when it's required for integrity)

Kevin



Re: vga driver and signal

1999-01-02 Thread Kevin Day


> > Kind of complex though. Also the interrupt latency problem is still there.
> 
> Not sure that this is as elegant as what you are suggesting , can 
> the kernel schedule a user level routine to be executed when an interrupt 
> occurs? I guess on Windoze land this is called a driver call-back.
> 
> 

In a project I'm working on now (that some of you saw at FreeBSDCon) I had a
need to sync a lot of things to a vsync interrupt. I ended up writing a
small driver to attach to the video card.  My program would do a blocking
read on the device, which would put that process to sleep. The interrupt
handler would shove one byte of data back to the process through the read
(indicating interrupt status) and wake up the process.

This works, but still has a problem of latency and missed interrupts if you
aren't reading when the interrupt happens. (I've worked around those too,
but that's quite a bit more involved to fix it). You'll probably need to end
up changing the scheduler slightly, or playing with rtprio.
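
The user side of that pattern can be sketched as below. This is not the
driver itself: a pipe stands in for the device node, a thread stands in for
the interrupt handler, and all the names are hypothetical.

```python
import os
import threading
import time


def wait_for_interrupt(fd):
    """Block in read() until one status byte arrives, then return it."""
    data = os.read(fd, 1)  # the process sleeps here until a byte is written
    if not data:
        raise EOFError("device closed")
    return data[0]


def fake_interrupt_source(fd, status, delay):
    """Stand-in for the interrupt handler: push one status byte after a delay."""
    time.sleep(delay)
    os.write(fd, bytes([status]))


def demo():
    read_fd, write_fd = os.pipe()
    source = threading.Thread(target=fake_interrupt_source,
                              args=(write_fd, 0x01, 0.05))
    source.start()
    status = wait_for_interrupt(read_fd)
    source.join()
    os.close(read_fd)
    os.close(write_fd)
    return status


if __name__ == "__main__":
    print("woke up with interrupt status", demo())
```

One difference from the real thing: a pipe queues bytes for a reader that
shows up late, whereas the driver described above loses the event if nobody
is blocked in read() when the interrupt fires.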

Kevin



Re: Intel 810?

1999-12-06 Thread Kevin Day


> 
> > I recently got a quote from a hardware vendor which made the following
> > claim:
> > 
> > > All Socket 370PGA Motherboards use either the 810 or [the] 810c chip
> > > set which does not support FreeBSD because 16MB of the motherboard
> > > memory is used for the display controller.  There is no way to tell
> > > the FreeBSD kernel not to use this memory so it will corrupt data.
> > 
> > I find this statement rather dubious.  Can anyone out there say with
> > more certainty?
> 
> I can say with certainty that there are S370PGA boards that don't use the 
> 810; we have a number inhouse here that use the 440BX for example.
> 
> I'd be quite surprised if the 16MB shared video aperture wasn't correctly 
> described by the PnP data; this may require 4.x or 3.x with VM86 defined 
> to deal with it "right".  If nobody else has any commentary on this, 
> we'll get one into the lab.
> 

I'm using a Socket 370 board with a 440ZX and one with a 440BX, they both
work fine... I've got an 810c and an 810e that I'll try later today.

Kevin



Re: Intel 810?

1999-12-06 Thread Kevin Day


> 
> < said:
> 
> > As others have stated, Socket370 boards arent all 810/810c...my 4.0-Current
> 
> The important issue to me is: will FreeBSD work on an 810 motherboard?
> The reason I care is because I need the form-factor (a 1U-high
> server); if I am to use some alternate motherboard, I'll need to be
> certain in advance that it will fit in a MicroATX opening.
> 

Have you considered NLX or LPX form factors? I can dig up the specs if you
want, Intel makes motherboards in both form factors.

Kevin



Re: Serious server-side NFS problem

1999-12-15 Thread Kevin Day

> 
> In message <[EMAIL PROTECTED]>, Matthew Dillon writes:
> >
> >:>NFS uses the kernel 'boottime' structure to generate its version id.
> >:>Now normally you might believe that this structure, once set, will
> >:>never change.  The authors of NFS certainly make that assumption!
> >:
> >:Is this another case of "lets assume the time of day is a random number" or
> >:is there any underlying assumption about time in this ?
> >:
> >:--
> >:Poul-Henning Kamp FreeBSD coreteam member
> >:[EMAIL PROTECTED]   "Real hackers run -current on their laptop."
> >
> >It basically needs to be a unique for each server reboot in order
> >to allow clients to resynchronize.
> 
> Ok, then I suggest that you cache a copy of the boottime in the NFS
> code for this purpose.
> 

Ack, I was using this very same thing for several devices in an isolated
peer-to-peer network to decide who the 'master' was. (Whoever had been up
longest knew more about the state of the network) Having this change could
cause weirdness for me too... I assumed (without checking *thwap*) that
boottime was a constant.

Perhaps a 'real_boottime' or 'unadjusted_boottime' that gets copied after
'boottime' gets initialized so that others can use it, not just NFS? :)

Kevin


Re: Serious server-side NFS problem

1999-12-16 Thread Kevin Day


> 
> > In message <[EMAIL PROTECTED]>, Kevin Day writes:
> > 
> > >Ack, I was using this very same thing for several devices in an isolated
> > >peer-to-peer network to decide who the 'master' was. (Whoever had been up
> > >longest knew more about the state of the network) Having this change could
> > >cause weirdness for me too... I assumed (without checking *thwap*) that
> > >boottime was a constant.
> > >
> > >Perhaps a 'real_boottime' or 'unadjusted_boottime' that gets copied after
> > >'boottime' gets initialized so that others can use it, not just NFS? :)
> > 
> > no, I think that is a bad idea.  In your case you want to use the
> > "uptime" which *is* a measure of how long the system has been
> > running.
> 
> Uptime is also a constantly changing number.  Forgive me for my
> ignorance, but why does bootime constantly change?  I would have thought
> it would be a constant?  I've got software that also uses this to
> determine when a new copy of it exists (although I do keep a local cache
> of the value in case my software crashes, since it can recover from a
> crash, but not a reboot).
> 
> I would think that boottime would be constant, since you didn't keep
> booting at a different time...
> 

Yeah, uptime is moving which makes it difficult for me too. When new
machines enter the network, they need to announce a number which is used to
decide who will become the master if the current master disappears. I could
just announce currenttime-uptime, but that's got a slightly different
meaning that I'll have to consider.

Anyway, enough of my proprietary mess, but... I do see a few uses for a
non-moving boottime, but won't argue here or now. :) This behaviour is
documented in time(9) though, so I really can't complain. :)
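
For what it's worth, the currenttime-uptime idea looks roughly like this. It
is a sketch only: the names are invented, and breaking ties by node name is
an assumption for the example, not a description of the real protocol.

```python
def announced_boot_time(now, uptime):
    """Compute the value a node announces once when it joins: now minus uptime.

    Computing it once and caching it keeps the announced number fixed even
    if the kernel's idea of boottime is adjusted later.
    """
    return now - uptime


def pick_master(announcements):
    """Pick the node that has been up longest (earliest announced boot time).

    announcements: dict mapping node name -> announced boot time in seconds.
    Ties are broken by node name so every node reaches the same answer.
    """
    return min(announcements, key=lambda node: (announcements[node], node))


if __name__ == "__main__":
    nodes = {"alpha": 945300000.0, "beta": 945200000.0, "gamma": 945250000.0}
    print("master:", pick_master(nodes))
```

The slightly different meaning is that the value depends on the clock at the
moment of the announcement, so a node whose clock is stepped before it
announces will report a different number than it would have a minute earlier.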

Kevin



Re: MegaRAID jiggles clock?

2000-03-25 Thread Kevin Day


> 
> I'm wondering if the AMI MegaRAID controller/driver might be the
> reason that I'm getting a large number of clock resets from ntpd.
> About every half hour, ntpd seems to feel the need to reset the clock
> on the server by about 1/3 of a second.  The server has a moderate NFS 
> load (going out through 12 dc interfaces) and an AMI MegaRAID 1400
> controller with 8 disks in a RAID-5 config.
> 
> I have other servers with 12 dc ports, and havn't seen any
> particularly bad time performance from them, which is why I'm
> suspicious of the megaraid.  This machine is also using a motherboard
> common to many of our other machines.  None of our other servers (we
> have a "ring" of 5 time servers to which all our internal hosts
> connect) or clients appear to have any issues.
> 
> I have considered setting the option on ntpd to only adjust time by
> adjusting the frequency ... to see if this is just a bogon clock chip
> or somesuch.
> 
> ideas?
> 
> Dave.
> 

Granted, this is an old 4.0-current machine (from around September), but
I've seen heavy NFS server load affect the clocks on all three of my NFS
servers. The heavier the load, the faster the clock seems to run.

Mar 25 10:00:01 nfs ntpdate[75363]: adjust time server 192.160.127.90 offset -0.028636
Mar 25 11:00:02 nfs ntpdate[75406]: adjust time server 192.160.127.90 offset -0.033046
Mar 25 12:00:01 nfs ntpdate[75448]: adjust time server 192.160.127.90 offset -0.031371
Mar 25 13:00:01 nfs ntpdate[75490]: adjust time server 192.160.127.90 offset -0.030030
Mar 25 14:00:01 nfs ntpdate[75532]: adjust time server 192.160.127.90 offset -0.031346
Mar 25 15:00:00 nfs ntpdate[75573]: adjust time server 192.160.127.90 offset -0.030992
Mar 25 16:00:00 nfs ntpdate[75616]: adjust time server 192.160.127.90 offset -0.031654
Mar 25 17:00:00 nfs ntpdate[75657]: adjust time server 192.160.127.90 offset -0.031354

Because my NFS load isn't consistent throughout the day, xntpd seems to
really freak out about trying to keep it balanced. 
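
To put a number on the drift, the hourly offsets can be averaged and turned
into parts per million. A minimal sketch with made-up function names; it
assumes one ntpdate run per hour, as in the log above, and that a negative
offset means the local clock was ahead of the server.

```python
import re

OFFSET_RE = re.compile(r"offset (-?\d+\.\d+)")


def offsets(log_text):
    """Extract the offset (in seconds) from each ntpdate log line."""
    return [float(m.group(1)) for m in OFFSET_RE.finditer(log_text)]


def mean_drift_ppm(offs, interval_seconds=3600):
    """Average offset per interval, expressed in parts per million."""
    mean = sum(offs) / len(offs)
    return mean / interval_seconds * 1e6


if __name__ == "__main__":
    log = """\
Mar 25 10:00:01 nfs ntpdate[75363]: adjust time server 192.160.127.90 offset -0.028636
Mar 25 11:00:02 nfs ntpdate[75406]: adjust time server 192.160.127.90 offset -0.033046
"""
    offs = offsets(log)
    print("mean offset: %.6f s, drift: %.2f ppm"
          % (sum(offs) / len(offs), mean_drift_ppm(offs)))
```

Over the eight lines above, the mean offset is about -0.031 seconds per hour,
which works out to the clock running roughly 8.6 ppm fast.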

Relevant info:


Timecounter "i8254"  frequency 1193182 Hz
Timecounter "TSC"  frequency 337194185 Hz
CPU: AMD-K6(tm) 3D processor (337.19-MHz 586-class CPU)
  Origin = "AuthenticAMD"  Id = 0x580  Stepping = 0
  Features=0x8001bf
  AMD Features=0x8800
real memory  = 134217728 (131072K bytes)
avail memory = 126246912 (123288K bytes)
vinum: loaded
npx0:  on motherboard
npx0: INT 16 interface
pcib0:  on motherboard
pci0:  on pcib0
pcib1:  at device 1.0 on pci0
pci1:  on pcib1
isab0:  at device 7.0 on pci0
isa0:  on isab0
fxp0:  irq 11 at device 14.0 on pci0
fxp0: Ethernet address 00:90:27:34:c1:ec
ata-pci0:  irq 0 at device 15.0 on pci0
ata-pci0: Busmastering DMA supported
ata0 at 0x01f0 irq 14 on ata-pci0
ata1 at 0x0170 irq 15 on ata-pci0
vga-pci0:  at device 16.0 on pci0
pn0: <82c169 PNIC 10/100BaseTX> irq 10 at device 18.0 on pci0
pn0: Ethernet address: 00:a0:cc:3e:c6:3d
pn0: autoneg complete, link status good (full-duplex, 100Mbps)
ahc0:  irq 9 at device 20.0 on pci0
ahc0: aic7860 Single Channel A, SCSI Id=7, 3/255 SCBs
ata0: master: setting up UDMA2 mode on Aladdin chip OK


-- Kevin



To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message

Load average calculation?

2000-04-02 Thread Kevin Day



I'm not sure if this is -current fodder or not, but since it's still
happening in -current, I'll ask.

We recently upgraded a server from 2.2.8 to 4.0 (the same behavior shows up
on 5.0-current, too). Before, with the exact same load, we'd see load
averages between 0.20 and 0.30. Now, we're getting:

load averages:  4.16,  4.23,  4.66

Top shows the same CPU percentages, just a much higher load average for the
same work being done. Did the load average calculation change, or does something
in the scheduler differ? Customers are complaining that the load average
is too high, which is kinda silly, since 4.0 seems noticeably faster in some
cases.

Any ideas?

Kevin



Re: Load average calculation?

2000-04-02 Thread Kevin Day


> :We recently upgraded a server from 2.2.8 to 4.0(the same behavior is shown
> :on 5.0-current, too). Before, with the exact same load, we'd see load
> :averages from between 0.20 and 0.30. Now, we're getting:
> :
> :load averages:  4.16,  4.23,  4.66
> :
> :Top shows the same CPU percentages, just a much higher load average for the
> :same work being done. Did the load average calculation change, or something
> :with the scheduler differ? Customers are complaining that the load average
> :is too high, which is kinda silly, since 4.0 seems noticeably faster in some
> :cases.
> :
> :Any ideas?
> :
> :Kevin
> 
> I believe the load average was changed quite a while ago to reflect not
> only runnable processes but also processes stuck in disk-wait.  It's
> a more accurate measure of load.
> 

Ahh, and since nearly everything is done on this system via NFS, I can
imagine that several things are waiting for NFS responses. 
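
As a rough illustration of why counting disk-wait processes moves the number so much, here is a small simulation of an exponentially damped one-minute load average. The 5-second sampling interval, the decay formula, and the workload (one runnable process a quarter of the time, four processes parked waiting on NFS) are assumptions picked for the example, not values taken from this server or from the kernel source.

```python
import math

SAMPLE_INTERVAL = 5.0  # seconds between samples (assumed)


def simulate_load(samples, window=60.0, interval=SAMPLE_INTERVAL):
    """Exponentially damped average of per-sample process counts."""
    decay = math.exp(-interval / window)
    load = 0.0
    for count in samples:
        load = load * decay + count * (1.0 - decay)
    return load


ticks = 720  # one hour of 5-second samples

# Hypothetical workload: one runnable process in every fourth sample,
# plus four processes that sit in disk/NFS wait the whole time.
runnable = [1 if i % 4 == 0 else 0 for i in range(ticks)]
disk_wait = [4] * ticks

runnable_only = simulate_load(runnable)
with_disk_wait = simulate_load([r + w for r, w in zip(runnable, disk_wait)])

print(f"runnable only:        {runnable_only:.2f}")
print(f"runnable + disk wait: {with_disk_wait:.2f}")
```

The CPU work is identical in both runs. Only the set of processes being counted differs, which is the same effect as the jump described above.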

It's probably more accurate, but from a PR standpoint it makes it "look"
like FreeBSD is choking under the load, when it really isn't. Or am I the
only one that even cares about this? :)


Kevin



Re: Load average calculation?

2000-04-02 Thread Kevin Day


> > > I believe the load average was changed quite a while ago to reflect not
> > > only runnable processes but also processes stuck in disk-wait.  It's
> > > a more accurate measure of load.
> > 
> > Ahh, and since nearly everything is done on this system via NFS, I can
> > imagine that several things are waiting for NFS responses. 
> > 
> > It's probably more accurate, but from a PR standpoint it makes it "look"
> > like FreeBSD is choking under the load, when it really isn't. Or am I the
> > only one that even cares about this? :)
> 
> What does the man page for 'w' say about it? At least the change should be
> reflected there I guess.

getloadavg(3) (which 'w' and 'uptime' use) says:

 The getloadavg() function returns the number of processes in the system
 run queue averaged over various periods of time.


The 'w' and 'uptime' manpages really don't mention anything relevant.
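
For anyone who wants to read the same three numbers from a script, Python's os.getloadavg() wraps the C library call. A minimal sketch; note that it raises OSError on platforms where the load average can't be obtained.

```python
import os

# Same 1-, 5- and 15-minute figures that 'w' and 'uptime' print.
one, five, fifteen = os.getloadavg()
print(f"load averages: {one:.2f}, {five:.2f}, {fifteen:.2f}")
```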

Kevin



Re: Load average calculation?

2000-04-03 Thread Kevin Day


> 
> < said:
> 
> > It's probably more accurate, but from a PR standpoint it makes it "look"
> > like FreeBSD is choking under the load, when it really isn't.
> 
> Actually, you have it backwards -- it makes it look as if FreeBSD is
> *not* choking under what appears to be a very heavy load
> 
> -GAWollman
> 

Well, my first impression was "Well, before doing this task the load average
was only 0.20, now it's 4.0, obviously it can't keep up now." Which could
probably be extended to "Under Linux the load average for running my
database is only 0.20, FreeBSD's is 4.0, Linux must be faster."

Granted it's flawed logic, but it's all a matter of perception at times.

Kevin



Panic with userquota(softupdates?)

2000-06-16 Thread Kevin Day




I keep getting panics in dqget() (ufs_quota.c) with a -current from a couple
of days ago. I think this might be softupdates-related, since I can't make
it happen with softupdates turned off, although it's quite possible that
softupdates has nothing to do with it. Does anyone have any idea what might
be causing this?

Any other information that might be useful here?

-- Kevin



SMP 2 cpus
IdlePTD 3813376
initial pcb at 3178c0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
mp_lock = 0002; cpuid = 0; lapic.id = 
fault virtual address   = 0x0
fault code  = supervisor write, page not present
instruction pointer = 0x8:0xc023d3d2
stack pointer   = 0x10:0xd9176d28
frame pointer   = 0x10:0xd9176d78
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 3384 (smbd)
interrupt mask  = none <- SMP: XXX
trap number = 12
panic: page fault
mp_lock = 0002; cpuid = 0; lapic.id = 
boot() called on cpu#0

syncing disks... 
done
Uptime: 5h9m37s

dumping to dev #da/0x20001, offset 1735462
dump 511 510 509 508 507 506 505 504 503 502 501 500 499 498 497 496 495 494 493 492 
491 490 489 488 487 486 485 484 483 482 481 480 479 478 477 476 475 474 473 472 471 
470 469 468 467 466 465 464 463 462 461 460 459 458 457 456 455 454 453 452 451 450 
449 448 447 446 445 444 443 442 441 440 439 438 437 436 435 434 433 432 431 430 429 
428 427 426 425 424 423 422 421 420 419 418 417 416 415 414 413 412 411 410 409 408 
407 406 405 404 403 402 401 400 399 398 397 396 395 394 393 392 391 390 389 388 387 
386 385 384 383 382 381 380 379 378 377 376 375 374 373 372 371 370 369 368 367 366 
365 364 363 362 361 360 359 358 357 356 355 354 353 352 351 350 349 348 347 346 345 
344 343 342 341 340 339 338 337 336 335 334 333 332 331 330 329 328 327 326 325 324 
323 322 321 320 319 318 317 316 315 314 313 312 311 310 309 308 307 306 305 304 303 
302 301 300 299 298 297 296 295 294 293 292 291 290 289 288 287 286 285 284 283 282 
281 280 279 278 277 276 275 274 273 272 271 270 269 268 267 266 265 
264 263 262 261 260 259 258 257 256 255 254 253 252 251 250 249 248 247 246 245 244 
243 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 
222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 
201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181 
180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162 161 160 
159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 
138 137 136 135 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 
117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 
94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 
65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 
36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 
6 5 4 3 2 1 0 
---
#0  boot (howto=256) at ../../kern/kern_shutdown.c:303
303 dumppcb.pcb_cr3 = rcr3();
(kgdb) bt
#0  boot (howto=256) at ../../kern/kern_shutdown.c:303
#1  0xc016a2d9 in panic (fmt=0xc02c532f "page fault") at ../../kern/kern_shutdown.c:553
#2  0xc0282190 in trap_fatal (frame=0xd9176ce8, eva=0) at ../../i386/i386/trap.c:927
#3  0xc0281e01 in trap_pfault (frame=0xd9176ce8, usermode=0, eva=0) at ../../i386/i386/trap.c:820
#4  0xc028196b in trap (frame={tf_fs = 24, tf_es = 16, tf_ds = -652804080, tf_edi = -653726848, 
      tf_esi = -1041411164, tf_ebp = -652776072, tf_isp = -652776172, tf_ebx = -1038103936, 
      tf_edx = 0, tf_ecx = -1012515072, tf_eax = 0, tf_trapno = 12, tf_err = 2, 
      tf_eip = -1071393838, tf_cs = 8, tf_eflags = 66118, tf_esp = -1012515072, tf_ss = -633865984})
    at ../../i386/i386/trap.c:426
#5  0xc023d3d2 in dqget (vp=0xda37f900, id=65534, ump=0xc1f76200, type=0, dqp=0xc3a63f44)
    at ../../ufs/ufs/ufs_quota.c:763
#6  0xc023c796 in getinoquota (ip=0xc3a63f00) at ../../ufs/ufs/ufs_quota.c:95
#7  0xc023ddb5 in ufs_access (ap=0xd9176dfc) at ../../ufs/ufs/ufs_vnops.c:324
#8  0xc02408e9 in ufs_vnoperate (ap=0xd9176dfc) at ../../ufs/ufs/ufs_vnops.c:2287
#9  0xc019fc4b in vn_open (ndp=0xd9176ec8, fmode=3, cmode=484) at vnode_if.h:247
#10 0xc019bd8d in open (p=0xd916eac0, uap=0xd9176f80) at ../../kern/vfs_syscalls.c:995
#11 0xc02824c1 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = -1078001617, 
      tf_edi = -1077940192, tf_esi = 144677504, tf_ebp = -1077939168, tf_isp = -652775468, 
      tf_ebx = 484, tf_edx = 2, tf_ecx = 135418240, tf_eax = 5, tf_trapno = 7, tf_err = 2, 
      tf_eip = 673002960, tf_cs = 31, tf_eflags = 663, tf_esp = -1077941548, tf_ss = 47})
    at ../../i386/i386/trap.c:1126
#12 0x

Re: mount_nfs/df bug?

2000-06-21 Thread Kevin Day


> 
> Hello!
> 
> Today I wanted to add a new NFS to my /etc/fstab, but forgot to add it
> to /etc/exports on the server.
> However, I did mount -a several times and always got a "Permission
> denied" for the last one.
> 
> Now look what I have here:
> 
> Filesystem                    1K-blocks     Used    Avail Capacity  Mounted on
> /dev/ad2a                        396895   291904    73240    80%    /
> /dev/ad2e                       5257421  4626154   210674    96%    /usr
> procfs                                4        4        0   100%    /proc
> /dev/ad0s1                      4224828  3755464   469364    89%    /dos
> neutron:/usr/ports               496367   363448    93210    80%    /usr/ports
> neutron:/usr/ports-distfiles    2482878  1191660  1092588    52%    /usr/ports-distfiles
> neutron:/usr/home/ncvs           992439   960684    31755    97%    /usr/home/ncvs
> neutron:/usr/src                 928695   482371   372029    56%    /usr/src
> neutron:/usr/home/mp3           9591515  9298876   292639    97%    /usr/home/mp3
> neutron:/usr/home/brenn          695311   594838    44849    93%    /usr/home/brenn
> neutron:/www/docs                297423   168669   104961    62%    /www
> neutron:/usr/doc                2482878  1191660  1092588    52%    /usr/doc
> neutron:/usr/home/ncvs           992439   960684    31755    97%    /usr/home/ncvs
> neutron:/usr/home/mp3           9591515  9298876   292639    97%    /usr/home/mp3
> neutron:/usr/home/brenn          695311   594838    44849    93%    /usr/home/brenn
> neutron:/usr/home/ncvs           992439   960684    31755    97%    /usr/home/ncvs
> neutron:/usr/home/mp3           9591515  9298876   292639    97%    /usr/home/mp3
> neutron:/usr/home/brenn          695311   594838    44849    93%    /usr/home/brenn
> neutron:/usr/home/ncvs           992439   960684    31755    97%    /usr/home/ncvs
> neutron:/usr/home/mp3           9591515  9298876   292639    97%    /usr/home/mp3
> neutron:/usr/home/brenn          695311   594838    44849    93%    /usr/home/brenn
> neutron:/usr/home/ncvs           992439   960684    31755    97%    /usr/home/ncvs
> neutron:/usr/home/mp3           9591515  9298876   292639    97%    /usr/home/mp3
> neutron:/usr/home/brenn          695311   594838    44849    93%    /usr/home/brenn
> 
> Cute, isn't it?
> 
> Not yet discovered why.
> 
> Alex
> -- 
> cat: /home/alex/.sig: No such file or directory
> 
> 
> 

This is probably similar to this:

http://www.freebsd.org/cgi/query-pr.cgi?pr=6187


-- Kevin



function name collision on "getcontext" with ports/editors/joe

2002-02-09 Thread Kevin Day



I'm the maintainer for ports/editors/joe, and just tried compiling it under
-CURRENT.

 includes <signal.h>, which includes ucontext.h

> cc -O -pipe  -c umath.c
> In file included from b.h:6,
>  from bw.h:23,
>  from umath.c:5:
> rc.h:41: conflicting types for `getcontext'
> /usr/include/sys/ucontext.h:54: previous declaration of `getcontext'
> *** Error code 1
> 
> Stop in /usr/ports/editors/joe/work/joe.


I can rename getcontext in joe, but "getcontext" seems like a pretty common
function name; I know I've used it in projects before. Not including
signal.h isn't really an option either.

I'm not familiar with any of the ucontext.h functions. Are they complying
with some kind of standard, so that they can't be renamed or have a prefix
added to them?

-- Kevin



Re: Filesystem deadlock

1999-02-23 Thread Kevin Day

> On Mon, 22 Feb 1999, Alexander N. Kabaev wrote:
> 
> > The following script reliably causes FreeBSD 4.0-CURRENT (and 3.1-STABLE
> > as of today) to lock up. Shortly after this script is started, all disk
> > activity stops and any attempt to create a new process causes the system
> > to freeze. While in DDB, the ps command shows that all ten fgrep processes
> > are sleeping on inode, all xargs are in waitpid, and all sh processes are
> > in wait.
> 
> You forget about all the processes (just a few, actually) stuck in "kmaw"
> (kmem_alloc_wait). This is definitely reproducible :( Should be simple for
> someone more knowledgeable to diagnose, as it looks to be a straight
> vm/vfs(ufs/ffs) interaction.


This is happening to me too, with a system built from the 19th's SNAP as
well as with today's kernel (except I don't see anything in 'kmaw'). The
process 'swapper' is stuck in 'inode', as is anything else that has tried to
touch the disk. Lots of 'sh's are sitting in 'wait'.

This machine is a heavy NFS client, but I'm not sure that it's related.


Kevin



Re: panic: zone: entry not free

1999-03-10 Thread Kevin Day

> :> :
> :> :This means that invariants need to add relatively little overhead.
> :> :
> :> :Peter
> :> 
> :>  which they do.
> :
> :You know, guys, for programmers, wanting immediate panics on stuff like
> :this is great, but there isn't one user in a thousand that wants this.
> :If you make this kinda stuff default on a version *other than* current
> :(current being by definition, for programmers/developers only) then
> :you're going to hear bloody murder, and you guys will be doing vast
> :damage to FreeBSD's reputation.
> :
> :Users don't want panics, and they don't care why, they just want things
> 
> No no no... you are missing the whole point.
> 
> *IF* we put these kinds of checks in by default, the result is a 
> few more panics in the near term, but *NO* panics in the medium and
> long term.
> 
> In other words, by putting the checks in now, the kernel gets debugged
> much more quickly --- to the point where a year down the line we no
> longer get kernel panics at all.
> 


Also, try commenting out a panic line in a known bug, and watch how
quickly the kernel crashes anyway, in the same situation.

Most of the time, the panic is dumping out (some) debugging information
before crashing all over itself. Just taking out the diagnostic message is
really just making crashes more obscure.

If the error were recoverable, the system would normally recover from it. If
it's not, it panics and dies. Take out the panic, and all you've got left is a
'die', which will probably lead people on a wild goose chase as to where
that section of memory really got trashed.

Kevin



Re: Games

1999-04-01 Thread Kevin Day

> On Thu, 1 Apr 1999, Rod Taylor wrote:
> > Just out of curiosity, why are there games included in the FreeBSD
> > source tree?
> > 
> > For a group of people that was so worried about including dhcp because
> > it's extra code, don't you think it's time to make those games into
> > ports only?
> > 
> > I say this under the assumption that they're not required for FreeBSD to
> > function.  (Not like IE for windows ;)
> 
> As far as I am concerned, things like fortune, pom, pig and banner
> have been included with BSD-ish systems for ages... Tradition...
> I wouldn't feel the same if I didn't get my fortune every login.
> 
> also, don't you have the option to not have the games?

It's not just a matter of turning them off though. A few of the games in the
distro are trademark infringements. While the product I'm developing that
uses FreeBSD doesn't have the games installed, it brought up the comment
from our lawyers "What else are they infringing on that we *are* using?"

(see trek, mille, boggle, tetris, wargames)

Kevin



Re: Games

1999-04-01 Thread Kevin Day

> :It's not just a matter of turning them off though. A few of the games in the
> :distro are trademark infringements. While the product I'm developing that
> :uses FreeBSD doesn't have the games installed, it brought up the comment
> :from our lawyers "What else are they infringing on that we *are* using?"
> :
> :(see trek, mille, boggle, tetris, wargames)
> :
> :Kevin
> 
> Tetris hasn't been in the distribution for a while.
> 

Oops, I did an 'ls /usr/games' on a machine that's been around since 2.2.2.
Ok, forget tetris.


> Your lawyers need a dose of reality if they think the existence of
> the other games in a distribution could ever come back to haunt you
> or your company.  Tell them to screw their heads on straight and try
> again.  If you're really worried, just delete them.
> 

That wasn't really the issue though. It's that FreeBSD is infringing on
trademarks in the games distribution, so what's to make them think that the
vm system isn't an infringement? (i.e. "They're the type of people who don't
care about trademarks/copyrights. How can we trust this code?") It also
probably doesn't help things that I'm working for a video game company.

/usr/games is deleted here, anyway. :)

Kevin



Using 4.3-RELEASE's libc on 5.0 causes hard lockups

2003-02-02 Thread Kevin Day


We had a system running 4.3-RELEASE that I upgraded to 5.0-RELEASE using the 
sysinstall upgrade mechanism. I installed "compat4x" to use our 
existing 4.x binaries.

Immediately after rebooting, I noticed most old 4.x binaries were 
complaining about "_stdoutp" being an undefined symbol. However, the scary 
part was that when I started apache/mod_php4 the server crashed (hard 
lockup) within 10 seconds under load. This was easily reproducible: at 
least a dozen times while trying to debug this, I started httpd and the 
server locked up within 10 seconds.

I recompiled all of apache, mod_php4 and all of its libraries, started up 
httpd and had no problems with that.

Things were fine that night until an "analog" cron job ran. Every time THAT 
ran, I also got a hard lockup of the server, OR between 100 and 500 of my 
httpd processes would suddenly SEGV.

After a little more poking around, I saw in /usr/lib:

lrwxr-xr-x 1 root wheel 9 Feb 1 00:18 libc.so -> libc.so.5
lrwxr-xr-x 1 root wheel 16 Jul 5 2002 libc.so.3 -> /usr/lib/libc.so
-r--r--r-- 1 root wheel 571480 Aug 5 13:45 libc.so.4
-r--r--r-- 1 root wheel 836892 Feb 1 00:18 libc.so.5


Shouldn't libc.so.4 have been a symlink to libc.so after a compat4x 
install? In any case, doing that myself seemed to fix everything.

My questions:

1) Shouldn't something along the way of doing a sysinstall upgrade or 
installing compat4x have fixed /usr/lib/libc.so.4 into a symlink? (That is 
the correct situation, right?)

2) Is it possible that some kernel interface has changed, and something 
isn't being validated in the kernel side? Non-root userland applications 
being able to lockup the server, and/or affect other processes simply by 
using a different libc would seem to indicate this.


I know this is a pretty vague bug report, but this is a production server, 
so I wasn't able to play around too much with it. I do have a backup of the 
entire server before it was upgraded to 5.0 if you'd like me to check 
anything there. I did compile with INVARIANTS and WITNESS and got no 
debugging output when things did lock up. The keyboard and serial console 
were totally dead when this happened, so DDB isn't an option either.


(originally emailed security-officer about this because of the possibility 
for a security issue, who told me to forward this here)





Re: Using 4.3-RELEASE's libc on 5.0 causes hard lockups

2003-02-02 Thread Kevin Day

At 11:42 AM 2/2/2003, Jacques A. Vidrine wrote:

> On Sun, Feb 02, 2003 at 11:41:32AM -0600, Kevin Day wrote:
> > lrwxr-xr-x 1 root wheel 9 Feb 1 00:18 libc.so -> libc.so.5
> > lrwxr-xr-x 1 root wheel 16 Jul 5 2002 libc.so.3 -> /usr/lib/libc.so
> ^
> This is seriously messed up.  See below.
>
> > -r--r--r-- 1 root wheel 571480 Aug 5 13:45 libc.so.4
> > -r--r--r-- 1 root wheel 836892 Feb 1 00:18 libc.so.5
> >
> >
> > Shouldn't libc.so.4 have been a symlink to libc.so after a compat4x
> > install? In any case, doing that myself seemed to fix everything.
>
> No, this would cause you major problems.  Binaries that expected the
> libc.so.4 interface would be calling into libc.so.5, and probably
> causing very strange behaviour.
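
The kind of link being warned about here can be spotted mechanically. The sketch below is only an illustration: it recreates the listing above in a scratch directory and flags any versioned library name that is a symlink resolving to a differently named file. The helper name and the name pattern are invented for the example, and a layout that deliberately links libfoo.so.N to a longer versioned file name would also be flagged.

```python
import os
import re
import tempfile

# Versioned shared-library names such as libc.so.4 (pattern is an assumption).
VERSIONED_RE = re.compile(r"^lib.+\.so\.\d+$")


def cross_version_links(directory):
    """Return (link_name, resolved_name) pairs for versioned library names
    that are symlinks ending up at a file with a different name."""
    suspicious = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not VERSIONED_RE.match(name) or not os.path.islink(path):
            continue
        resolved = os.path.basename(os.path.realpath(path))
        if resolved != name:
            suspicious.append((name, resolved))
    return suspicious


# Recreate the /usr/lib listing quoted above in a temporary directory.
with tempfile.TemporaryDirectory() as scratch:
    for real_file in ("libc.so.4", "libc.so.5"):
        open(os.path.join(scratch, real_file), "w").close()
    os.symlink("libc.so.5", os.path.join(scratch, "libc.so"))
    os.symlink(os.path.join(scratch, "libc.so"),
               os.path.join(scratch, "libc.so.3"))
    found = cross_version_links(scratch)

print(found)
```

With the listing as quoted, only libc.so.3 is reported, since it ends up at libc.so.5 by way of libc.so. Linking libc.so.4 to libc.so.5 would add a second entry of the same kind.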

Ok, I admit, no matter how it happened, an application using the wrong libc 
is a bad thing.

But, how are things supposed to work? Apps that were using the old 
libc.so.4 complained about unresolved symbols (usually _stdoutp). If I 
removed /usr/lib/libc.so.4 they complained that they couldn't find libc. If 
I linked libc.so.4 to libc.so.5, everything appeared to work just 
fine, but I know that's probably a fluke.

In any case, a system lockup or being able to crash other user's processes 
just by having the wrong libc shouldn't be possible no matter what happens.


Re: Using 4.3-RELEASE's libc on 5.0 causes hard lockups

2003-02-02 Thread Kevin Day

At 11:54 AM 2/2/2003, Jacques A. Vidrine wrote:

> > Ok, I admit, no matter how it happened, an application using the wrong libc
> > is a bad thing.
> >
> > But, how are things supposed to work?
>
> Apps that need the old libc.so.4 will find it in
> /usr/lib/compat/libc.so.4 (or /usr/lib/libc.so.4 if you didn't remove
> it, for that matter).

Well, things were definitely picking /usr/lib/libc.so.4 over anything in 
compat. Should sysinstall have nuked my /usr/lib/libc if it was putting the 
correct one in compat?

> > In any case, a system lockup or being able to crash other user's processes
> > just by having the wrong libc shouldn't be possible no matter what happens.
>
> Probably not, although if you have processes running as root and using
> the `wrong' libc, all bets are off.

Well, after I recompiled httpd (which did have a single process owned by 
root) and rebooted, nothing at all owned by root touched anything that was 
compiled under 4.x. The analog process was owned by a non-privileged regular 
user, and it caused the same behavior. Me running analog under my normal 
account could kill processes owned by "nobody" with segfaults.


Re: DoS from local users (fwd)

1999-04-10 Thread Kevin Day

> 
> :
> :It should be possible to prevent a user from hogging a system if the system's
> :naive scheduler is improved.
> :
> : Amancio
> 
> No, it isn't.  For a very simple reason:  The resources users need to do
> real work are very similar to the resources users need to hog the system.
> 
> Saying that the system should somehow be able to magically make the 
> distinction between the two is a pipedream.  It takes a human to make
> the distinction.
> 
> Short of restricting the resources you give to users to the point where
> they can't even start a mail or news client, there is just no way to
> prevent said users from loading down the machine if they choose to.
> 
>   -Matt
> 
> 

On the shell servers I run, we've got 200-300 users running tasks.
Occasionally, through intent or misconfiguration, a user either forkbombs,
or gets a large number of processes running sucking lots of cpu.

I'd like to see an option that makes all the processes run by one uid have
the same weight as one process another uid is running.

i.e. uid 1001 starts 40 processes eating as much cpu as they can. Then uid
1002 starts up one process. Uid 1002's process gets 50% cpu, and uid 1001's
40 processes get 50% cpu shared between them. 

This way, one errant user can't have as significant an impact.
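
The arithmetic behind that example, as a minimal sketch. This is not a scheduler, just the share calculation, and the function name is made up for illustration.

```python
from collections import Counter


def per_process_share(process_uids):
    """Given one uid per CPU-bound process, split the CPU evenly between
    uids, then evenly between each uid's processes.

    Returns {uid: fraction of the CPU that each of its processes gets}.
    """
    counts = Counter(process_uids)
    if not counts:
        return {}
    per_uid = 1.0 / len(counts)
    return {uid: per_uid / n for uid, n in counts.items()}


# uid 1001 runs 40 CPU hogs, uid 1002 runs one process.
shares = per_process_share([1001] * 40 + [1002])
for uid, share in sorted(shares.items()):
    print(f"uid {uid}: {share:.2%} of the cpu per process")
```

Each of uid 1001's 40 processes ends up with 1.25% of the CPU, while uid 1002's single process keeps 50%.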

Is this plausible?


Kevin



Re: DoS from local users (fwd)

1999-04-11 Thread Kevin Day

> In message <199904102057.paa27...@home.dragondata.com> Kevin Day writes:
> : i.e. uid 1001 starts 40 processes eating as much cpu as they can. Then uid
> : 1002 starts up one process. Uid 1002's process gets 50% cpu, and uid 1001's
> : 40 processes get 50% cpu shared between them. 
> 
> I've seen some experimental patches in the past that try to do just
> this.  However, there are some problems.  What if uid 1002's process
> does a sleep.  Should the 40 processes that 1001 just get 50% of the
> cpu?  Or should there be other limits.  It turns into an interesting
> research problem in a hurry.
> 
> Warner
> 

I was thinking that essentially only processes in the RUN state would count
toward this. If the cpu would otherwise be sitting idle, by all means give it
to someone. But if two users have processes running, just because one user has
50 processes doesn't mean that user should get 50x the cpu of a user who has
one process running. If a process is asleep or blocked (select, IO, whatever),
it's taken out of consideration for the cpu, and the full cpu is given to
those processes that actually have work to do.
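
Here is a sketch of that rule, assuming a hypothetical list of (pid, uid, state) tuples where "R" means runnable and anything else means asleep or blocked. Only runnable processes take part in the split, so an idle user's half goes back to whoever still has work to do.

```python
from collections import Counter


def runnable_shares(processes):
    """processes: list of (pid, uid, state) tuples.

    Returns {pid: cpu fraction}. The CPU is split evenly between uids that
    have at least one runnable process, then evenly between that uid's
    runnable processes. Everything else gets 0.
    """
    runnable = [(pid, uid) for pid, uid, state in processes if state == "R"]
    per_uid = Counter(uid for _, uid in runnable)
    shares = {pid: 0.0 for pid, _, _ in processes}
    for pid, uid in runnable:
        shares[pid] = 1.0 / len(per_uid) / per_uid[uid]
    return shares


hogs = [(pid, 1001, "R") for pid in range(100, 140)]  # 40 CPU hogs

busy = runnable_shares(hogs + [(200, 1002, "R")])  # uid 1002 is compiling
idle = runnable_shares(hogs + [(200, 1002, "S")])  # uid 1002 is asleep

print(f"1002 runnable: pid 200 gets {busy[200]:.2%}, each hog {busy[100]:.2%}")
print(f"1002 sleeping: pid 200 gets {idle[200]:.2%}, each hog {idle[100]:.2%}")
```

While uid 1002's process sleeps, the 40 hogs share the whole CPU (2.5% each). As soon as it becomes runnable again it gets 50% back.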

At least, that's my take on it.

I run into this problem daily, and I get enough user complaints of "User x
has 50 processes running, eating as much cpu as they can, my compile just
took 15 minutes".

Kevin


Re: Alright, who's the smart alleck that fixed NFS this last week? :) , WAS: Re: solid NFS patch #6 avail for -current - need tester

1999-04-21 Thread Kevin Day

> yeah the clocks are not setup properly :) but otherwise i'm just
> gonna say HOLY SH*T you fixed NFS! :)


We all owe Matt big for this. :)


> I'm using the default mount operations, as far as NFS server
> not responding messages, i have no clue, but the server is still
> up and i've seen that message happen when a lot of pressure is
> being put on an NFS server even though everything is fine.

Try mounting with -d... Can I make a guess that the NFS mount is going over
100Mbps Ethernet? I have a strong theory that the dynamic retransmit timer
needs rework for low-latency connections whose performance varies widely
under high traffic (lots of collisions).
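
To show the sort of behaviour that theory is about, here is a sketch of a generic smoothed round-trip estimator in the style TCP uses (timeout = smoothed RTT + 4 x variation). It is not the NFS client's code, and the RTT traces are invented: a quiet LAN at a steady 1 ms, and the same LAN with an occasional 50 ms reply.

```python
def count_spurious_timeouts(rtts_ms, min_rto_ms=0.0):
    """Feed RTT samples through a TCP-style estimator and count how many
    replies would have arrived after the timeout in force at that moment."""
    srtt = rtts_ms[0]            # smoothed round-trip time
    rttvar = rtts_ms[0] / 2.0    # smoothed variation
    spurious = 0
    for rtt in rtts_ms[1:]:
        rto = max(srtt + 4.0 * rttvar, min_rto_ms)
        if rtt > rto:
            spurious += 1        # the reply was only late, not lost
        rttvar += (abs(rtt - srtt) - rttvar) / 4.0
        srtt += (rtt - srtt) / 8.0
    return spurious


quiet_lan = [1.0] * 50
bursty_lan = ([1.0] * 49 + [50.0]) * 4  # a slow reply every 50th request

print(count_spurious_timeouts(quiet_lan))
print(count_spurious_timeouts(bursty_lan))
print(count_spurious_timeouts(bursty_lan, min_rto_ms=100.0))
```

On the bursty trace the timeout shrinks back to just over 1 ms between bursts, so every slow reply is treated as a loss. A fixed floor, or turning the estimator off, avoids those retransmits at the cost of slower recovery when a request really is lost.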


Kevin



buildworld breaks in doscmd if no X installed?

1999-04-30 Thread Kevin Day


cvsupped last night, and I don't have X installed.

Is this a matter of "If you want to buildworld, install X"? I'm
certain I've done this before though. :)


Kevin



cc -nostdinc -O -pipe -I. -I/usr/X11R6/include -DDISASSEMBLER   
-I/usr/obj/usr/src/tmp/usr/include  -o doscmd AsyncIO.o ParseBuffer.o bios.o 
callback.o cpu.o dos.o cmos.o config.o cwd.o debug.o disktab.o doscmd.o ems.o 
emuint.o exe.o i386-pinsn.o int.o int10.o int13.o int14.o int16.o int17.o 
int1a.o int2f.o intff.o mem.o mouse.o net.o port.o setver.o signal.o timer.o 
trace.o trap.o tty.o xms.o  -L/usr/X11R6/lib -lX11
tty.o: In function `video_setborder':
tty.o(.text+0x22b): undefined reference to `XSetWindowBackground'
tty.o: In function `setgc':
tty.o(.text+0x2c1): undefined reference to `XChangeGC'
tty.o: In function `video_update':
tty.o(.text+0x4ca): undefined reference to `XDrawImageString'
tty.o(.text+0x550): undefined reference to `XDrawImageString'
tty.o(.text+0x64a): undefined reference to `XChangeGC'
tty.o(.text+0x6d2): undefined reference to `XFillRectangle'
tty.o(.text+0x77e): undefined reference to `XChangeGC'
tty.o(.text+0x7bb): undefined reference to `XFillRectangle'
tty.o(.text+0x7c9): undefined reference to `XFlush'
tty.o: In function `debug_event':
tty.o(.text+0xbf8): undefined reference to `XBell'
tty.o(.text+0xc03): undefined reference to `XFlush'
tty.o: In function `video_async_event':
tty.o(.text+0x11c3): undefined reference to `XFlush'
tty.o(.text+0x11e3): undefined reference to `XNextEvent'
tty.o(.text+0x1257): undefined reference to `XFlush'
tty.o(.text+0x1298): undefined reference to `XNextEvent'
tty.o: In function `video_event':
tty.o(.text+0x1774): undefined reference to `XLookupString'
tty.o(.text+0x18f8): undefined reference to `XLookupString'
tty.o: In function `tty_write':
tty.o(.text+0x1f82): undefined reference to `XBell'
tty.o: In function `KbdWrite':
tty.o(.text+0x27da): undefined reference to `XBell'
tty.o: In function `video_init':
tty.o(.text+0x2b35): undefined reference to `XOpenDisplay'
tty.o(.text+0x2b5c): undefined reference to `XDisplayName'
tty.o(.text+0x2c39): undefined reference to `XAllocNamedColor'
tty.o(.text+0x2c92): undefined reference to `XLoadQueryFont'
tty.o(.text+0x2cae): undefined reference to `XLoadQueryFont'
tty.o(.text+0x2d81): undefined reference to `XCreateSimpleWindow'
tty.o(.text+0x2de4): undefined reference to `XCreateGC'
tty.o(.text+0x2e1b): undefined reference to `XCreateGC'
tty.o(.text+0x2e38): undefined reference to `XSetNormalHints'
tty.o(.text+0x2e62): undefined reference to `XSelectInput'
tty.o(.text+0x2e76): undefined reference to `XMapWindow'
tty.o(.text+0x2e81): undefined reference to `XFlush'
*** Error code 1

Stop.
*** Error code 1

Stop.
*** Error code 1

Stop.
*** Error code 1

Stop.
*** Error code 1

Stop.
*** Error code 1




Re: solid NFS patch #6 avail for -current - need testers files)

1999-04-30 Thread Kevin Day

> > 
> > To sum it all up is there any difference between the branches?
> 
> Yes.  We hope that people like you will help us by participating in the 
> testing of potential releases _before_ they go out as releases, not 
> _afterwards_.
> 
> Sitting around doing nothing and then complaining after the fact 
> doesn't help anyone, least of all yourself.
> 

This isn't meant in a bad way, but let me share with you my experiences.

Before 3.0 was released, I said several times "Hey, NFS got a lot worse on
-CURRENT. Is anyone looking at this?" and got several replies of "Duh, this
is -CURRENT. Don't whine about it. If you're trying to use this in a
production environment, you're crazy."

After 3.0 was released, I said "Hey, 3.0 got released, and NFS was still
broken", to which I got "Why didn't you bug us about this before the
release?" and/or "Why didn't you test this before release?"

I understand NFS is a 'special' problem, but for those of us not in the
trenches coding, I think the '3-level' system would be better. -CURRENT for
those who are coding, -BETA for people like me to test things and bring up
what broke, and -RELEASE for everyone else.

I honestly don't know when to bring up things like that, now. :)

Kevin



Re: solid NFS patch #6 avail for -current - need testers files)

1999-05-01 Thread Kevin Day

> > I honestly don't know when to bring up things like that, now. :)
> 
> For 3.2, _right_now_.  What you're doing with Matt is the first stage; 
> the next involves bringing it back to the 3.2-beta tree and testing it 
> there.
> 
> Please understand that if "you" (the community) aren't working on this, 
> nobody else is.  We don't have enough people manning the trenches 
> because they're all sitting back in the chateau waiting for the 
> afternoon dispatches.  This doesn't work. 8)
> 

Can I propose something? I realize gnats does most of this, but...

Suppose there's some central list where anyone who is having unresolved
problems can post their e-mail address, section of code, and a brief
explanation of the symptoms. Other people can go to this list and tack on
their e-mail address to other people's complaints saying "I'm seeing this
too." Before each release, all of these people are e-mailed saying "Can you
test to see if your problem still exists?" This will also be a bonus for
developers to find people who are experiencing specific problems, to see if
their fixes work.

I know this is a lot like gnats, however:


I don't think gnats wants a list of 'me too's in it.

It's not easy to mail groups of people from gnats.

There's no reason for anyone to add their e-mail address to a PR at the
moment.



I'm not sure if this'll make things more confusing or not, but... It'll stop
people with legitimate problems from getting lost in the shuffle, and keep
PRs to more timely issues.

Anyone have any comments? Really, I'm just picturing a list of people with
specific problems... maybe gnats can be tuned a bit for this...


Kevin



-current deadlocks within 5 mins, over NFS

1999-05-07 Thread Kevin Day


Matt, I told you about this before, but completely forgot about it. After
doing considerable testing on my test servers, I thought -current was safe
enough to try on our production shell servers. I installed -current on one
of my servers, and to my dismay, it hung. :)

Within 5 minutes of running, nearly every process is blocked on 'inode',
with the exception of a single 'cp' stuck in vmopar.

I have a very silly, *very* poorly written script I run out of cron, every
10 mins or so, to update my passwd and group files.

#!/bin/sh

cp /home/private/passwd /etc
cp /home/private/master.passwd /etc
cp /home/private/group /etc
rm /etc/spwd.db.tmp >/dev/null 2>&1 
pwd_mkdb /etc/master.passwd



This script is the only source of a 'cp' anywhere... With it turned off, I
was able to run for at least 30 mins (more, if I hadn't rebooted).

/home is a UDP NFS2 mount.  Moving the source of those cp's to a local drive
also fixes the problem, but defeats the purpose of my script. :)
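
For anyone trying to reproduce this with something slightly less fragile,
here is a minimal sketch of the copy step. It is not the script that was
actually running, and it does nothing about the deadlock itself. The helper
name safe_copy is hypothetical; it copies to a temporary file next to the
destination, renames it into place, and refuses to install a missing or
empty source (for example when the NFS mount is unavailable).

```shell
#!/bin/sh
# Hypothetical helper: copy SRC to DST through a temporary file in DST's
# directory, then rename it, so readers never see a half-written file.
safe_copy() {
    sc_src=$1
    sc_dst=$2
    sc_tmp="$sc_dst.tmp.$$"
    # Refuse a missing or empty source rather than clobbering DST.
    [ -s "$sc_src" ] || return 1
    cp "$sc_src" "$sc_tmp" || { rm -f "$sc_tmp"; return 1; }
    mv -f "$sc_tmp" "$sc_dst"
}

# Intended use, mirroring the original script (paths taken from it):
#   safe_copy /home/private/passwd        /etc/passwd
#   safe_copy /home/private/master.passwd /etc/master.passwd
#   safe_copy /home/private/group         /etc/group
#   pwd_mkdb /etc/master.passwd
```

The rename is only atomic when the temporary file and the destination are on
the same filesystem, which is why the temporary name is derived from the
destination path.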

What more info do you need for help in debugging this? 

Kevin



Re: -current deadlocks within 5 mins, over NFS

1999-05-07 Thread Kevin Day

> 
> Matt, I told you about this before, but completely forgot about it. After
> doing considerable testing on my test servers, i thought -current was safe
> enough to try on our production shell servers. I installed -current on one
> of my servers, and to my dismay, it hung. :)
> 
> Within 5 minutes of running, nearly every process is blocked on 'inode',
> with the exception of a single 'cp' stuck in vmopar.
> 


Just to add more to this, before someone replies: with the cron job turned
off, it ran all night and still hasn't crashed. On my test system, if I add
that cron job back, it dies very quickly, so this shouldn't be hard to
reproduce for anyone who needs to.

-current sources grabbed last night.


Kevin



-current NFS crash (out of mbuf clusters)

1999-05-08 Thread Kevin Day


I'm sure by now Matt is gonna kill me. :)

-current from 2 days ago.



IdlePTD 3096576
initial pcb at 27ea40
panicstr: Out of mbuf clusters
panic messages:
---
panic: Out of mbuf clusters

syncing disks... panic: Out of mbuf clusters

dumping to dev 20001, offset 467137
dump 255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238
237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219
218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200
199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181
180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162
161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143
142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124
123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105
104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81
80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56
55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31
30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3
2 1 
---
#0  boot (howto=260) at ../../kern/kern_shutdown.c:288
288 dumppcb.pcb_cr3 = rcr3();
(kgdb) bt
#0  boot (howto=260) at ../../kern/kern_shutdown.c:288
#1  0xc0145755 in panic () at ../../kern/kern_shutdown.c:450
#2  0xc015c382 in m_retryhdr (i=0, t=1) at ../../kern/uipc_mbuf.c:297
#3  0xc01b3013 in nfsm_rpchead (cr=0xc1096a00, nmflag=35328, procid=21, 
auth_type=1, auth_len=24, auth_str=0x0, verf_len=-1071591891, 
verf_str=0x0, mrest=0xc0acb900, mrest_len=44, mbp=0xcad93990, 
xidp=0xcad93994) at ../../nfs/nfs_subs.c:657
#4  0xc01b0317 in nfs_request (vp=0xca465400, mrest=0xc0acb900, procnum=21, 
procp=0xc02955a0, cred=0xc1096a00, mrp=0xcad939fc, mdp=0xcad93a00, 
dposp=0xcad93a04) at ../../nfs/nfs_socket.c:971
#5  0xc01c90d5 in nfs_commit (vp=0xca465400, offset=0, cnt=7463, 
cred=0xc1096a00, procp=0xc02955a0) at ../../nfs/nfs_vnops.c:2586
#6  0xc01c9620 in nfs_flush (vp=0xca465400, cred=0xc0a5e900, waitfor=2, 
p=0xc02955a0, commit=1) at ../../nfs/nfs_vnops.c:2846
#7  0xc01c9389 in nfs_fsync (ap=0xcad93b2c) at ../../nfs/nfs_vnops.c:2710
#8  0xc01b9489 in nfs_sync (mp=0xc0ecee00, waitfor=2, cred=0xc0a5e900, 
p=0xc02955a0) at vnode_if.h:499
#9  0xc016ceaf in sync (p=0xc02955a0, uap=0x0) at
../../kern/vfs_syscalls.c:543
#10 0xc014535a in boot (howto=256) at ../../kern/kern_shutdown.c:205
#11 0xc0145755 in panic () at ../../kern/kern_shutdown.c:450
#12 0xc015c2ca in m_retry (i=0, t=1) at ../../kern/uipc_mbuf.c:269
#13 0xc015c8e7 in m_copym (m=0xc0adf400, off0=0, len=10, wait=0)
at ../../kern/uipc_mbuf.c:450
#14 0xc01b047e in nfs_request (vp=0xca346440, mrest=0xc0ded380, procnum=4, 
procp=0xcac83940, cred=0xc0a5e900, mrp=0xcad93ccc, mdp=0xcad93cd0, 
dposp=0xcad93cd4) at ../../nfs/nfs_socket.c:1024
#15 0xc01b99a9 in nfs_access (ap=0xcad93d84) at ../../nfs/nfs_vnops.c:357
#16 0xc01bb9cf in nfs_lookup (ap=0xcad93e30) at vnode_if.h:219
#17 0xc016930b in lookup (ndp=0xcad93eb4) at vnode_if.h:31
#18 0xc0168d12 in namei (ndp=0xcad93eb4) at ../../kern/vfs_lookup.c:152
#19 0xc016e878 in lstat (p=0xcac83940, uap=0xcad93f90)
at ../../kern/vfs_syscalls.c:1702
#20 0xc020ec26 in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, 
  tf_edi = -1, tf_esi = 0, tf_ebp = -1077947556, tf_isp = -891731996, 
  tf_ebx = 671539952, tf_edx = -1077947508, tf_ecx = 0, tf_eax = 190, 
  tf_trapno = 12, tf_err = 2, tf_eip = 671750724, tf_cs = 31, 
  tf_eflags = 582, tf_esp = -1077947676, tf_ss = 47})
at ../../i386/i386/trap.c:1066
#21 0xc0203f20 in Xint0x80_syscall ()
#22 0x2806afc7 in ?? ()
#23 0x2806b24d in ?? ()
#24 0x2806aa82 in ?? ()
#25 0x804a56e in ?? ()
#26 0x804a246 in ?? ()
#27 0x804acc7 in ?? ()
#28 0x8049bd1 in ?? ()
#29 0x80499c1 in ?? ()
#30 0x80497c9 in ?? ()



Copyright (c) 1992-1999 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
FreeBSD 4.0-CURRENT #0: Fri May  7 02:53:27 CDT 1999
toa...@shell3.dragondata.com:/usr/src/sys/compile/SHELL3
Timecounter "i8254"  frequency 1193182 Hz
CPU: AMD-K6(tm) 3D processor (300.68-MHz 586-class CPU)
  Origin = "AuthenticAMD"  Id = 0x580  Stepping=0
  Features=0x8001bf
real memory  = 268435456 (262144K bytes)
sio0: system console
avail memory = 258457600 (252400K bytes)
Probing for PnP devices:
npx0:  on motherboard
npx0: INT 16 interface
pcib0:  on motherboard
pci0:  on pcib0
chip0:  at device 0.0 on pci0
pcib1:  at device 1.0 on pci0
pci1:  on pcib1
isab0:  at device 7.0 on pci0
ide_pci0:  at device
15.0 on pci0
fxp0:  at device 16.0 on pci0
fxp0: interrupting at irq 12
fxp0: Ethernet address 00:90:27:34:b9:a7
fxp1:  at device 18.0 on pci0
fxp1: interrupting at irq 10
fxp1: Ethernet address 00:90:27:34:c0:12
eisa0:  on motherboard

Re: -current NFS crash (out of mbuf clusters)

1999-05-08 Thread Kevin Day

> :I'm sure by now Matt is gonna kill me. :)
> :
> :-current from 2 days ago.
> :
> :IdlePTD 3096576
> :initial pcb at 27ea40
> :panicstr: Out of mbuf clusters
> :panic messages:
> :---
> :panic: Out of mbuf clusters
> 
> This is probably not NFS related unless there is a leak somewhere.
> 
> You may have to mess with the NMBCLUSTERS kernel config to increase
> the number of mbuf clusters.  FreeBSD tends to not allocate enough
> by default in more heavily loaded larger-memory configurations.
> 
> It should be possible to confirm that the problem is not NFS by taking
> a general look at the state of the system at the time of the crash.  You
> can run 'ps' and 'netstat' on the core dump:
> 
> cd /var/crash
> ps -axl -M vmcore.XX -N kernel.XX

  UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT  TT   TIME COMMAND
0 0 0   0 -18  0 00 sched  DLs   ??0:00.00  (swapper)
0 1 0   0  10  0   5000 wait   Is??0:00.00  (init)
0 2 0   0 -18  0 00 psleep DL??0:00.00  (pagedaemon)
0 3 0   0  18  0 00 psleep DL??0:00.00  (vmdaemon)
0 4 0   0  -1  0 00 nfsrcv DL??0:00.00  (syncer)
039 1  30  18  0   2040 pause  Is??0:00.00  (adjkerntz)
1   233 1  30   2  0   8320 select Is??0:00.00  (portmap)
0   268 1   7  29  0  10240 -  Rs??0:00.00  (cron)
10079   384 1   0  -1  0  20920 nfsrcv D ??0:00.00  (eggdrop)
 1200   703 1   0  -1  0  17400 nfsrcv D ??0:00.00  (eggdrop)
10039   706 1   0  -1  0  16560 nfsrcv D ??0:00.00  (eggdrop)
10173   711 1   0   2  0  18040 select S ??0:00.00  (eggdrop)
10336  1075 1   0  -1  0  19160 nfsrcv D ??0:00.00  (eggdrop)
10051  1245 1   0  -1  0  23560 nfsrcv D ??0:00.00  
(eggdrop-1.3.23)
10467  1686 1   0  -1  0  18200 nfsrcv D ??0:00.00  (eggdrop)
10173  1697 1   0   2  0  17920 select S ??0:00.00  (eggdrop)
10387  1726 1   0   2  0  18000 select S ??0:00.00  (eggdrop)
10387  1727 1   0   2  0  17920 select S ??0:00.00  (eggdrop)
 1279  1743 1   0  -1  0  22280 nfsrcv D ??0:00.00  (eggdrop)
10176  1745 1   6   2  0  24600 select S ??0:00.00  
(eggdrop-1.3.26)
10051  2128 1   0   2  0  11600 select Ss??0:00.00  (ezbounce)
0  2200   268   0  -6  0  10560 piperd I ??0:00.00  (cron)
10002  2206  2200   1  28  0 00 -  Z ??0:00.00  (sh)
0  2548  2200   5  -6  0  13280 piperd I ??0:00.00  (sendmail)
 1292  2602 1  11  10  0   5000 wait   Is??0:00.00  (sh)
10002  2655 1   0   2  0   8280 select Is??0:00.00  (bnc)
 1392  2657 1   0   2  0   8600 select Is??0:00.00  (bnc)
10218  2658 1   0   2  0   8600 select Is??0:00.00  (bnc)
10177  2664 1   0   2  0   8600 select Is??0:00.00  (bnc)
10033  2666 1   0   2  0   8600 select Is??0:00.00  (bnc)
 1294  2667 1   0   2  0   8760 select Is??0:00.00  (bnc)
10452  2673 1   0  -1  0   9560 nfsrcv D ??0:00.00  (mech)
 1292  2688  2602   0  10  0   5040 wait   I ??0:00.00  (sh)
10427  2726 1   0  -1  0  19880 nfsrcv D ??0:00.00  (eggdrop)
 1292  2755  2688   0  -6  0  17400 piperd I ??0:00.00  (eggdrop)
 1339  2762 1   0  -1  0  18520 nfsrcv D ??0:00.00  (BitchX)
 1339  2772 1   0  -1  0  18400 nfsrcv D ??0:00.00  (bnc)
10391  2854 1   0  -1  0  17440 nfsrcv D ??0:00.00  (eggdrop)
10027  2858 1   0  -1  0  16960 nfsrcv D ??0:00.00  (eggdrop)
10027  2859 1   0  -1  0  16960 nfsrcv D ??0:00.00  (eggdrop)
10027  2860 1   0  -1  0  16960 nfsrcv D ??0:00.00  (eggdrop)
 1272  2870 1   0  -1  0  20560 nfsrcv D ??0:00.00  (eggdrop)
10237  2871 1   0   2  0  16880 select S ??0:00.00  (eggdrop)
10169  2872 1   0  -1  0  18360 nfsrcv D ??0:00.00  (eggdrop)
 1405  2874 1   0  -1  0  17000 nfsrcv D ??0:00.00  (eggdrop)
 1285  2875 1   0  -1  0  22920 nfsrcv D ??0:00.00  (eggdrop)
10099  2877 1   0  -1  0  18320 nfsrcv D ??0:00.00  (eggdrop)
10112  2878 1   0  -1  0  19920 nfsrcv D ??0:00.00  (eggdrop)
10239  2879 1   0  -1  0  19080 nfsrcv D ??0:00.00  (eggdrop)
10385  2880 1   0  -1  0  15400 nfsrcv D ??0:00.00  (eggdrop)
 1079  2891 1   0  -1  0  19000 nfsrcv D ??0:00.00  (eggdrop)
10002  2892 1   0  -1  0  21280 nfsrcv D ??0:00.00  (eggdrop)
10428  2900 1   0  -1  0  17800 nfsrcv D ??0:00.00  (eggdrop)
10428  2901 1   0  -1  0  1764
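
A listing like this is easier to judge when the processes are tallied by wait
channel. The sketch below assumes clean, whitespace-separated `ps -axl`
output in which WCHAN is the ninth column (the listing as archived here has
some columns run together, so it would not parse as-is). The helper name
wchan_counts is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -axl` style output on stdin, skip the header
# line, and print "count wchan" pairs, most common wait channel first.
wchan_counts() {
    awk 'NR > 1 { n[$9]++ } END { for (w in n) print n[w], w }' | sort -rn
}

# Intended use against a crash dump, with the same flags as in the thread:
#   ps -axl -M vmcore.XX -N kernel.XX | wchan_counts
```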

Re: -current NFS crash (out of mbuf clusters)

1999-05-08 Thread Kevin Day



Erm, sorry guys, that huge message wasn't intended to go back to -current,
just Matt. 

My apologies. :)

Kevin



> > :I'm sure by now Matt is gonna kill me. :)
> > :
> > :-current from 2 days ago.
> > :
> > :IdlePTD 3096576
> > :initial pcb at 27ea40
> > :panicstr: Out of mbuf clusters
> > :panic messages:
> > :---
> > :panic: Out of mbuf clusters
> > 
> > This is probably not NFS related unless there is a leak somewhere.
> > 
> > You may have to mess with the NMBCLUSTERS kernel config to increase
> > the number of mbuf clusters.  FreeBSD tends to not allocate enough
> > by default in more heavily loaded larger-memory configurations.
> > 
> > It should be possible to confirm that the problem is not NFS by taking
> > a general look at the state of the system at the time of the crash.  You
> > can run 'ps' and 'netstat' on the core dump:
> > 
> > cd /var/crash
> > ps -axl -M vmcore.XX -N kernel.XX
> 
>   UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT  TT   TIME COMMAND
> 0 0 0   0 -18  0 00 sched  DLs   ??0:00.00  (swapper)
> 0 1 0   0  10  0   5000 wait   Is??0:00.00  (init)



Re: -current NFS crash (out of mbuf clusters)

1999-05-08 Thread Kevin Day

> :> netstat -m -M vmcore.XX -N kernel.XX
> :> 
> :
> :1014/2144 mbufs in use:
> : 714 mbufs allocated to data
> : 300 mbufs allocated to packet headers
> :638/1324/1536 mbuf clusters in use (current/peak/max)
> :2916 Kbytes allocated to network (48% in use)
> :0 requests for memory denied
> :0 requests for memory delayed
> :0 calls to protocol drain routines
> :
> :What does this tell you?
> :
> :Kevin
> 
> It tells me your userbase is out of control :-)  From the looks
> of it, hundreds of cron jobs are starting up simultaniously
> and overloading some system resource.
> 

Yeah, I wrote a patch to cron a while back that wouldn't allow that to
happen, but it didn't apply cleanly to 4.0's cron, so I'm going to go check
why. :)

(It staggered the requests, only allowing x to run per quantum.)
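
The staggering idea can be sketched roughly as follows. This is not the cron
patch itself: stagger_delay and its arguments are hypothetical names, and it
assumes the jobs due at the same moment are numbered from 0 and that at most
per_quantum of them may start in each window of quantum_secs seconds.

```shell
#!/bin/sh
# Hypothetical helper: print how many seconds job number INDEX (counting
# from 0) should wait so that at most PER_QUANTUM jobs start in each window
# of QUANTUM_SECS seconds. Integer division picks the window for the job.
stagger_delay() {
    sd_index=$1
    sd_per_quantum=$2
    sd_quantum_secs=$3
    echo $(( sd_index / sd_per_quantum * sd_quantum_secs ))
}

# Example: with 5 jobs per 60-second quantum, jobs 0-4 start immediately,
# jobs 5-9 wait 60 seconds, jobs 10-14 wait 120 seconds, and so on.
```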

> I would also recommend:
> 
>   vmstat -m -M vmcore.XX -N kernel.XX
> 

Memory statistics by bucket size
Size   In Use   Free   Requests  HighWater  Couldfree
  16  883141   820528161280  0
  32 7569   9711 561955 640253
  6427568   1744   87910913 320253
 128 1526202   80132397 1601021415
 25620096   3616 244895  80151
 512   78  2   1911  40  0
  1K  277135  13673  20314
  2K   32 12173  10 67
  4K6  1   4585   5  0
  8K0  1  2   5  0
 16K3  0  3   5  0
 32K4  0  4   5  0
 64K5  0  5   5  0
128K1  0  1   5  0
256K1  0  1   5  0
512K0  0  2   5  0

Memory usage type by bucket size
Size  Type(s)
  16  devbuf, temp, proc, sysctl, rman, soname, pcb, vnodes, ether_multi,
  routetbl, isa_devlist, atkbddev, devbuf, temp, proc, sysctl, rman,
  soname, pcb, vnodes, ether_multi, routetbl, isa_devlist, atkbddev,
  devbuf, temp, proc, sysctl, rman, soname, pcb, vnodes, ether_multi,
  routetbl
  32  kld, sigio, devbuf, temp, pgrp, subproc, sysctl, SWAP, soname, pcb,
  cluster_save buffer, vnodes, ifaddr, ether_multi, routetbl, in_multi,
  NFS req, kld, sigio, devbuf, temp, pgrp, subproc, sysctl, SWAP,
  soname, pcb, cluster_save buffer, vnodes, ifaddr, ether_multi,
  routetbl, in_multi, NFS req, kld, sigio, devbuf, temp, pgrp, subproc,
  sysctl, SWAP, soname, pcb, cluster_save buffer, vnodes, ifaddr,
  ether_multi, routetbl, in_multi, NFS req
  64  file, lockf, namecache, devbuf, temp, session, rman, soname, pcb,
  cluster_save buffer, vnodes, ifaddr, ether_multi, routetbl, NFS req,
  file, lockf, namecache, devbuf, temp, session, rman, soname, pcb,
  cluster_save buffer, vnodes, ifaddr, ether_multi, routetbl, NFS req,
  file, lockf, namecache, devbuf, temp, session, rman, soname, pcb,
  cluster_save buffer, vnodes, ifaddr, ether_multi, routetbl, NFS req
 128  isadev, kld, timecounter, file desc, zombie, namecache, devbuf, temp,
  cred, ttys, soname, vnodes, ifaddr, routetbl, ZONE, isadev, kld,
  timecounter, file desc, zombie, namecache, devbuf, temp, cred, ttys,
  soname, vnodes, ifaddr, routetbl, ZONE, isadev, kld, timecounter,
  file desc, zombie, namecache, devbuf, temp, cred, ttys, soname,
  vnodes, ifaddr, routetbl, ZONE
 256  file desc, devbuf, temp, proc, subproc, vnodes, ifaddr, routetbl,
  NFS srvsock, NFS daemon, FFS node, file desc, devbuf, temp, proc,
  subproc, vnodes, ifaddr, routetbl, NFS srvsock, NFS daemon, FFS node,
  file desc, devbuf, temp, proc, subproc, vnodes, ifaddr, routetbl,
  NFS srvsock, NFS daemon, FFS node
 512  file desc, devbuf, temp, ioctlops, BIO buffer, mount, NFSV3 diroff,
  UFS mount, isa_devlist, file desc, devbuf, temp, ioctlops,
  BIO buffer, mount, NFSV3 diroff, UFS mount, isa_devlist, file desc,
  devbuf, temp, ioctlops, BIO buffer, mount, NFSV3 diroff, UFS mount
  1K  devbuf, temp, proc, BIO buffer, NQNFS Lease, devbuf, temp, proc,
  BIO buffer, NQNFS Lease, devbuf, temp, proc, BIO buffer, NQNFS Lease
  2K  devbuf, temp, pcb, BIO buffer, UFS mount, mbuf, isa_devlist, devbuf,
  temp, pcb, BIO buffer, UFS mount, mbuf, isa_devlist, devbuf, temp,
  pcb, BIO buffer, UFS mount
  4K  devbuf, temp, UFS mount, devbuf, temp, UFS mount, devbuf, temp,
  UFS mount
  8K  temp, temp, temp
 16K  devbuf, devbuf, devbuf
 32K  devbuf, temp, MSDOSFS mount, devbuf, temp, MSDOSFS mount, devbuf,
  temp, MSDOSFS mount
 64K  ISOFS mount, NFS hash, UFS ihash, UFS quota, VM pgdata, ISOFS mount,
  NFS hash, UFS ihash, UFS quota, VM pgdata, ISOFS mount, NFS hash,
  UFS ihash, UFS quota, VM pgdata
128K  name

Incorrect memory sizes reported

1999-05-11 Thread Kevin Day


I'm not sure if this is related to the bug I found in 3.1, regarding mmapping
devices, then forking, but with my -current NFS server:

  PID USERNAME PRI NICE  SIZERES STATETIME   WCPUCPU COMMAND
  139 root   2   0   257M   452K select   0:00  0.00%  0.00% rpc.statd

257M? :) ps shows similar info...


Kevin



Re: Incorrect memory sizes reported

1999-05-11 Thread Kevin Day

> 
>   This is normal. It's using a lot of virtual memory. Fortunately, virtual
> memory is cheap.
> 
>   DS
> 
> > I'm not sure if this is related to the bug I found in 3.1,
> > regarding mmaping
> > devices, then forking, but with my -current NFS server:
> >
> >   PID USERNAME PRI NICE  SIZERES STATETIME   WCPUCPU COMMAND
> >   139 root   2   0   257M   452K select   0:00  0.00%  0.00% rpc.statd
> >
> > 257M? :) ps shows similar info...
> >
> >
> > Kevin
> 
> 

Ok, I stand corrected then 

I hadn't seen this before...

2.2.8:
root 14127  0.0  0.1   176  492  ??  Ss5:14PM0:00.00 rpc.statd

3.1:
root 853  0.0  0.7  172  416  ??  Ss7:18AM   0:00.00 rpc.statd


There is still the issue I described a while back, though, where children
show negative numbers in 'size'; those I can confirm aren't actually using
that much VM.


Kevin



-current page fault at 0xdeadc0de

1999-05-12 Thread Kevin Day



I had two completely unrelated systems reboot at nearly the same time (30
seconds apart).

One system was running 2.2.8, and my core file presents me with this:

su-2.02# gdb -k
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (i386-unknown-freebsd), Copyright 1996 Free Software Foundation,
Inc.
(kgdb) exec-file kernel.0
(kgdb) symbol-file kernel.0.debug
Reading symbols from kernel.0.debug...done.
(kgdb) core-file vmcore.0
IdlePTD 24a000
current pcb at 202bfc
#0  0x14 in ?? ()
(kgdb) bt
#0  0x14 in ?? ()
#1  0x3404 in ?? ()
Cannot access memory at address 0x7205c76a.

Were things just trashed, or am I doing something wrong?


The other system was running -current, and gives me:

su-2.02# gdb -k
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (i386-unknown-freebsd), Copyright 1996 Free Software Foundation,
Inc.
(kgdb) exec-file kernel.2
(kgdb) symbol-file kernel.2.debug
Reading symbols from kernel.2.debug...done.
(kgdb) core-file vmcore.2
IdlePTD 3096576
initial pcb at 27ea40
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xdeadc0de
fault code  = supervisor read, page not present
instruction pointer = 0x8:0xdeadc0de
stack pointer   = 0x10:0xcb4adec0
frame pointer   = 0x10:0xcb4adefc
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 40969 (eggdrop)
interrupt mask  = 
trap number = 12
panic: page fault

syncing disks... 

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xdeadc126
fault code  = supervisor read, page not present
instruction pointer = 0x8:0xc018e3d8
stack pointer   = 0x10:0xcb4ad91c
frame pointer   = 0x10:0xcb4ad93c
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 40969 (eggdrop)
interrupt mask  = 
trap number = 12
panic: page fault

dumping to dev 20001, offset 467137
dump 255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238
237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219
218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200
199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181
180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162
161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143
142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124
123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105
104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81
80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56
55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31
30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3
2 1 
---
#0  boot (howto=260) at ../../kern/kern_shutdown.c:288
288 dumppcb.pcb_cr3 = rcr3();
(kgdb) bt
#0  boot (howto=260) at ../../kern/kern_shutdown.c:288
#1  0xc0145755 in panic () at ../../kern/kern_shutdown.c:450
#2  0xc020e9e2 in trap_fatal (frame=0xcb4ad8dc, eva=3735929126)
at ../../i386/i386/trap.c:917
#3  0xc020e695 in trap_pfault (frame=0xcb4ad8dc, usermode=0, eva=3735929126)
at ../../i386/i386/trap.c:810
#4  0xc020e2d7 in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16, tf_edi =
0, 
  tf_esi = 0, tf_ebp = -884287172, tf_isp = -884287224, tf_ebx = 16384, 
  tf_edx = -559038242, tf_ecx = -1059309536, tf_eax = -1053816960, 
  tf_trapno = 12, tf_err = 0, tf_eip = -1072110632, tf_cs = 8, 
  tf_eflags = 66182, tf_esp = -1062703744, tf_ss = -911937724})
at ../../i386/i386/trap.c:436
(kgdb) 




Not exactly a lot to go on... Mean anything to anyone? Any more info I can
provide?
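
One thing that can be read out of the trap frame: gdb prints the saved
registers as signed 32-bit integers, so converting them back to hex makes the
0xdeadc0de pattern visible. A small sketch, assuming a shell whose arithmetic
is wider than 32 bits (true of common sh implementations on current systems);
to_hex32 is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: print a (possibly negative) decimal value as the
# unsigned 32-bit hex number it represents.
to_hex32() {
    printf '%08x\n' $(( $1 & 0xffffffff ))
}

# Values taken from the backtrace above:
#   to_hex32 -559038242     # tf_edx
#   to_hex32 3735929126     # eva
#   to_hex32 -1072110632    # tf_eip
```

With those inputs, tf_edx comes out as deadc0de, eva as deadc126 (the fault
address in the second trap message), and tf_eip as c018e3d8 (the instruction
pointer in the second trap message). So the second fault looks like a read at
offset 0x48 from a pointer holding 0xdeadc0de, a value often used to poison
freed memory, though that interpretation is only a guess from the numbers.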


Kevin



Alladdin IDE slow?

1999-05-14 Thread Kevin Day


I'm using an Aladdin chipset in a -current machine...

CPU: AMD-K6(tm) 3D processor (337.19-MHz 586-class CPU)
  Origin = "AuthenticAMD"  Id = 0x580  Stepping=0
  Features=0x8001bf
real memory  = 134217728 (131072K bytes)
avail memory = 126808064 (123836K bytes)
chip0:  at device 0.0 on pci0
pcib1:  at device 1.0 on pci0
pci1:  on pcib1
isab0:  at device 7.0 on pci0
ata-pci0:  at device 15.0 on pci0
ata-pci0: Busmastering DMA supported
ata0 at 0x01f0 irq 14 on ata-pci0
ata1 at 0x0170 irq 15 on ata-pci0
ata0: master: settting up UDMA2 mode on Aladdin chip OK
ad0:  ATA-3 disk at ata0 as master
ad0: 3079MB (6306048 sectors), 6256 cyls, 16 heads, 63 S/T, 512 B/S
ad0: piomode=4, dmamode=2, udmamode=2
ad0: 16 secs/int, 0 depth queue, DMA mode
ata0: slave: settting up UDMA2 mode on Aladdin chip OK
ad1:  ATA-3 disk at ata0 as slave 
ad1: 3079MB (6306048 sectors), 6256 cyls, 16 heads, 63 S/T, 512 B/S
ad1: piomode=4, dmamode=2, udmamode=2
ad1: 16 secs/int, 0 depth queue, DMA mode
ata1: master: settting up UDMA2 mode on Aladdin chip OK
ad2:  ATA-3 disk at ata1 as master
ad2: 3079MB (6306048 sectors), 6256 cyls, 16 heads, 63 S/T, 512 B/S
ad2: piomode=4, dmamode=2, udmamode=2
ata1: slave: settting up UDMA2 mode on Aladdin chip OK
ad3:  ATA-4 disk at ata1 as slave
ad3: 16479MB (33750864 sectors), 33483 cyls, 16 heads, 63 S/T, 512 B/S
ad3: piomode=4, dmamode=2, udmamode=2
ad3: 16 secs/int, 0 depth queue, DMA mode



ad3 is the one getting the heaviest use from me... However, I've noticed a
few things since I switched to the ata driver from a 3.1 kernel using the wd0
driver.

The drive is now much slower... While I don't have numbers either way, this
system acts as an NFS server. Not only are the NFS clients acting slower
after my switch, but nearly all my nfsd's are sitting in biord or biowr now,
where before they were usually idle.

Also, the IDE LED on the case/motherboard is now acting kinda erratic. I can
hear the HD doing accesses when the light is off, and at times the light
seems to stay on for 2-3 seconds, when there's no activity. (This didn't
happen under wd0)...

Is this a case of DMA just not working well for me, or is there a magic flag
I'm missing? This is -current from about a week ago.


Kevin





-Current still leaking mbuf's

1999-05-27 Thread Kevin Day


I've got two systems that panic about every 48 hours, saying they're out of
mbufs. I've tried raising maxusers. (It's at 128 now, but I've gone up to
256 and still seen the same thing).

I believe it's a leak, since it's pretty consistent how long it will stay up
before it runs out.

I've tried raising NMBCLUSTERS, but this just seems to prolong it before it
finally panics.

The only unusual thing about these two machines is that they're very heavy
NFS client users.

Is there anything any of you would like to see, if someone's willing to try
to debug this? vmstat -m doesn't show anything too out of the ordinary, but
I've got several coredumps waiting. :)

Kevin



Re: -Current still leaking mbuf's

1999-05-27 Thread Kevin Day

> On Thu, 27 May 1999, Kevin Day wrote:
> 
> > I've got two systems that panic about every 48 hours, saying they're out of
> > mbuf's. I've tried raising maxusers. (It's at 128 now, but i've gone up to
> > 256 and still seen the same thing).
> >
> > I believe it's a leak, since it's pretty consistant how long it will stay up
> > before it runs out.
> > 
> > I've tried raising NBMCLUSTERs, but this just seems to prolong it before it
> > finally panic's.
> 
> How high do you have it set?
> 

I tried doubling whatever value maxusers at 256 sets it to. (I can get the
exact number later.) I'm running with no NMBCLUSTERS setting, just with
maxusers at 128 at the moment.
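
For reference, a sketch of how the default would follow from maxusers. It
assumes the kernel's fallback of this era is NMBCLUSTERS = 512 + 16 *
maxusers. That formula is an assumption not stated in this thread, though it
agrees with the max of 2560 clusters that netstat -m reports below for
maxusers 128. The helper name default_nmbclusters is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: default mbuf cluster limit for a given maxusers,
# ASSUMING the formula 512 + 16 * maxusers (not confirmed in this thread).
default_nmbclusters() {
    echo $(( 512 + $1 * 16 ))
}

# Example:
#   default_nmbclusters 128
#   default_nmbclusters 256
```

Under that assumption, doubling the maxusers 256 default would mean an
explicit setting of twice whatever `default_nmbclusters 256` prints.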

> You might want to collect some netstat -m stats as time goes on.  In
> addition to being easier to read, it may give you some hints as to how
> high you want to go with mbuf clusters.

I added a cron job to run netstat -m every half hour... Right now, after 10
hours of being up:

494/2624 mbufs in use:
160 mbufs allocated to data
334 mbufs allocated to packet headers
130/1686/2560 mbuf clusters in use (current/peak/max)
3700 Kbytes allocated to network (8% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines
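
A possible shape for such a logging job, as a sketch only: the helper below
pulls the current/peak/max cluster counts out of `netstat -m` text so they
can be logged as one line per sample. The name mbuf_clusters and the log path
in the comment are hypothetical, and the pattern assumes the output format
shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -m` output on stdin and print
# "current peak max" from the "mbuf clusters in use" line.
mbuf_clusters() {
    awk '/mbuf clusters in use/ { split($1, f, "/"); print f[1], f[2], f[3] }'
}

# Intended use from a script run by cron (log path is only an example):
#   echo "$(date '+%Y-%m-%d %H:%M') $(netstat -m | mbuf_clusters)" \
#       >> /var/log/mbuf-clusters.log
```

A steadily climbing first column between samples would support the leak
theory; a flat one with occasional spikes would point more toward bursts.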

> > The only unusual thing about these two machines are that they're very heavy
> > NFS client users.
> 
> That might do it by itself irrespective of any bugs.
> 

This didn't happen in 2.2.8 or 3.1, so I'm trying to figure out what's
causing it. :)

Here's a typical panic:

IdlePTD 3096576
initial pcb at 27ea40
panicstr: Out of mbuf clusters
panic messages:
---
panic: Out of mbuf clusters

syncing disks... panic: Out of mbuf clusters

dumping to dev 20001, offset 467137
dump 255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238
237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219
218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200
199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181
180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162
161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143
142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124
123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105
104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81
80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56
55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31
30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3
2 1 
---
#0  boot (howto=260) at ../../kern/kern_shutdown.c:288
288 dumppcb.pcb_cr3 = rcr3();
(kgdb) bt
#0  boot (howto=260) at ../../kern/kern_shutdown.c:288
#1  0xc0145755 in panic () at ../../kern/kern_shutdown.c:450
#2  0xc015c2ca in m_retry (i=0, t=1) at ../../kern/uipc_mbuf.c:269
#3  0xc01b2c77 in nfsm_reqh (vp=0xcb6bddc0, procid=21, hsiz=68, 
bposp=0xcb2cecdc) at ../../nfs/nfs_subs.c:599
#4  0xc01c8e13 in nfs_commit (vp=0xcb6bddc0, offset=0, cnt=8192, 
cred=0xc13b0200, procp=0xc02955a0) at ../../nfs/nfs_vnops.c:2580
#5  0xc01c9620 in nfs_flush (vp=0xcb6bddc0, cred=0xc0a5f900, waitfor=2, 
p=0xc02955a0, commit=1) at ../../nfs/nfs_vnops.c:2846
#6  0xc01c9389 in nfs_fsync (ap=0xcb2cedfc) at ../../nfs/nfs_vnops.c:2710
#7  0xc01b9489 in nfs_sync (mp=0xc113bc00, waitfor=2, cred=0xc0a5f900, 
p=0xc02955a0) at vnode_if.h:499
#8  0xc016ceaf in sync (p=0xc02955a0, uap=0x0)
at ../../kern/vfs_syscalls.c:543
#9  0xc014535a in boot (howto=256) at ../../kern/kern_shutdown.c:205
#10 0xc0145755 in panic () at ../../kern/kern_shutdown.c:450
#11 0xc015c382 in m_retryhdr (i=0, t=1) at ../../kern/uipc_mbuf.c:297
#12 0xc015de2b in sosend (so=0xc9a55000, addr=0x0, uio=0xcb2cef00, top=0x0, 
control=0x0, flags=0, p=0xcb2692e0) at ../../kern/uipc_socket.c:499
#13 0xc016093f in sendit (p=0xcb2692e0, s=5, mp=0xcb2cef40, flags=0)
at ../../kern/uipc_syscalls.c:514
#14 0xc0160a2d in sendto (p=0xcb2692e0, uap=0xcb2cef90)
at ../../kern/uipc_syscalls.c:564
#15 0xc020ec26 in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, 
  tf_edi = -1077951456, tf_esi = 0, tf_ebp = -1077951564, 
  tf_isp = -886247452, tf_ebx = 538075232, tf_edx = 682064, tf_ecx = 0, 
  tf_eax = 133, tf_trapno = 7, tf_err = 7, tf_eip = 537941473, tf_cs = 31, 
  tf_eflags = 534, tf_esp = -1077951596, tf_ss = 47})
at ../../i386/i386/trap.c:1066
#16 0x7 in ?? ()
(kgdb) 
#10 0xc0145755 in panic () at ../../kern/kern_shutdown.c:450
450 boot(bootopt);
(kgdb) 
#11 0xc015c382 in m_retryhdr (i=0, t=1) at ../../kern/uipc_mbuf.c:297
297 panic("Out of mbuf clusters");


Look

More NFS woes

1999-06-10 Thread Kevin Day



Grabbed another -current, and I'm still seeing a few problems that Matt
and others haven't solved. I'm not pushing anyone, just reminding everyone
that these are still here, and still problems.


1) The 'inode/vmopar' lockup that Matt is aware of, and apparently tracked
down.

2) Processes starting to run away, doing this:

nfs_getpages: error 70
vm_fault: pager read error, pid 1251 (eggdrop)
nfs_getpages: error 70
vm_fault: pager read error, pid 1251 (eggdrop)
nfs_getpages: error 70
vm_fault: pager read error, pid 1251 (eggdrop)
nfs_getpages: error 70
vm_fault: pager read error, pid 1251 (eggdrop)
nfs_getpages: error 70

No, I don't know what the user in question did to make this happen, if
anything. The process was eating about 70% CPU when I killed it; syslogd was
eating the other 30% logging all this. :)
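
To see which processes are affected and which error codes show up, the log lines above can be tallied. This is a rough sketch, assuming the message formats quoted above; the one-entry errno table reflects FreeBSD's sys/errno.h, where 70 is ESTALE, and the helper name summarize is hypothetical:

```python
import re
from collections import Counter

# Value taken from FreeBSD's <sys/errno.h>; other systems number ESTALE differently.
FREEBSD_ERRNO_NAMES = {70: "ESTALE (stale NFS file handle)"}

FAULT_RE = re.compile(r"vm_fault: pager read error, pid (\d+) \(([^)]+)\)")
ERROR_RE = re.compile(r"nfs_getpages: error (\d+)")

def summarize(log_text):
    """Count pager faults per (pid, command) and NFS errors per error code."""
    faults = Counter((int(pid), name) for pid, name in FAULT_RE.findall(log_text))
    errors = Counter(int(code) for code in ERROR_RE.findall(log_text))
    return faults, errors

log = """nfs_getpages: error 70
vm_fault: pager read error, pid 1251 (eggdrop)
nfs_getpages: error 70
vm_fault: pager read error, pid 1251 (eggdrop)"""

faults, errors = summarize(log)
for code, count in errors.items():
    print(count, "x", FREEBSD_ERRNO_NAMES.get(code, f"errno {code}"))
```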


3) Weirdly high load averages. I have two systems with similar hardware,
running similar jobs. System A runs 2.2.8, and has about 300 processes
running. System B runs -current, and has about 250 processes running. The
processes are doing virtually the same things, and both are heavy NFS
clients. System A's load average is about 0.10; System B's load average
hovers around 3.0-5.0.

System A:
last pid:  1933;  load averages:  0.31,  0.10,  0.11
CPU states:  6.2% user,  0.0% nice,  1.9% system,  1.6% interrupt, 90.3% idle
302 processes: 1 running, 294 sleeping, 7 zombie

System B:
last pid: 77084;  load averages:  3.64,  3.70,  4.00
CPU states:  7.0% user,  0.0% nice,  2.7% system,  0.4% interrupt, 89.9% idle
256 processes: 1 running, 254 sleeping, 1 zombie

Has something changed in the load average calculation, between 2.x and
-current, or is there something actually different going on here?
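
For reference, BSD-style load averages are exponentially damped averages of a periodically sampled process count. The sketch below is a floating-point illustration of that idea, not the kernel's fixed-point code; the 5-second tick and the names update_loadavg, TICK_SECONDS and WINDOWS are assumptions made for the example. If the sampled count also includes processes in short uninterruptible sleeps (as I believe 4.4BSD-derived kernels do), a mostly idle CPU with a load of 3-5 would be consistent with a handful of processes regularly blocked on NFS.

```python
import math

TICK_SECONDS = 5            # assumed sampling interval
WINDOWS = (60, 300, 900)    # 1-, 5- and 15-minute averages, in seconds

def update_loadavg(avgs, nrun):
    """Apply one sampling tick, decaying each average toward nrun."""
    out = []
    for avg, window in zip(avgs, WINDOWS):
        decay = math.exp(-TICK_SECONDS / window)
        out.append(avg * decay + nrun * (1.0 - decay))
    return out

# Four processes counted at every sample for one hour.
avgs = [0.0, 0.0, 0.0]
for _ in range(12 * 60):
    avgs = update_loadavg(avgs, 4)
print(["%.2f" % a for a in avgs])
```

With a constant sampled count, all three averages drift toward that count, so a steady 3.0-5.0 suggests roughly that many processes being counted at most samples.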

The only hardware differences are that system A uses two de0 cards, and
system B uses two fxp0 cards. System A is a PII, and system B is a K6-2.
(similar speeds)

Kevin



qt30 build under -CURRENT fails in rtld

2002-07-08 Thread Kevin Day



I'm not sure if this is a known problem, but I sent this to the maintainer
of the qt30 port, who suggested I post this here. I couldn't find anything
related in the archives about this problem.



I'm attempting to build qt30 (for kde3) under -CURRENT (ports and
kernel/userland from yesterday).

It's dying in:

gmake[3]: Entering directory 
`/usr/ports/x11-toolkits/qt30/work/qt-x11-free-3.0.3/tools/designer/designer'
/usr/ports/x11-toolkits/qt30/work/qt-x11-free-3.0.3/bin/uic dbconnections.ui -o 
dbconnections.h
gmake[3]: *** [dbconnections.h] Bus error (core dumped)
gmake[3]: Leaving directory 
`/usr/ports/x11-toolkits/qt30/work/qt-x11-free-3.0.3/tools/designer/designer'
gmake[2]: *** [sub-designer] Error 2
gmake[2]: Leaving directory 
`/usr/ports/x11-toolkits/qt30/work/qt-x11-free-3.0.3/tools/designer'
gmake[1]: *** [sub-designer] Error 2
gmake[1]: Leaving directory `/usr/ports/x11-toolkits/qt30/work/qt-x11-free-3.0.3/tools'
gmake: *** [sub-tools] Error 2
*** Error code 2

Stop in /usr/ports/x11-toolkits/qt30.


# gdb
GNU gdb 5.2.0 (FreeBSD) 20020627
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-undermydesk-freebsd".
(gdb) exec-file ../../../bin/uic
(gdb) core-file uic.core
Core was generated by `uic'.
Program terminated with signal 10, Bus error.
Reading symbols from /usr/lib/libz.so.2...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libz.so.2
Reading symbols from 
/usr/ports/x11-toolkits/qt30/work/qt-x11-free-3.0.3/lib/libqt-mt.so.3...(no debugging 
symbols found)...done.
Loaded symbols for 
/usr/ports/x11-toolkits/qt30/work/qt-x11-free-3.0.3/lib/libqt-mt.so.3
Reading symbols from /usr/X11R6/lib/libICE.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/X11R6/lib/libICE.so.6
Reading symbols from /usr/X11R6/lib/libSM.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/X11R6/lib/libSM.so.6
Reading symbols from /usr/X11R6/lib/libXext.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/X11R6/lib/libXext.so.6
Reading symbols from /usr/X11R6/lib/libX11.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/X11R6/lib/libX11.so.6
Reading symbols from /usr/X11R6/lib/libXrender.so.1...(no debugging symbols 
found)...done.
Loaded symbols for /usr/X11R6/lib/libXrender.so.1
Reading symbols from /usr/X11R6/lib/libXft.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/X11R6/lib/libXft.so.1
Reading symbols from /usr/local/lib/libfreetype.so.9...(no debugging symbols 
found)...done.
Loaded symbols for /usr/local/lib/libfreetype.so.9
Reading symbols from /usr/lib/libstdc++.so.4...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libstdc++.so.4
Reading symbols from /usr/lib/libm.so.2...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libm.so.2
Reading symbols from /usr/lib/libc_r.so.5...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libc_r.so.5
Reading symbols from /usr/lib/libc.so.5...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libc.so.5
Reading symbols from /usr/local/lib/libmng.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/local/lib/libmng.so.1
Reading symbols from /usr/local/lib/libjpeg.so.9...(no debugging symbols found)...done.
Loaded symbols for /usr/local/lib/libjpeg.so.9
Reading symbols from /usr/local/lib/libpng.so.5...(no debugging symbols found)...done.
Loaded symbols for /usr/local/lib/libpng.so.5
Reading symbols from /usr/X11R6/lib/libXThrStub.so.6...(no debugging symbols 
found)...done.
Loaded symbols for /usr/X11R6/lib/libXThrStub.so.6
Reading symbols from /usr/local/lib/liblcms.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/local/lib/liblcms.so.1
Reading symbols from /usr/libexec/ld-elf.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/libexec/ld-elf.so.1
#0  0x28099094 in reloc_non_plt () from /usr/libexec/ld-elf.so.1
(gdb) bt
#0  0x28099094 in reloc_non_plt () from /usr/libexec/ld-elf.so.1
#1  0x28096a4e in find_symdef () from /usr/libexec/ld-elf.so.1
#2  0x28095602 in _rtld () from /usr/libexec/ld-elf.so.1


I'm building this on an extremely slow system, which took a better part of
today to get this far, so I haven't rebuilt everything with -g yet. Is this
a known problem? If not, I can attempt to rebuild with -g to get a full
backtrace and symbols if needed.

-- Kevin


Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)

2018-05-18 Thread Kevin Day


> On Apr 18, 2018, at 1:42 PM, John Baldwin  wrote:
>> 
>> Chenged made for it was
>> 
>> Index: sys/x86/x86/nexus.c
>> ===
>> --- sys/x86/x86/nexus.c (revision 332663)
>> +++ sys/x86/x86/nexus.c (working copy)
>> @@ -698,7 +698,7 @@
>> {
>> 
>>if (rman_manage_region(&irq_rman, irq, irq) != 0)
>> -   panic("%s: failed", __func__);
>> +   panic("%s: failed irq is: %lu", __func__, irq);
>> }
> 
> O, this is a different issue.  Sorry.  As a hack, try changing
> 'FIRST_MSI_INT' to 512 in sys/amd64/include/intr_machdep.h.  The issue
> is that some systems now include more than 256 interrupt pins on I/O
> APICs, so IRQ 256 is already reserved for use by one of those
> interrupt pins.  The real fix is that I need to make FIRST_MSI_INT
> dynamic instead of a constant and just define it as the first free IRQ
> after the I/O APICs have probed.

I'm testing a very large AMD Epyc system, and I had to change FIRST_MSI_INT to 
768, but that fixed this issue for me.
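
To illustrate why a fixed FIRST_MSI_INT can collide, and what "first free IRQ after the I/O APICs have probed" would mean, here is a small sketch. The I/O APIC layout is invented for the example (it is not taken from the H11DSi or from my system), and the helper names are hypothetical rather than kernel functions:

```python
def first_free_irq(ioapics):
    """ioapics: list of (first_irq, pin_count) pairs, one per I/O APIC."""
    return max((base + pins for base, pins in ioapics), default=0)

def msi_base_collides(ioapics, first_msi_int):
    """True if a fixed MSI base falls inside the range used by I/O APIC pins."""
    return first_msi_int < first_free_irq(ioapics)

# Invented layout: five I/O APICs whose pins extend past IRQ 512.
layout = [(0, 24), (128, 32), (256, 32), (384, 32), (512, 32)]

for candidate in (256, 512, 768):
    state = "collides" if msi_base_collides(layout, candidate) else "ok"
    print(candidate, state)
print("dynamic base would be", first_free_irq(layout))
```

With a layout like this, 256 and 512 both land on pins that are already reserved, which matches the general shape of what I saw; the real fix John describes amounts to computing the base instead of hard-coding it.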


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
