Re: Unkillable processes
> I've got myself two processes which can't be gotten rid of by SIGKILL:
>
> kkenn 92724 32.0  0.8  5736  356  ??  RN    6:25PM 136:52.96 kvt -T Terminal -
> kkenn  1103  0.0  0.0  5740  388  ??  TWN   -        0:00.00 (kvt)
>
> (kvt is the KDE 1.1.1 xterm)
>
> I am able to trigger this by attempting to paste the contents of a large
> buffer from xemacs (v21.1 from ports) into the pico editor from pine4.
>
> Any ideas before I recompile kvt with -g and try and track down what it's
> doing?
>
> Kris
>

For one, do another 'ps' with the 'l' option, so you can see what it's stuck
on.

The second process is a zombie, which isn't killable until the parent tells
it to go away. (Which could very possibly be the first kvt)

Kevin

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message
Re: Unkillable processes
> On Sat, 24 Jul 1999, Kevin Day wrote:
>
> > For one, do another 'ps' with the 'l' option, so you can see what it's stuck
> > on.
>
>   UID   PID  PPID CPU PRI NI  VSZ RSS WCHAN  STAT  TT       TIME COMMAND
>  1000  1103  1086  29  75 20 5740 384 -      TWN   ??    0:00.00 (kvt)
>  1000  1109  1103   0   4  0 1504   0 ttywri IWs+  p1    0:00.00 (tcsh)
>
>  1000 92724  1086 279 105 20 5736 356 -      RN    ??  139:40.13 kvt -T Termi
>  1000 92743 92724   2  18  0 1576   0 pause  IWs   p8    0:00.00 (tcsh)
>
> > The second process is a zombie, which isn't killable until the parent tells
> > it to go away. (Which could very possibly be the first kvt)
>
> Both still present empty terminal windows on my desktop and were spawned
> from the KDE panel. The second one was running a copy of pine and was in
> the same state as the other initially, until I kill -KILL'ed the pine
> process, at which point it changed to what it is now.
>
> Kris

Well, since the CPU time in the active process (92724) went up since your
last e-mail, and it's in the RUN state (a - in the WCHAN and an R in the
STAT), it looks like the process is just spinning, eating CPU.

The tcsh listed below that is a zombie of the running kvt. If you can
somehow kill that kvt, the tcsh will go away.

The top kvt (1103) is also a zombie, waiting for its parent to reap it.
Whatever process 1086 is, it decided not to clean it up; you may want to see
what it's doing.

Will process 92724 die if you kill -9 it?

This seems to be more of a kvt bug than a FreeBSD bug. :)

Kevin
Re: Unkillable processes
> On Saturday, 24 July 1999 at 20:51:37 -0500, Kevin Day wrote:
> >> On Sat, 24 Jul 1999, Kevin Day wrote:
> >>
> >>> For one, do another 'ps' with the 'l' option, so you can see what it's stuck
> >>> on.
> >>
> >>   UID   PID  PPID CPU PRI NI  VSZ RSS WCHAN  STAT  TT       TIME COMMAND
> >>  1000  1103  1086  29  75 20 5740 384 -      TWN   ??    0:00.00 (kvt)
> >>  1000  1109  1103   0   4  0 1504   0 ttywri IWs+  p1    0:00.00 (tcsh)
> >>
> >>  1000 92724  1086 279 105 20 5736 356 -      RN    ??  139:40.13 kvt -T Termi
> >>  1000 92743 92724   2  18  0 1576   0 pause  IWs   p8    0:00.00 (tcsh)
> >>
> > Well, since the CPU time in the active process (92724) went up since your
> > last e-mail, and it's in the RUN state (a - in the WCHAN and a R in the
> > STAT), it looks like the process is just spinning, eating CPU.
>
> Right.
>
> > The tcsh listed below that is a zombie of the running kvt.
>
> There aren't any zombies here.
>
> It's a child of the kvt. It's not a zombie. Take a look at the STAT
> field (and ps(1)): process

Good point, I didn't notice that. I saw the ()'s from his first message.

> Process 92724 is runnable, nice and running (no WCHAN). I really
> don't understand why you can't stop this one.

The only time I've seen this is when my console is getting flooded with
'vm_fault: pager error' messages for that process. Otherwise, there's no
reason why a running process can't be killed, correct?

Kevin
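For anyone following the STAT discussion, here is a minimal sketch of decoding that column. The letter meanings are my reading of BSD ps(1) and should be checked against the man page on your release; `decode_stat`, `is_zombie` and both tables are illustrative names, not part of any real tool.

```python
# Illustrative decoder for the STAT column of BSD `ps`.
# ASSUMPTION: the letter meanings below follow my reading of BSD ps(1);
# verify them against the man page on the system in question.

RUN_STATES = {
    "D": "uninterruptible (disk) wait",
    "I": "idle (sleeping longer than about 20 seconds)",
    "R": "runnable",
    "S": "sleeping (less than about 20 seconds)",
    "T": "stopped",
    "Z": "zombie",
}

FLAGS = {
    "+": "in the foreground process group of its terminal",
    "<": "raised scheduling priority",
    "N": "reduced scheduling priority (niced)",
    "W": "swapped out",
    "s": "session leader",
}


def decode_stat(stat):
    """Return (run_state, [flag descriptions]) for a STAT string such as 'TWN'."""
    state = RUN_STATES.get(stat[0], "unknown state " + repr(stat[0]))
    flags = [FLAGS.get(ch, "unknown flag " + repr(ch)) for ch in stat[1:]]
    return state, flags


def is_zombie(stat):
    """Only a leading 'Z' marks a zombie; parentheses around COMMAND do not."""
    return stat.startswith("Z")


if __name__ == "__main__":
    # The four STAT values from the ps listing quoted in this thread.
    for stat in ("TWN", "IWs+", "RN", "IWs"):
        print(stat, decode_stat(stat), "zombie:", is_zombie(stat))
```

Run against the four STAT values in the listing, none of them decodes as a zombie, which matches the correction quoted above.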
Re: mountpoint locking with fbsd-nfs
> Well, theoretically there is nothing wrong going on since you can mount
> things on top of an NFS directory. Mount only complains about
> duplicate normal partition mounts because it can't open the buffered
> block device the second time. NFS doesn't care how many times a
> directory is imported or exported.
>
> -Matt
> Matthew Dillon
> <[EMAIL PROTECTED]>
>

Are you sure you can export a directory multiple times? I can't even export
two directories under the same filesystem.

su-2.03# mount
/dev/wd0s1a on / (NFS exported, local, noatime, soft-updates, writes: sync 3945 async 1317317)
procfs on /proc (local)
su-2.03# cat /etc/exports
/var home
/var/tmp home
su-2.03# mountd
Aug 1 22:43:01 celery mountd[46042]: can't change attributes for /var/tmp
Aug 1 22:43:01 celery mountd[46042]: bad exports list line /var/tmp home

It actually exported /, which may not have been what I wanted. :)

Or did I misunderstand you?

Kevin
Re: mountpoint locking with fbsd-nfs
>
> To export a single filesystem multiple times, *all* of the attributes must
> be the same. If they aren't the only person you are fooling is yourself,
> since once a filesystem is NFS exported, it is open to the world.
>
> anyway the syntax for what you want is:
>
> /var /var/mail some.machine
>

Ahh.. That was a bad example I gave anyway...

I wanted to have, say, /a exported to a few machines, and /b exported to
only one machine... Couldn't do it, which was kinda annoying. :)

Kevin
Re: mountpoint locking with fbsd-nfs
>
> You misunderstood me. The problem you have is the fact that NFS exports
> are usually limited to the physical mount point of the filesystem being
> exported. Thus it thinks that /var above is the same as /, or that
> /var/tmp is the same as /var if both happen to be in the same partition.
> Mount gets confused by that when you specify what it believes to be the
> same partition several times in the exports list.
>
> You can use the '-alldirs' flag in the exports list to export a partition
> and allow any subdirectory within that partition to be mounted instead of
> the partition itself. There may be a way to export several specific
> subdirectories in the same partition but I'm not sure.
>
> I was talking about things like:
>
> mount apollo:/usr m1
> mount apollo:/usr m2
> mount apollo:/usr m3
> mount apollo:/usr m3
> mount apollo:/usr m3
>
> I can import a filesystem as many times as I want, and even overlay mount
> points.
>

Yeah, I know about -alldirs... The problem was that we had customers who
wanted us to export their home directories, and unless I gave them their own
filesystem, I couldn't restrict it in the manner I wanted. :)

Just checking to see that I wasn't missing a way to do this. :)

Kevin
Re: mountpoint locking with fbsd-nfs
>
> :Yeah, I know about -alldirs... The problem was that we had customers who
> :wanted us to export their home directories, and unless I gave them their own
> :filesystem, I couldn't restrict it in the manner i wanted. :)
> :
> :Just checking to see that I wasn't missing a way to do this. :)
> :
> :Kevin
>
> I've never in my life tried this - it probably won't work, but ...
> use the null device maybe to create a mount point for each home
> dir and then export that?
>

I think it sees through this.

su-2.03# cat /etc/exports
/var home
/mnt home
su-2.03# mount
/dev/wd0s1a on / (NFS exported, local, noatime, soft-updates, writes: sync 3970 async 1321097)
procfs on /proc (local)
nfs:/home on /usr/home (noatime)
nfs:/var/mail on /var/mail (noatime)
/var/tmp on /mnt (local)
su-2.03# mountd
Aug 1 23:17:48 celery mountd[89177]: can't change attributes for /mnt

That was a very good idea though, I'd never have thought of it. :) I'll
have to play with this more. :)

Kevin
Re: "The Matrix" screensaver, v.0.2
At 10:26 AM 8/21/99 +0100, Nik Clayton wrote:
>On Fri, Aug 20, 1999 at 07:34:31PM +0200, Andrzej Bialecki wrote:
> > Both versions are available at:
> >
> > http://www.freebsd.org/~abial/matrix_3.2.tgz
> > http://www.freebsd.org/~abial/matrix_4.0.tgz
>
>FWIW, there are at least two other 'matrix' implementations out there.
>One is part of xscreensaver, and is quite nice -- it's even better if you
>halve the size of the image it's using first. This has the advantage that
>the characters actually look like the ones in the film (reversed numbers
>and Japanese katana (sp?) characters). That one's (obviously) X only.
>
>The other is 'cmatrix'. A web search should turn it up. As the name
>implies, this is a console version.

For those of you using Windows or MacOS:

http://www.whatisthematrix.com/cmp/screensaver_index.html

That's the 'official' screen saver. (The Windows version uses some kind of
runtime ShockWave and eats nearly 100% CPU, but it looks authentic.)

Kevin
Re: "The Matrix" screensaver, v.0.2
> > Anyway, this module was meant more as a joke, but if you guys like it so
> > much you could vote for putting it in the tree...
>
>What do you mean "vote"? I was waiting for it to show up on my tree
>after a cvsup!

I hate to keep bringing things like this up, or start a legal war, but this
screensaver is more than likely a copyright and/or trademark violation, and
bringing it into the source tree may not be a good idea. Yes, lots of
people may be making things like this, but it would probably be best to
distance FreeBSD itself from such a thing.

Kevin

(speaking as an employee of a company whose products are frequently
infringed on, and who has been through this exact situation before, except
from the other side)
Re: "The Matrix" screensaver, v.0.2
>On Sun, 22 Aug 1999, Andrzej Bialecki wrote:
>
> > On Sat, 21 Aug 1999, Kevin Day wrote:
> >
> > > I hate to keep bringing things like this up, or start a legal war, but this
> > > screensaver is more than likely a copyright and/or trademark violation, and
> > > bringing it into the source tree may not be a good idea. Yes, lots of
> > > people may be making things like this, but it would probably be best to
> > > distance FreeBSD itself from such a thing.
> >
> > You can trademark the title "The Matrix", but you can't trademark a common
> > word "matrix". That's the only word I use for the name of the module. As
> > Daniel mentioned, they even can't claim that it's their idea.
> >
> > So I think I can pretty safely import it.
>
>If we wanted to be legally paranoid we would call it the "letter" saver
>and add as the comment the words

It's not just the name "Matrix" though. Make a screen saver of the Superman
'S' logo, and see how quickly a certain comic book company comes after you.
:)

Making a derivative work based on something that was in a movie probably is
a copyright violation. Warner Brothers could easily say that you've copied
an element from their movie (even if it's not the entire movie), and even go
so far as to get a judge to order any CD-ROM distributors of FreeBSD to
recall all unsold CDs and destroy them.

As for the trademark issue, it doesn't have to be a name to be trademarked.
Logos, effects, and even sounds can be trademarked.

I'm really not trying to be annoying about things like this, but I already
had to fight for the ability to use FreeBSD at work, after they discovered
other copyright/trademark violations in the source tree (Trek, etc). "If
they'll steal things here, how do we know the entire kernel isn't stolen
from somewhere else?" Yes, it's silly logic, but they do sort of have a
point. We're selling a product with FreeBSD embedded in it. Should some
copyright/patent holder come up proving that the VM system is his, and that
FreeBSD stole it from him, they could legally force us to recall every
machine we've sold, and replace it with non-infringing materials.

Obviously we're not shipping 'trek' on our system, and wouldn't include the
matrix saver anyway, but I (for completely selfish reasons) would like to
keep FreeBSD distanced from anything that could possibly be infringing on
anything, and let you download it from somewhere else if you want it. :)

Kevin
Re: "The Matrix" screensaver, v.0.2
At 06:44 PM 8/22/99 +0200, Andrzej Bialecki wrote:
>On Sun, 22 Aug 1999, Kevin Day wrote:
>
>[trademark violation warning]
>
>Ok, maybe you have a point - I don't know, I'm not a lawyer. But with this
>line of reasoning they could claim that anything using falling letters
>effect on your screen partly violates their trademarked special effect,
>which is silly.
>
>What can we do, then? Why don't we ask them politely if it's ok?
>
>Andrzej Bialecki

While I can't speak for how Warner Brothers' lawyers are, as a general
rule.. "It's much easier to say No, than it is to say Yes, and regret it
later." If you ask, you'll probably get a No.

It's not that you made a falling letter effect, it's that you made a falling
letter effect to copy the effect in the movie, and it does look very much
like what's in the movie. That could be called willful infringement.

While I doubt they'd stop a fan from making something like this (I don't
know this for a fact though, see what Paramount did with Trek sites before,
or Mattel with Barbie), they may be more inclined to take action against a
product being sold that contains it (FreeBSD being sold by Walnut Creek and
others). If this is distributed as a fan-based thing, the worst they'd
likely do is say "Take it down.". If this is on thousands of FreeBSD CDs, it
could become a financial problem if they want to take it far enough.

This is just my opinion though, and not to be used as legal advice for
anyone. I just don't want FreeBSD to become a ball of intellectual property
infringements. :)

Kevin
Re: 2 hours to compile mysql?
> > Is amount of ram available (portably) to configure?
> > So configure could decide to use --low-memory by itself? Allowing
> > overrides, naturally.
> >
> > Leif
> >
>
> There is actually a method to portably guess how much RAM you have available
> from configure -- just write a small C program that will keep malloc()-ing until
> it gets an error, but I do not think it is worth the effort.
>
> --
> Sasha Pachev

How much RAM you have and how much RAM+swap you can allocate before you hit
your limit are quite different things. :)

# sysctl hw.physmem
hw.physmem: 400883712

This will return the amount of RAM you have minus your kernel size, though.
Perhaps helpful if you really want to do this. :)
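To make the distinction concrete, here is a minimal sketch that parses the sysctl output shown above and, separately, looks at the per-process data-size limit, which is one of the things a malloc()-until-failure probe would actually hit. The function names are illustrative and the sample line is copied from the message; nothing here is what configure actually does.

```python
# Sketch: physical memory (from `sysctl hw.physmem` output) versus the
# per-process data limit. Helper names are hypothetical, for illustration.
import resource


def parse_physmem(sysctl_line):
    """Extract the byte count from a line like 'hw.physmem: 400883712'."""
    name, _, value = sysctl_line.partition(":")
    if name.strip() != "hw.physmem":
        raise ValueError("unexpected sysctl name: " + name.strip())
    return int(value.strip())


def to_mib(nbytes):
    """Convert bytes to mebibytes."""
    return nbytes / (1024 * 1024)


def data_limit():
    """Return the (soft, hard) data-segment limits for this process."""
    return resource.getrlimit(resource.RLIMIT_DATA)


if __name__ == "__main__":
    physmem = parse_physmem("hw.physmem: 400883712")  # line from the message
    print("physical memory: %.1f MiB" % to_mib(physmem))
    soft, hard = data_limit()
    if soft == resource.RLIM_INFINITY:
        print("data limit: unlimited")
    else:
        print("data limit: %.1f MiB" % to_mib(soft))
```

The two numbers answer different questions, so a configure script that wants "RAM" is probably better served by the first one.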
Re: ascii art in hosts.allow
>
> On Tue, Jan 25, 2000 at 03:03:32PM +1100, Andy Farkas wrote:
> > Here is a patch: (please notice the spelling correction)
>
> Where? I just ran ispell on src/etc/hosts.allow and it didn't catch
> anything.

A more direct patch would have been:

-# NOTE: The hosts.deny file is not longer used. Instead, put both 'allow'
+# NOTE: The hosts.deny file is no longer used. Instead, put both 'allow'

:)

Kevin
Re: XFree86 3.9.18
>
> That's odd... I just built it tonight, and I haven't had anything but this:
>
> XFree86-Bigfont extension: shmat() failed, size = 4096, errno = 24
> XFree86-Bigfont extension: shmat() failed, size = 4096, errno = 24
> XFree86-Bigfont extension: shmat() failed, size = 4096, errno = 24
> XFree86-Bigfont extension: shmat() failed, size = 4096, errno = 24
> XFree86-Bigfont extension: shmat() failed, size = 4096, errno = 24
> XFree86-Bigfont extension: shmat() failed, size = 4096, errno = 24
> XFree86-Bigfont extension: shmat() failed, size = 4096, errno = 24
>
> Which doesn't cause the server to crash, in fact, it seems pretty stable.
>

    24 EMFILE Too many open files. Getdtablesize(2) will obtain the
    current limit.

You can probably bump up your ulimit, and make this go away, too.

Kevin
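A small sketch of the lookup done above: turning a numeric errno into its symbolic name and checking the current descriptor limits. This assumes a FreeBSD or Linux errno table, where 24 is EMFILE. As an aside that is not from the original message: for shmat() specifically, POSIX describes EMFILE as too many attached shared memory segments, so raising the open-file limit may not be what clears this particular message.

```python
# Sketch: map errno 24 to its name and show the per-process fd limits.
# ASSUMPTION: errno numbering as on FreeBSD/Linux, where 24 == EMFILE.
import errno
import os
import resource


def describe_errno(code):
    """Return (symbolic name, message text) for a numeric errno value."""
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


def fd_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    name, text = describe_errno(24)
    print("errno 24 is %s: %s" % (name, text))
    soft, hard = fd_limits()
    print("open-file limit: soft=%s hard=%s" % (soft, hard))
```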
lo0 tcp connections in TIME_WAIT/LAST_ACK/FIN_WAIT?
After upgrading from 3.4 to RC2, I'm noticing something that I never saw
before:

Active Internet connections (including servers)
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
tcp        0      0  127.0.0.1.4954         127.0.0.1.4242         SYN_SENT
tcp        0      0  127.0.0.1.4953         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4952         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4951         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4950         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4949         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4948         127.0.0.1.4242         LAST_ACK
tcp        0      0  127.0.0.1.4947         127.0.0.1.4242         CLOSE_WAIT
tcp        0      0  127.0.0.1.4945         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4944         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4942         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4940         127.0.0.1.4242         FIN_WAIT_1
tcp        0      0  127.0.0.1.4938         127.0.0.1.4242         FIN_WAIT_1
tcp        0      0  127.0.0.1.4937         127.0.0.1.4242         TIME_WAIT
tcp        0      0  127.0.0.1.4936         127.0.0.1.4242         TIME_WAIT

Are tcp connections going through lo0 ever supposed to end up like this? I
thought everything that went through lo0 was supposed to be.. well..
instant and mostly lossless. Any ideas?

Kevin
Re: current.freebsd.org (FTP)
> > > Forrest Aldrich wrote:
> > >
> > > > Is not allowing anonymous ftp logins.
> >
> > I noticed this too... Maybe there's too many users, and is refusing
> > connections? Hmmm...
>
> ??? a little more information would be very welcome.
>
> jmb

Here's what I see:

# ftp current.freebsd.org
Connected to usw2.freebsd.org.
220 usw2.freebsd.org FTP server (Version wu-2.6.0(1) Tue Jan 25 00:05:38 CST 2000) ready.
Name (current.freebsd.org:toasty): ftp
331 Guest login ok, send your complete e-mail address as password.
Password:
530 Login incorrect.
ftp: Login failed.

Kevin
Re: lo0 tcp connections in TIME_WAIT/LAST_ACK/FIN_WAIT?
>
> > After upgrading from 3.4 to RC2, i'm noticing something that I never saw
> > before:
> >
> > Active Internet connections (including servers)
> > Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
> > tcp        0      0  127.0.0.1.4954         127.0.0.1.4242         SYN_SENT
> > tcp        0      0  127.0.0.1.4953         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4952         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4951         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4950         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4949         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4948         127.0.0.1.4242         LAST_ACK
> > tcp        0      0  127.0.0.1.4947         127.0.0.1.4242         CLOSE_WAIT
> > tcp        0      0  127.0.0.1.4945         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4944         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4942         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4940         127.0.0.1.4242         FIN_WAIT_1
> > tcp        0      0  127.0.0.1.4938         127.0.0.1.4242         FIN_WAIT_1
> > tcp        0      0  127.0.0.1.4937         127.0.0.1.4242         TIME_WAIT
> > tcp        0      0  127.0.0.1.4936         127.0.0.1.4242         TIME_WAIT
> >
> >
> > Are tcp connections going through lo0 ever supposed to end up like this? I
> > thought everything that went through lo0 was supposed to be.. well..
> > instant and mostly lossless. Any ideas?
> >
> > Kevin
>
> Hi,
> does that happen for any apps?
> Could you please give me info about what is the apps which use
> the port 4242?
>
> Thanks,
> Yoshinobu Inoue
>

Right now, it only seems to be happening for bbd (part of
/usr/ports/net/bb), when local connections are talking to bbd. (I moved bbd
to port 4242; its default is port 1984.)

Doing an ifconfig lo0 down ; ifconfig lo0 up seems to have cleared them,
too.

Kevin
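When watching for this kind of pile-up, a per-state tally is easier to compare over time than the raw listing. Below is a minimal sketch that counts TCP states from saved netstat text; `count_tcp_states` is an illustrative helper and the sample lines are taken from the listing above. It assumes the state is the last column and the foreign address the one before it.

```python
# Sketch: tally TCP connection states from `netstat`-style text.
# ASSUMPTION: state is the final column, foreign address the second to last.
from collections import Counter


def count_tcp_states(netstat_text, port=None):
    """Count states of tcp lines, optionally only for a given foreign port."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].startswith("tcp"):
            continue  # skip headers, blank lines and non-TCP sockets
        state, foreign = fields[-1], fields[-2]
        if port is not None and not foreign.endswith("." + str(port)):
            continue
        counts[state] += 1
    return counts


if __name__ == "__main__":
    sample = """\
tcp 0 0 127.0.0.1.4954 127.0.0.1.4242 SYN_SENT
tcp 0 0 127.0.0.1.4953 127.0.0.1.4242 TIME_WAIT
tcp 0 0 127.0.0.1.4948 127.0.0.1.4242 LAST_ACK
tcp 0 0 127.0.0.1.4940 127.0.0.1.4242 FIN_WAIT_1
"""
    for state, n in sorted(count_tcp_states(sample, port=4242).items()):
        print(state, n)
```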
Re: Recent kernel hangs during boot with pnp sio.
>
> > Afaik all 3C509B's are PnP. At least here in the UK there is not
> > shortage of those cards.
>
> If I can get a definitive statement to this effect then I'll grab a
> 3c509B. There was some question as to them actually being PnP though.
>

Yes, the 3c509B can have PnP turned on or off through a DOS utility. Either
you set an IO/IRQ setting, or you set it to PnP and let the system do it.

(I believe they come with PnP enabled now, but before the default was off)

Kevin
Re: make install trick
>
> On Wed, Oct 06, 1999 at 02:57:23PM +1000, a little birdie told me
> that Peter Jeremy remarked
> >
> > I guess we disagree on this. My feeling is that write activity on
> > root should be minimised to minimise the risk that root will be
> > inconsistent following a crash.
>
> Indeed.
> Thus:
> /dev/da0s1a on / (local, synchronous, writes: sync 32 async 15100)
>                                                       ^^^
>
> Though I'm still waiting for an explanation of WHY exactly I have async
> writes on a sync partition. Nobody yet has said anything but 'that's
> interesting...'. A direction to look would be helpful.
>

My understanding was that that was just an indication of writes that could
be done asynchronously without any risk, so they were done async. (sync
isn't purely sync, only synchronous when it's required for integrity)

Kevin
Re: vga driver and signal
>
> Kind of complex though. Also the interrupt latency problem is still there.
>
> Not sure that this is as elegant as what you are suggesting, can
> the kernel schedule a user level routine to be executed when an interrupt
> occurs? I guess on Windoze land this is called a driver call-back.
>
>

In a project I'm working on now (that some of you saw at FreeBSDCon) I had a
need to sync a lot of things to a vsync interrupt. I ended up writing a
small driver to attach to the video card. My program would do a blocking
read on the device, which would put that process to sleep. The interrupt
handler would shove one byte of data back to the process through the read
(indicating interrupt status) and wake up the process.

This works, but still has a problem with latency, and with missed interrupts
if you aren't reading when the interrupt happens. (I've worked around those
too, but that's quite a bit more involved to fix.) You'll probably end up
needing to change the scheduler slightly, or play with rtprio.

Kevin
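The user-space half of that scheme can be sketched as below. This is only an illustration of the blocking-read pattern: the driver itself is not shown, no device path is assumed, and a pipe stands in for the device so the sketch can run anywhere. `wait_for_events` is a hypothetical name.

```python
# Sketch: block in read() until a "driver" hands back one status byte per
# interrupt. ASSUMPTION: a pipe stands in for the real device node here.
import os


def wait_for_events(fd, count):
    """Block until `count` status bytes have arrived on fd; return them."""
    statuses = []
    while len(statuses) < count:
        data = os.read(fd, 1)   # sleeps until a byte is available
        if not data:            # EOF: the writing side went away
            break
        statuses.append(data[0])
    return statuses


if __name__ == "__main__":
    reader, writer = os.pipe()
    # Pretend three interrupts fired, each reporting a status byte.
    os.write(writer, bytes([1, 1, 3]))
    os.close(writer)
    print(wait_for_events(reader, 3))
    os.close(reader)
```

As the message notes, an event that fires while the process is not sitting in read() is only seen if the driver queues it, which is where the missed-interrupt problem comes from.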
Re: Intel 810?
>
> > I recently got a quote from a hardware vendor which made the following
> > claim:
> >
> > > All Socket 370PGA Motherboards use either the 810 or [the] 810c chip
> > > set which does not support FreeBSD because 16MB of the motherboard
> > > memory is used for the display controller. There is no way to tell
> > > the FreeBSD kernel not to use this memory so it will corrupt data.
> >
> > I find this statement rather dubious. Can anyone out there say with
> > more certainty?
>
> I can say with certainty that there are S370PGA boards that don't use the
> 810; we have a number inhouse here that use the 440BX for example.
>
> I'd be quite surprised if the 16MB shared video aperture wasn't correctly
> described by the PnP data; this may require 4.x or 3.x with VM86 defined
> to deal with it "right". If nobody else has any commentary on this,
> we'll get one into the lab.
>

I'm using a Socket 370 board with a 440ZX and one with a 440BX, they both
work fine... I've got an 810c and an 810e that I'll try later today.

Kevin
Re: Intel 810?
>
> < said:
>
> > As others have stated, Socket370 boards arent all 810/810c...my 4.0-Current
>
> The important issue to me is: will FreeBSD work on an 810 motherboard?
> The reason I care is because I need the form-factor (a 1U-high
> server); if I am to use some alternate motherboard, I'll need to be
> certain in advance that it will fit in a MicroATX opening.
>

Have you considered NLX or LPX form factors? I can dig up the specs if you
want, Intel makes motherboards in both form factors.

Kevin
Re: Serious server-side NFS problem
>
> In message <[EMAIL PROTECTED]>, Matthew Dillon writes:
> >
> >:>NFS uses the kernel 'boottime' structure to generate its version id.
> >:>Now normally you might believe that this structure, once set, will
> >:>never change. The authors of NFS certainly make that assumption!
> >:
> >:Is this another case of "lets assume the time of day is a random number" or
> >:is there any underlying assumption about time in this ?
> >:
> >:--
> >:Poul-Henning Kamp FreeBSD coreteam member
> >:[EMAIL PROTECTED] "Real hackers run -current on their laptop."
> >
> >It basically needs to be unique for each server reboot in order
> >to allow clients to resynchronize.
>
> Ok, then I suggest that you cache a copy of the boottime in the NFS
> code for this purpose.
>

Ack, I was using this very same thing for several devices in an isolated
peer-to-peer network to decide who the 'master' was. (Whoever had been up
longest knew more about the state of the network) Having this change could
cause weirdness for me too... I assumed (without checking *thwap*) that
boottime was a constant.

Perhaps a 'real_boottime' or 'unadjusted_boottime' that gets copied after
'boottime' gets initialized so that others can use it, not just NFS? :)

Kevin
Re: Serious server-side NFS problem
>
> > In message <[EMAIL PROTECTED]>, Kevin Day writes:
> >
> > >Ack, I was using this very same thing for several devices in an isolated
> > >peer-to-peer network to decide who the 'master' was. (Whoever had been up
> > >longest knew more about the state of the network) Having this change could
> > >cause weirdness for me too... I assumed (without checking *thwap*) that
> > >boottime was a constant.
> > >
> > >Perhaps a 'real_boottime' or 'unadjusted_boottime' that gets copied after
> > >'boottime' gets initialized so that others can use it, not just NFS? :)
> >
> > no, I think that is a bad idea. In your case you want to use the
> > "uptime" which *is* a measure of how long the system has been
> > running.
>
> Uptime is also a constantly changing number. Forgive me for my
> ignorance, but why does boottime constantly change? I would have thought
> it would be a constant? I've got software that also uses this to
> determine when a new copy of it exists (although I do keep a local cache
> of the value in case my software crashes, since it can recover from a
> crash, but not a reboot).
>
> I would think that boottime would be constant, since you didn't keep
> booting at a different time...
>

Yeah, uptime is moving, which makes it difficult for me too. When new
machines enter the network, they need to announce a number which is used to
decide who will become the master if the current master disappears. I could
just announce currenttime-uptime, but that's got a slightly different
meaning that I'll have to consider.

Anyway, enough of my proprietary mess, but... I do see a few uses for a
non-moving boottime, but won't argue here or now. :) This behaviour is
documented in time(9) though, so I really can't complain. :)

Kevin
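A minimal sketch of the "announce currenttime-uptime" idea follows, assuming each node computes the value once at startup and then keeps announcing that cached number, so later clock adjustments do not move it. The actual protocol is not described in the thread; the node names and both function names below are hypothetical.

```python
# Sketch: elect the longest-running node from announced boot estimates.
# ASSUMPTION: each node computes now - uptime once and caches the result.


def estimated_boot_time(now, uptime):
    """Boot-time estimate in seconds since the epoch."""
    return now - uptime


def elect_master(announcements):
    """Return the node with the earliest boot estimate; ties go by name."""
    return min(announcements, key=lambda node: (announcements[node], node))


if __name__ == "__main__":
    now = 954000000.0  # arbitrary example timestamp
    announcements = {
        "node-a": estimated_boot_time(now, 3600.0),    # up one hour
        "node-b": estimated_boot_time(now, 86400.0),   # up one day
        "node-c": estimated_boot_time(now, 600.0),     # up ten minutes
    }
    print(elect_master(announcements))
```

The caveat raised in the message still applies: this value means "when the node thinks it booted", which is not quite the same thing as the kernel's boottime.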
Re: MegaRAID jiggles clock?
>
> I'm wondering if the AMI MegaRAID controller/driver might be the
> reason that I'm getting a large number of clock resets from ntpd.
> About every half hour, ntpd seems to feel the need to reset the clock
> on the server by about 1/3 of a second. The server has a moderate NFS
> load (going out through 12 dc interfaces) and an AMI MegaRAID 1400
> controller with 8 disks in a RAID-5 config.
>
> I have other servers with 12 dc ports, and haven't seen any
> particularly bad time performance from them, which is why I'm
> suspicious of the megaraid. This machine is also using a motherboard
> common to many of our other machines. None of our other servers (we
> have a "ring" of 5 time servers to which all our internal hosts
> connect) or clients appear to have any issues.
>
> I have considered setting the option on ntpd to only adjust time by
> adjusting the frequency ... to see if this is just a bogon clock chip
> or somesuch.
>
> ideas?
>
> Dave.
>

Granted, this is an old 4.0-current machine (from around September), but
I've seen heavy NFS server load affect the clocks on all three of my NFS
servers. The heavier the load, the faster the clock seems to run.

Mar 25 10:00:01 nfs ntpdate[75363]: adjust time server 192.160.127.90 offset -0.028636
Mar 25 11:00:02 nfs ntpdate[75406]: adjust time server 192.160.127.90 offset -0.033046
Mar 25 12:00:01 nfs ntpdate[75448]: adjust time server 192.160.127.90 offset -0.031371
Mar 25 13:00:01 nfs ntpdate[75490]: adjust time server 192.160.127.90 offset -0.030030
Mar 25 14:00:01 nfs ntpdate[75532]: adjust time server 192.160.127.90 offset -0.031346
Mar 25 15:00:00 nfs ntpdate[75573]: adjust time server 192.160.127.90 offset -0.030992
Mar 25 16:00:00 nfs ntpdate[75616]: adjust time server 192.160.127.90 offset -0.031654
Mar 25 17:00:00 nfs ntpdate[75657]: adjust time server 192.160.127.90 offset -0.031354

Because my NFS load isn't consistent throughout the day, xntpd seems to
really freak out about trying to keep it balanced.

Relevant info:

Timecounter "i8254" frequency 1193182 Hz
Timecounter "TSC" frequency 337194185 Hz
CPU: AMD-K6(tm) 3D processor (337.19-MHz 586-class CPU)
  Origin = "AuthenticAMD"  Id = 0x580  Stepping = 0
  Features=0x8001bf
  AMD Features=0x8800
real memory = 134217728 (131072K bytes)
avail memory = 126246912 (123288K bytes)
vinum: loaded
npx0: on motherboard
npx0: INT 16 interface
pcib0: on motherboard
pci0: on pcib0
pcib1: at device 1.0 on pci0
pci1: on pcib1
isab0: at device 7.0 on pci0
isa0: on isab0
fxp0: irq 11 at device 14.0 on pci0
fxp0: Ethernet address 00:90:27:34:c1:ec
ata-pci0: irq 0 at device 15.0 on pci0
ata-pci0: Busmastering DMA supported
ata0 at 0x01f0 irq 14 on ata-pci0
ata1 at 0x0170 irq 15 on ata-pci0
vga-pci0: at device 16.0 on pci0
pn0: <82c169 PNIC 10/100BaseTX> irq 10 at device 18.0 on pci0
pn0: Ethernet address: 00:a0:cc:3e:c6:3d
pn0: autoneg complete, link status good (full-duplex, 100Mbps)
ahc0: irq 9 at device 20.0 on pci0
ahc0: aic7860 Single Channel A, SCSI Id=7, 3/255 SCBs
ata0: master: setting up UDMA2 mode on Aladdin chip OK

-- Kevin
Load average calculation?
I'm not sure if this is -current fodder or not, but since it's still
happening in -current, I'll ask.

We recently upgraded a server from 2.2.8 to 4.0 (the same behavior is shown
on 5.0-current, too). Before, with the exact same load, we'd see load
averages between 0.20 and 0.30. Now, we're getting:

load averages: 4.16, 4.23, 4.66

Top shows the same CPU percentages, just a much higher load average for the
same work being done. Did the load average calculation change, or something
with the scheduler differ? Customers are complaining that the load average
is too high, which is kinda silly, since 4.0 seems noticeably faster in some
cases.

Any ideas?

Kevin
Re: Load average calculation?
> :We recently upgraded a server from 2.2.8 to 4.0 (the same behavior is shown
> :on 5.0-current, too). Before, with the exact same load, we'd see load
> :averages from between 0.20 and 0.30. Now, we're getting:
> :
> :load averages: 4.16, 4.23, 4.66
> :
> :Top shows the same CPU percentages, just a much higher load average for the
> :same work being done. Did the load average calculation change, or something
> :with the scheduler differ? Customers are complaining that the load average
> :is too high, which is kinda silly, since 4.0 seems noticeably faster in some
> :cases.
> :
> :Any ideas?
> :
> :Kevin
>
> I believe the load average was changed quite a while ago to reflect not
> only runnable processes but also processes stuck in disk-wait. It's
> a more accurate measure of load.

Ahh, and since nearly everything is done on this system via NFS, I can imagine that several things are waiting for NFS responses.

It's probably more accurate, but from a PR standpoint it makes it "look" like FreeBSD is choking under the load, when it really isn't. Or am I the only one that even cares about this? :)

Kevin
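To see why counting disk-wait processes moves the number so much on an NFS-heavy box, here is a toy model of a load average. It is a sketch only, not the kernel's code: it assumes a 5-second sampling interval and a simple exponentially damped average, and the workload (one runnable process in every fourth sample, four processes permanently blocked on NFS) is hypothetical.

```python
import math

SAMPLE_S = 5    # assumption: the process count is sampled every 5 seconds
WINDOW_S = 60   # time constant of the 1-minute average


def load_average(samples, sample_s=SAMPLE_S, window_s=WINDOW_S):
    """Exponentially damped average of per-sample process counts."""
    decay = math.exp(-sample_s / window_s)
    avg = 0.0
    for n in samples:
        avg = avg * decay + n * (1.0 - decay)
    return avg


# Hypothetical workload over one hour (720 samples).
runnable = [1 if i % 4 == 0 else 0 for i in range(720)]
waiting = [4] * 720  # processes stuck waiting on NFS or disk

old_style = load_average(runnable)
new_style = load_average([r + w for r, w in zip(runnable, waiting)])
print(f"run queue only:        {old_style:.2f}")
print(f"run queue + disk-wait: {new_style:.2f}")
```

The CPU work is identical in both cases; the second figure is higher only because the blocked processes are now counted.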
Re: Load average calculation?
> > > I believe the load average was changed quite a while ago to reflect not
> > > only runnable processes but also processes stuck in disk-wait. It's
> > > a more accurate measure of load.
> >
> > Ahh, and since nearly everything is done on this system via NFS, I can
> > imagine that several things are waiting for NFS responses.
> >
> > It's probably more accurate, but from a PR standpoint it makes it "look"
> > like FreeBSD is choking under the load, when it really isn't. Or am I the
> > only one that even cares about this? :)
>
> What does the man page for 'w' say about it? At least the change should be
> reflected there I guess.

getloadavg(3) (which 'w' and 'uptime' use) says:

    The getloadavg() function returns the number of processes in the system run queue averaged over various periods of time.

The 'w' and 'uptime' manpages really don't mention anything relevant.

Kevin
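For anyone who wants to read the same three numbers programmatically, Python's standard library exposes the call as `os.getloadavg()`. A minimal sketch, assuming a Unix-like system; the helper name `show_load` is made up for this example.

```python
import os


def show_load():
    """Return the 1-, 5- and 15-minute load averages, or None if unavailable."""
    try:
        return os.getloadavg()  # thin wrapper around getloadavg(3)
    except (OSError, AttributeError):
        # OSError: the value could not be obtained.
        # AttributeError: the platform does not provide the call at all.
        return None


loads = show_load()
if loads is None:
    print("load average not available on this platform")
else:
    print("load averages: %.2f, %.2f, %.2f" % loads)
```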
Re: Load average calculation?
> > It's probably more accurate, but from a PR standpoint it makes it "look"
> > like FreeBSD is choking under the load, when it really isn't.
>
> Actually, you have it backwards -- it makes it look as if FreeBSD is
> *not* choking under what appears to be a very heavy load
>
> -GAWollman

Well, my first impression was "Well, before doing this task the load average was only 0.20, now it's 4.0, obviously it can't keep up now."

Which could probably be extended to "Under Linux the load average for running my database is only 0.20, FreeBSD's is 4.0, Linux must be faster."

Granted it's flawed logic, but it's all a matter of perception at times.

Kevin
Panic with userquota(softupdates?)
I keep getting panics in dqget (ufs_quota.c), with a -current from a couple of days ago. I think this might be softupdates related, since I can't make it happen with softupdates turned off, although it's quite possible that it has nothing to do with it.

Does anyone have any idea what might be causing this? Any other information that might be useful here?

-- Kevin

SMP 2 cpus
IdlePTD 3813376
initial pcb at 3178c0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
mp_lock = 0002; cpuid = 0; lapic.id =
fault virtual address   = 0x0
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc023d3d2
stack pointer           = 0x10:0xd9176d28
frame pointer           = 0x10:0xd9176d78
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 3384 (smbd)
interrupt mask          = none <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 0002; cpuid = 0; lapic.id =
boot() called on cpu#0

syncing disks... done
Uptime: 5h9m37s

dumping to dev #da/0x20001, offset 1735462
dump 511 510 509 ... 2 1 0
---
#0  boot (howto=256) at ../../kern/kern_shutdown.c:303
303             dumppcb.pcb_cr3 = rcr3();
(kgdb) bt
#0  boot (howto=256) at ../../kern/kern_shutdown.c:303
#1  0xc016a2d9 in panic (fmt=0xc02c532f "page fault") at ../../kern/kern_shutdown.c:553
#2  0xc0282190 in trap_fatal (frame=0xd9176ce8, eva=0) at ../../i386/i386/trap.c:927
#3  0xc0281e01 in trap_pfault (frame=0xd9176ce8, usermode=0, eva=0) at ../../i386/i386/trap.c:820
#4  0xc028196b in trap (frame={tf_fs = 24, tf_es = 16, tf_ds = -652804080, tf_edi = -653726848, tf_esi = -1041411164, tf_ebp = -652776072, tf_isp = -652776172, tf_ebx = -1038103936, tf_edx = 0, tf_ecx = -1012515072, tf_eax = 0, tf_trapno = 12, tf_err = 2, tf_eip = -1071393838, tf_cs = 8, tf_eflags = 66118, tf_esp = -1012515072, tf_ss = -633865984}) at ../../i386/i386/trap.c:426
#5  0xc023d3d2 in dqget (vp=0xda37f900, id=65534, ump=0xc1f76200, type=0, dqp=0xc3a63f44) at ../../ufs/ufs/ufs_quota.c:763
#6  0xc023c796 in getinoquota (ip=0xc3a63f00) at ../../ufs/ufs/ufs_quota.c:95
#7  0xc023ddb5 in ufs_access (ap=0xd9176dfc) at ../../ufs/ufs/ufs_vnops.c:324
#8  0xc02408e9 in ufs_vnoperate (ap=0xd9176dfc) at ../../ufs/ufs/ufs_vnops.c:2287
#9  0xc019fc4b in vn_open (ndp=0xd9176ec8, fmode=3, cmode=484) at vnode_if.h:247
#10 0xc019bd8d in open (p=0xd916eac0, uap=0xd9176f80) at ../../kern/vfs_syscalls.c:995
#11 0xc02824c1 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = -1078001617, tf_edi = -1077940192, tf_esi = 144677504, tf_ebp = -1077939168, tf_isp = -652775468, tf_ebx = 484, tf_edx = 2, tf_ecx = 135418240, tf_eax = 5, tf_trapno = 7, tf_err = 2, tf_eip = 673002960, tf_cs = 31, tf_eflags = 663, tf_esp = -1077941548, tf_ss = 47}) at ../../i386/i386/trap.c:1126
#12 0x
Re: mount_nfs/df bug?
> Hello!
>
> Today I wanted to add a new NFS to my /etc/fstab, but forgot to add it
> to /etc/exports on the server.
> However, I did mount -a several times and always got a "Permission
> denied" for the last one.
>
> Now look what I have here:
>
> Filesystem                    1K-blocks     Used    Avail Capacity  Mounted on
> /dev/ad2a                        396895   291904    73240    80%    /
> /dev/ad2e                       5257421  4626154   210674    96%    /usr
> procfs                                4        4        0   100%    /proc
> /dev/ad0s1                      4224828  3755464   469364    89%    /dos
> neutron:/usr/ports               496367   363448    93210    80%    /usr/ports
> neutron:/usr/ports-distfiles    2482878  1191660  1092588    52%    /usr/ports-distfiles
> neutron:/usr/home/ncvs           992439   960684    31755    97%    /usr/home/ncvs
> neutron:/usr/src                 928695   482371   372029    56%    /usr/src
> neutron:/usr/home/mp3           9591515  9298876   292639    97%    /usr/home/mp3
> neutron:/usr/home/brenn          695311   594838    44849    93%    /usr/home/brenn
> neutron:/www/docs                297423   168669   104961    62%    /www
> neutron:/usr/doc                2482878  1191660  1092588    52%    /usr/doc
> neutron:/usr/home/ncvs           992439   960684    31755    97%    /usr/home/ncvs
> neutron:/usr/home/mp3           9591515  9298876   292639    97%    /usr/home/mp3
> neutron:/usr/home/brenn          695311   594838    44849    93%    /usr/home/brenn
> neutron:/usr/home/ncvs           992439   960684    31755    97%    /usr/home/ncvs
> neutron:/usr/home/mp3           9591515  9298876   292639    97%    /usr/home/mp3
> neutron:/usr/home/brenn          695311   594838    44849    93%    /usr/home/brenn
> neutron:/usr/home/ncvs           992439   960684    31755    97%    /usr/home/ncvs
> neutron:/usr/home/mp3           9591515  9298876   292639    97%    /usr/home/mp3
> neutron:/usr/home/brenn          695311   594838    44849    93%    /usr/home/brenn
> neutron:/usr/home/ncvs           992439   960684    31755    97%    /usr/home/ncvs
> neutron:/usr/home/mp3           9591515  9298876   292639    97%    /usr/home/mp3
> neutron:/usr/home/brenn          695311   594838    44849    93%    /usr/home/brenn
>
> Cute, isn't it?
>
> Not yet discovered why.
>
> Alex
> --
> cat: /home/alex/.sig: No such file or directory

This is probably similar to this:

http://www.freebsd.org/cgi/query-pr.cgi?pr=6187

-- Kevin
function name collision on "getcontext" with ports/editors/joe
I'm the maintainer for ports/editors/joe, and just tried compiling it under -CURRENT. joe includes signal.h, which in turn pulls in ucontext.h:

> cc -O -pipe -c umath.c
> In file included from b.h:6,
>                  from bw.h:23,
>                  from umath.c:5:
> rc.h:41: conflicting types for `getcontext'
> /usr/include/sys/ucontext.h:54: previous declaration of `getcontext'
> *** Error code 1
>
> Stop in /usr/ports/editors/joe/work/joe.

I can rename getcontext in joe, but "getcontext" seems like a pretty common function name; I know I've used it in projects before. Not including signal.h isn't really an option either.

I'm not familiar with any of the ucontext.h functions. Do they comply with some kind of standard, so that they can't be renamed or given a prefix?

-- Kevin
Re: Filesystem deadlock
> On Mon, 22 Feb 1999, Alexander N. Kabaev wrote:
>
> > The following script reliably causes FreeBSD 4.0-CURRENT (and 3.1-STABLE
> > as of today) to lock up. Shortly after this script is started, all disk
> > activity stops and any attempt to create a new process causes the system
> > to freeze. While in DDB, the ps command shows that all ten fgrep processes
> > are sleeping on inode, all xargs are in waitpid and all sh processes are
> > in wait.
>
> You forget about all the processes (just a few, actually) stuck in "kmaw"
> (kmem_alloc_wait). This is definitely reproducible :( Should be simple for
> someone more knowledgeable to diagnose, as it looks to be a straight
> vm/vfs(ufs/ffs) interaction.

This is happening to me too, both on a system from the 19th's SNAP and on today's kernel (except I don't see anything in 'kmaw').

The process 'swapper' is stuck in 'inode', as well as anything else that's tried to touch the disk. Lots of 'sh's sitting in 'wait'.

This machine is a heavy NFS client, but I'm not sure that it's related.

Kevin

To Unsubscribe: send mail to majord...@freebsd.org with "unsubscribe freebsd-current" in the body of the message
Re: panic: zone: entry not free
> :> :
> :> :This means that invariants need to add relatively little overhead.
> :> :
> :> :Peter
> :>
> :> which they do.
> :
> :You know, guys, for programmers, wanting immediate panics on stuff like
> :this is great, but there isn't one user in a thousand that wants this.
> :If you make this kinda stuff default on a version *other than* current
> :(current being by definition, for programmers/developers only) then
> :you're going to hear bloody murder, and you guys will be doing vast
> :damage to FreeBSD's reputation.
> :
> :Users don't want panics, and they don't care why, they just want things
>
> No no no... you are missing the whole point.
>
> *IF* we put these kinds of checks in by default, the result is a
> few more panics in the near term, but *NO* panics in the medium and
> long term.
>
> In other words, by putting the checks in now, the kernel gets debugged
> much more quickly --- to the point where a year down the line we no
> longer get kernel panics at all.

Also, try commenting out a panic line in a known bug, and watch how quickly the kernel crashes anyway, in the same situation. Most of the time, the panic is dumping out (some) debugging information before crashing all over itself. Just taking out the diagnostic message is really just making crashes more obscure.

If the error were recoverable, normally the system recovers from it. If it's not, it panics and dies. Take out the panic, and all you've got left is a 'die', which probably will lead people on a wild goose chase as to where that section of memory really got trashed.

Kevin
Re: Games
> On Thu, 1 Apr 1999, Rod Taylor wrote:
> > Just out of curiosity, why are there games included in the FreeBSD
> > source tree?
> >
> > For a group of people that was so worried about including dhcp because
> > it's extra code, don't you think it's time to make those games into
> > ports only?
> >
> > I say this under the assumption that they're not required for FreeBSD to
> > function. (Not like IE for windows ;)
>
> As far as I am concerned, things like fortune, pom, pig and banner
> have been included with BSD-ish systems for ages... Tradition...
> I wouldn't feel the same if I didn't get my fortune every login.
>
> also, don't you have the option to not have the games?

It's not just a matter of turning them off though. A few of the games in the distro are trademark infringements. While the product I'm developing that uses FreeBSD doesn't have the games installed, it brought up the comment from our lawyers "What else are they infringing on that we *are* using?"

(see trek, mille, boggle, tetris, wargames)

Kevin
Re: Games
> :It's not just a matter of turning them off though. A few of the games in the
> :distro are trademark infringements. While the product I'm developing that
> :uses FreeBSD doesn't have the games installed, it brought up the comment
> :from our lawyers "What else are they infringing on that we *are* using?"
> :
> :(see trek, mille, boggle, tetris, wargames)
> :
> :Kevin
>
> Tetris hasn't been in the distribution for a while.

Oops, I did an 'ls /usr/games' on a machine that's been around since 2.2.2. OK, forget tetris.

> Your lawyers need a dose of reality if they think the existence of
> the other games in a distribution could ever come back to haunt you
> or your company. Tell them to screw their heads on straight and try
> again. If you're really worried, just delete them.

That wasn't really the issue though. It's that FreeBSD is infringing on trademarks in the games distribution, so what's to make them think that the vm system isn't an infringement? (i.e. "They're the type of people who don't care about trademarks/copyrights. How can we trust this code?")

It also probably doesn't help things that I'm working for a video game company.

/usr/games is deleted here, anyway. :)

Kevin
Using 4.3-RELEASE's libc on 5.0 causes hard lockups
We had a system running 4.3-RELEASE that I used the sysinstall upgrade mechanism to upgrade to 5.0-RELEASE. I installed "compat4x" to use our existing 4.x binaries.

Immediately after rebooting, I noticed most old 4.x binaries were complaining about "_stdoutp" being an undefined symbol. However, the scary part was that when I started apache/mod_php4 the server crashed (hard lockup) within 10 seconds under load. This was easily reproducible: at least a dozen times while trying to debug this I started httpd, and the server locked up within 10 seconds.

I recompiled all of apache, mod_php4 and all of its libraries, started up httpd and had no problems with that. Things were fine that night until an "analog" cron job ran. Every time THAT ran, I also got a hard lockup of the server, OR between 100 and 500 of my httpd processes would suddenly SEGV.

After a little more poking around, I saw in /usr/lib:

lrwxr-xr-x  1 root  wheel       9 Feb  1 00:18 libc.so -> libc.so.5
lrwxr-xr-x  1 root  wheel      16 Jul  5  2002 libc.so.3 -> /usr/lib/libc.so
-r--r--r--  1 root  wheel  571480 Aug  5 13:45 libc.so.4
-r--r--r--  1 root  wheel  836892 Feb  1 00:18 libc.so.5

Shouldn't libc.so.4 have been a symlink to libc.so after a compat4x install? In any case, doing that myself seemed to fix everything.

My questions:

1) Shouldn't something along the way of doing a sysinstall upgrade or installing compat4x have fixed /usr/lib/libc.so.4 into a symlink? (That is the correct situation, right?)

2) Is it possible that some kernel interface has changed, and something isn't being validated on the kernel side? Non-root userland applications being able to lock up the server, and/or affect other processes, simply by using a different libc would seem to indicate this.

I know this is a pretty vague bug report, but this is a production server, so I wasn't able to play around too much with it. I do have a backup of the entire server before it was upgraded to 5.0 if you'd like me to check anything there.

I did compile with INVARIANTS and WITNESS and got no debugging output when things did lock up. The keyboard and serial console were totally dead when this happened, so DDB isn't an option either.

(I originally emailed security-officer about this because of the possibility of a security issue, and was told to forward it here.)
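The `ls -l` check above is easy to repeat in a script when auditing several machines. A minimal sketch: the function name `describe_libs` and its defaults are made up for this example, and it only reports what is on disk. It says nothing about which file the runtime linker will actually pick.

```python
import os
from pathlib import Path


def describe_libs(directory="/usr/lib", pattern="libc.so*"):
    """List matching libraries with their symlink target or file size."""
    rows = []
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_symlink():
            rows.append((path.name, "symlink -> " + os.readlink(path)))
        else:
            rows.append((path.name, "regular file, %d bytes" % path.stat().st_size))
    return rows


if __name__ == "__main__":
    for name, kind in describe_libs():
        print(f"{name:12} {kind}")
```

A versioned name such as libc.so.4 showing up as a symlink to a different major version is the situation the message describes creating by hand.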
Re: Using 4.3-RELEASE's libc on 5.0 causes hard lockups
At 11:42 AM 2/2/2003, Jacques A. Vidrine wrote:
> On Sun, Feb 02, 2003 at 11:41:32AM -0600, Kevin Day wrote:
> > lrwxr-xr-x 1 root wheel 9 Feb 1 00:18 libc.so -> libc.so.5
> > lrwxr-xr-x 1 root wheel 16 Jul 5 2002 libc.so.3 -> /usr/lib/libc.so
>   ^ This is seriously messed up. See below.
>
> > -r--r--r-- 1 root wheel 571480 Aug 5 13:45 libc.so.4
> > -r--r--r-- 1 root wheel 836892 Feb 1 00:18 libc.so.5
> >
> > Shouldn't libc.so.4 have been a symlink to libc.so after a compat4x
> > install? In any case, doing that myself seemed to fix everything.
>
> No, this would cause you major problems. Binaries that expected the
> libc.so.4 interface would be calling into libc.so.5, and probably
> causing very strange behaviour.

Ok, I admit, no matter how it happened, an application using the wrong libc is a bad thing.

But, how are things supposed to work? Apps that were using the old libc.so.4 complained about unresolved symbols (_stdoutp usually). If I removed /usr/lib/libc.so.4 they complained that they couldn't find libc. If I did link libc.so.4 to libc.so.5, everything appeared to work just fine, but I know that's probably a fluke.

In any case, a system lockup or being able to crash other users' processes just by having the wrong libc shouldn't be possible no matter what happens.
Re: Using 4.3-RELEASE's libc on 5.0 causes hard lockups
At 11:54 AM 2/2/2003, Jacques A. Vidrine wrote:
> > Ok, I admit, no matter how it happened, an application using the wrong libc
> > is a bad thing.
> >
> > But, how are things supposed to work?
>
> Apps that need the old libc.so.4 will find it in /usr/lib/compat/libc.so.4
> (or /usr/lib/libc.so.4 if you didn't remove it, for that matter).

Well, things were definitely picking /usr/lib/libc.so.4 over anything in compat. Should sysinstall have nuked my /usr/lib/libc if it was putting the correct one in compat?

> > In any case, a system lockup or being able to crash other user's processes
> > just by having the wrong libc shouldn't be possible no matter what happens.
>
> Probably not, although if you have processes running as root and using the
> `wrong' libc, all bets are off.

Well, after I recompiled httpd (which did have a single process owned by root) and rebooted, nothing at all owned by root touched anything that was compiled under 4.x. The analog process that caused the same behavior was owned by a non-privileged regular user. Me running analog under my normal account could kill processes owned by "nobody" with segfaults.
Re: DoS from local users (fwd)
> :
> :It should be possible to prevent a user from hogging a system if the system's
> :naive scheduler is improved.
> :
> : Amancio
>
> No, it isn't. For a very simple reason: The resources users need to do
> real work are very similar to the resources users need to hog the system.
>
> Saying that the system should somehow be able to magically make the
> distinction between the two is a pipedream. It takes a human to make
> the distinction.
>
> Short of restricting the resources you give to users to the point where
> they can't even start a mail or news client, there is just no way to
> prevent said users from loading down the machine if they choose to.
>
> -Matt

On the shell servers I run, we've got 200-300 users running tasks. Occasionally, through intent or misconfiguration, a user either forkbombs, or gets a large number of processes running sucking lots of cpu.

I'd like to see an option that makes all the processes run by one uid have the same weight as one process another uid is running.

I.e., uid 1001 starts 40 processes eating as much cpu as they can. Then uid 1002 starts up one process. Uid 1002's process gets 50% cpu, and uid 1001's 40 processes get 50% cpu shared between them.

This way, one errant user can't have as significant an impact. Is this plausible?

Kevin
Re: DoS from local users (fwd)
> In message <199904102057.paa27...@home.dragondata.com> Kevin Day writes:
> : i.e. uid 1001 starts 40 processes eating as much cpu as they can. Then uid
> : 1002 starts up one process. Uid 1002's process gets 50% cpu, and uid 1001's
> : 40 processes get 50% cpu shared between them.
>
> I've seen some experimental patches in the past that try to do just
> this. However, there are some problems. What if uid 1002's process
> does a sleep. Should the 40 processes that 1001 just get 50% of the
> cpu? Or should there be other limits. It turns into an interesting
> research problem in a hurry.
>
> Warner

I was thinking that essentially only processes in the RUN state would count toward this. If the cpu would otherwise be sitting idle, by all means give it to someone. But, if two users have processes running, just because one user has 50 processes doesn't mean it should get 50x the cpu of one user who has one process running.

If a process is asleep or blocked (select, IO, whatever), it's taken out of consideration for the cpu, and the full cpu is given to those processes that actually have work to do.

At least, that's my take on it. I run into this problem daily, and I get enough user complaints along the lines of "User x has 50 processes running, eating as much cpu as they can, my compile just took 15 minutes".

Kevin
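The per-uid weighting discussed in this exchange can be written down as a small calculation. This is only a sketch of the proposed policy, not an existing FreeBSD scheduler feature: the function name `cpu_shares` and the state label "RUN" are assumptions for the example, and it ignores priorities, nice values and multiple CPUs.

```python
from collections import defaultdict


def cpu_shares(procs):
    """Map pid -> fraction of the CPU under per-uid fair sharing.

    procs is a list of (pid, uid, state) tuples. Only processes in the
    RUN state compete. Each uid with runnable work gets an equal slice,
    split evenly among that uid's runnable processes.
    """
    by_uid = defaultdict(list)
    for pid, uid, state in procs:
        if state == "RUN":
            by_uid[uid].append(pid)
    shares = {}
    for pids in by_uid.values():
        for pid in pids:
            shares[pid] = 1.0 / len(by_uid) / len(pids)
    return shares


# The example from the thread: uid 1001 runs 40 CPU hogs, uid 1002 runs one.
procs = [(100 + i, 1001, "RUN") for i in range(40)] + [(200, 1002, "RUN")]
shares = cpu_shares(procs)
print("uid 1002's process:", shares[200])
print("each of uid 1001's processes:", shares[100])
```

If uid 1002's process sleeps, it drops out of the calculation and uid 1001's processes share the whole CPU, which is the behaviour described in the reply to Warner's question.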
Re: Alright, who's the smart alleck that fixed NFS this last week? :) , WAS: Re: solid NFS patch #6 avail for -current - need tester
> yeah the clocks are not setup properly :) but otherwise i'm just
> gonna say HOLY SH*T you fixed NFS! :)

We all owe Matt big for this. :)

> I'm using the default mount operations, as far as NFS server
> not responding messages, i have no clue, but the server is still
> up and i've seen that message happen when a lot of pressure is
> being put on an NFS server even though everything is fine.

Try mounting with -d...

Can I make a guess that the NFS mount is going over 100Mbit Ethernet? I have a strong theory that the dynamic retransmit timer needs rework for low-latency connections whose performance varies a lot under high traffic (lots of collisions).

Kevin
buildworld breaks in doscmd if no X installed?
cvsupped last night, and I don't have X installed. Is this a matter of "If you want to buildworld, install X"? I'm certain I've done this before though. :)

Kevin

cc -nostdinc -O -pipe -I. -I/usr/X11R6/include -DDISASSEMBLER -I/usr/obj/usr/src/tmp/usr/include -o doscmd AsyncIO.o ParseBuffer.o bios.o callback.o cpu.o dos.o cmos.o config.o cwd.o debug.o disktab.o doscmd.o ems.o emuint.o exe.o i386-pinsn.o int.o int10.o int13.o int14.o int16.o int17.o int1a.o int2f.o intff.o mem.o mouse.o net.o port.o setver.o signal.o timer.o trace.o trap.o tty.o xms.o -L/usr/X11R6/lib -lX11
tty.o: In function `video_setborder':
tty.o(.text+0x22b): undefined reference to `XSetWindowBackground'
tty.o: In function `setgc':
tty.o(.text+0x2c1): undefined reference to `XChangeGC'
tty.o: In function `video_update':
tty.o(.text+0x4ca): undefined reference to `XDrawImageString'
tty.o(.text+0x550): undefined reference to `XDrawImageString'
tty.o(.text+0x64a): undefined reference to `XChangeGC'
tty.o(.text+0x6d2): undefined reference to `XFillRectangle'
tty.o(.text+0x77e): undefined reference to `XChangeGC'
tty.o(.text+0x7bb): undefined reference to `XFillRectangle'
tty.o(.text+0x7c9): undefined reference to `XFlush'
tty.o: In function `debug_event':
tty.o(.text+0xbf8): undefined reference to `XBell'
tty.o(.text+0xc03): undefined reference to `XFlush'
tty.o: In function `video_async_event':
tty.o(.text+0x11c3): undefined reference to `XFlush'
tty.o(.text+0x11e3): undefined reference to `XNextEvent'
tty.o(.text+0x1257): undefined reference to `XFlush'
tty.o(.text+0x1298): undefined reference to `XNextEvent'
tty.o: In function `video_event':
tty.o(.text+0x1774): undefined reference to `XLookupString'
tty.o(.text+0x18f8): undefined reference to `XLookupString'
tty.o: In function `tty_write':
tty.o(.text+0x1f82): undefined reference to `XBell'
tty.o: In function `KbdWrite':
tty.o(.text+0x27da): undefined reference to `XBell'
tty.o: In function `video_init':
tty.o(.text+0x2b35): undefined reference to `XOpenDisplay'
tty.o(.text+0x2b5c): undefined reference to `XDisplayName'
tty.o(.text+0x2c39): undefined reference to `XAllocNamedColor'
tty.o(.text+0x2c92): undefined reference to `XLoadQueryFont'
tty.o(.text+0x2cae): undefined reference to `XLoadQueryFont'
tty.o(.text+0x2d81): undefined reference to `XCreateSimpleWindow'
tty.o(.text+0x2de4): undefined reference to `XCreateGC'
tty.o(.text+0x2e1b): undefined reference to `XCreateGC'
tty.o(.text+0x2e38): undefined reference to `XSetNormalHints'
tty.o(.text+0x2e62): undefined reference to `XSelectInput'
tty.o(.text+0x2e76): undefined reference to `XMapWindow'
tty.o(.text+0x2e81): undefined reference to `XFlush'
*** Error code 1

Stop.
*** Error code 1

Stop.
*** Error code 1

Stop.
*** Error code 1

Stop.
*** Error code 1

Stop.
*** Error code 1
Re: solid NFS patch #6 avail for -current - need testers files)
> > To sum it all up is there any difference between the branches?
>
> Yes. We hope that people like you will help us by participating in the
> testing of potential releases _before_ they go out as releases, not
> _afterwards_.
>
> Sitting around doing nothing and then complaining after the fact
> doesn't help anyone, least of all yourself.

This isn't meant in a bad way, but let me share with you my experiences.

Before 3.0 was released, I said several times "Hey, NFS got a lot worse on -CURRENT. Is anyone looking at this?" and got several replies of "Duh, this is -CURRENT. Don't whine about it. If you're trying to use this in a production environment, you're crazy."

After 3.0 was released, I said "Hey, 3.0 got released, and NFS was still broken", to which I got "Why didn't you bug us about this before the release?" and/or "Why didn't you test this before release?"

I understand NFS is a 'special' problem, but for those of us not in the trenches coding, I think the '3-level' system would be better. -CURRENT for those who are coding, -BETA for people like me to test things and bring up what broke, and -RELEASE for everyone else.

I honestly don't know when to bring up things like that, now. :)

Kevin
Re: solid NFS patch #6 avail for -current - need testers files)
> > I honestly don't know when to bring up things like that, now. :)
>
> For 3.2, _right_now_. What you're doing with Matt is the first stage;
> the next involves bringing it back to the 3.2-beta tree and testing it
> there.
>
> Please understand that if "you" (the community) aren't working on this,
> nobody else is. We don't have enough people manning the trenches
> because they're all sitting back in the chateau waiting for the
> afternoon dispatches. This doesn't work. 8)

Can I propose something? I realize gnats does most of this, but...

Suppose there's some central list where anyone who is having unresolved problems can post their e-mail address, the section of code, and a brief explanation of the symptoms. Other people can go to this list and tack on their e-mail address to other people's complaints, saying "I'm seeing this too."

Before each release, all of these people are e-mailed saying "Can you test to see if your problem still exists?"

This will also be a bonus for developers, who can find people experiencing specific problems to see if their fixes work.

I know this is a lot like gnats, however:

I don't think gnats wants a list of 'me too's in it.
It's not easy to mail groups of people from gnats.
There's no reason for anyone to add their e-mail address to a PR at the moment.

I'm not sure if this'll make things more confusing or not, but... It'll stop people with legitimate problems from getting lost in the shuffle, and keep PRs to more timely issues.

Anyone have any comments? Really, I'm just picturing a list of people with specific problems... maybe gnats can be tuned a bit for this...

Kevin
-current deadlocks within 5 mins, over NFS
Matt, I told you about this before, but completely forgot about it. After doing considerable testing on my test servers, I thought -current was safe enough to try on our production shell servers. I installed -current on one of my servers, and to my dismay, it hung. :)

Within 5 minutes of running, nearly every process is blocked on 'inode', with the exception of a single 'cp' stuck in vmopar.

I have a very silly, *very* poorly written script I run out of cron, every 10 mins or so, to update my passwd and group files.

#!/bin/sh
cp /home/private/passwd /etc
cp /home/private/master.passwd /etc
cp /home/private/group /etc
rm /etc/spwd.db.tmp >/dev/null 2>&1
pwd_mkdb /etc/master.passwd

This script is the only source of a 'cp' anywhere... With it turned off, I was able to run for at least 30 mins (more, if I hadn't rebooted).

/home is a UDP NFS2 mount.

Moving the source of those cp's to a local drive also fixes the problem, but defeats the purpose of my script. :)

What more info do you need for help in debugging this?

Kevin
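As an aside on the cron script itself, and not as a fix for the kernel hang: one hedged way to keep /etc from ever holding a half-copied file if a read from the NFS mount stalls or fails is to copy to a temporary name in the destination directory and rename it into place only once the copy has completed. This is a minimal sketch; `install_file` is a hypothetical helper name, not something from the original script.

```shell
#!/bin/sh
# install_file SRC DESTDIR  (hypothetical helper, for illustration only)
# Copy SRC into DESTDIR under a temporary name, then rename it into place.
# A rename within one filesystem is atomic, so readers see either the old
# file or the complete new one, never a partial copy.
install_file() {
    src=$1
    destdir=$2
    name=$(basename "$src")
    tmp="$destdir/.$name.tmp.$$"

    if cp "$src" "$tmp"; then
        mv -f "$tmp" "$destdir/$name"
    else
        # Copy failed: remove any partial temp file and report failure.
        rm -f "$tmp"
        return 1
    fi
}
```

Usage would look like `install_file /home/private/passwd /etc`, repeated for master.passwd and group, followed by the existing pwd_mkdb call. Note that a freshly created file takes its mode from the source file and the umask, unlike cp over an existing file, so the permissions on master.passwd would need checking.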
Re: -current deadlocks within 5 mins, over NFS
>
> Matt, I told you about this before, but completely forgot about it. After
> doing considerable testing on my test servers, i thought -current was safe
> enough to try on our production shell servers. I installed -current on one
> of my servers, and to my dismay, it hung. :)
>
> Within 5 minutes of running, nearly every process is blocked on 'inode',
> with the exception of a single 'cp' stuck in vmopar.
>

Just to add more to this, before someone replies.

It ran all night with the cron job off, and still hasn't crashed. On my test system, if I add that cron job back, it dies very quickly, so this shouldn't be hard to reproduce for anyone who needs it.

-current sources grabbed last night.

Kevin
-current NFS crash (out of mbuf clusters)
I'm sure by now Matt is gonna kill me. :) -current from 2 days ago. IdlePTD 3096576 initial pcb at 27ea40 panicstr: Out of mbuf clusters panic messages: --- panic: Out of mbuf clusters syncing disks... panic: Out of mbuf clusters dumping to dev 20001, offset 467137 dump 255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 --- #0 boot (howto=260) at ../../kern/kern_shutdown.c:288 288 dumppcb.pcb_cr3 = rcr3(); (kgdb) bt #0 boot (howto=260) at ../../kern/kern_shutdown.c:288 #1 0xc0145755 in panic () at ../../kern/kern_shutdown.c:450 #2 0xc015c382 in m_retryhdr (i=0, t=1) at ../../kern/uipc_mbuf.c:297 #3 0xc01b3013 in nfsm_rpchead (cr=0xc1096a00, nmflag=35328, procid=21, auth_type=1, auth_len=24, auth_str=0x0, verf_len=-1071591891, verf_str=0x0, mrest=0xc0acb900, mrest_len=44, mbp=0xcad93990, xidp=0xcad93994) at ../../nfs/nfs_subs.c:657 #4 0xc01b0317 in nfs_request (vp=0xca465400, mrest=0xc0acb900, procnum=21, procp=0xc02955a0, cred=0xc1096a00, mrp=0xcad939fc, mdp=0xcad93a00, dposp=0xcad93a04) at ../../nfs/nfs_socket.c:971 #5 0xc01c90d5 in nfs_commit (vp=0xca465400, offset=0, cnt=7463, cred=0xc1096a00, procp=0xc02955a0) at 
../../nfs/nfs_vnops.c:2586 #6 0xc01c9620 in nfs_flush (vp=0xca465400, cred=0xc0a5e900, waitfor=2, p=0xc02955a0, commit=1) at ../../nfs/nfs_vnops.c:2846 #7 0xc01c9389 in nfs_fsync (ap=0xcad93b2c) at ../../nfs/nfs_vnops.c:2710 #8 0xc01b9489 in nfs_sync (mp=0xc0ecee00, waitfor=2, cred=0xc0a5e900, p=0xc02955a0) at vnode_if.h:499 #9 0xc016ceaf in sync (p=0xc02955a0, uap=0x0) at ../../kern/vfs_syscalls.c:543 #10 0xc014535a in boot (howto=256) at ../../kern/kern_shutdown.c:205 #11 0xc0145755 in panic () at ../../kern/kern_shutdown.c:450 #12 0xc015c2ca in m_retry (i=0, t=1) at ../../kern/uipc_mbuf.c:269 #13 0xc015c8e7 in m_copym (m=0xc0adf400, off0=0, len=10, wait=0) at ../../kern/uipc_mbuf.c:450 #14 0xc01b047e in nfs_request (vp=0xca346440, mrest=0xc0ded380, procnum=4, procp=0xcac83940, cred=0xc0a5e900, mrp=0xcad93ccc, mdp=0xcad93cd0, dposp=0xcad93cd4) at ../../nfs/nfs_socket.c:1024 #15 0xc01b99a9 in nfs_access (ap=0xcad93d84) at ../../nfs/nfs_vnops.c:357 #16 0xc01bb9cf in nfs_lookup (ap=0xcad93e30) at vnode_if.h:219 #17 0xc016930b in lookup (ndp=0xcad93eb4) at vnode_if.h:31 #18 0xc0168d12 in namei (ndp=0xcad93eb4) at ../../kern/vfs_lookup.c:152 #19 0xc016e878 in lstat (p=0xcac83940, uap=0xcad93f90) at ../../kern/vfs_syscalls.c:1702 #20 0xc020ec26 in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = -1, tf_esi = 0, tf_ebp = -1077947556, tf_isp = -891731996, tf_ebx = 671539952, tf_edx = -1077947508, tf_ecx = 0, tf_eax = 190, tf_trapno = 12, tf_err = 2, tf_eip = 671750724, tf_cs = 31, tf_eflags = 582, tf_esp = -1077947676, tf_ss = 47}) at ../../i386/i386/trap.c:1066 #21 0xc0203f20 in Xint0x80_syscall () #22 0x2806afc7 in ?? () #23 0x2806b24d in ?? () #24 0x2806aa82 in ?? () #25 0x804a56e in ?? () #26 0x804a246 in ?? () #27 0x804acc7 in ?? () #28 0x8049bd1 in ?? () #29 0x80499c1 in ?? () #30 0x80497c9 in ?? () Copyright (c) 1992-1999 The FreeBSD Project. Copyright (c) 1982, 1986, 1989, 1991, 1993 The Regents of the University of California. All rights reserved. 
FreeBSD 4.0-CURRENT #0: Fri May 7 02:53:27 CDT 1999 toa...@shell3.dragondata.com:/usr/src/sys/compile/SHELL3 Timecounter "i8254" frequency 1193182 Hz CPU: AMD-K6(tm) 3D processor (300.68-MHz 586-class CPU) Origin = "AuthenticAMD" Id = 0x580 Stepping=0 Features=0x8001bf real memory = 268435456 (262144K bytes) sio0: system console avail memory = 258457600 (252400K bytes) Probing for PnP devices: npx0: on motherboard npx0: INT 16 interface pcib0: on motherboard pci0: on pcib0 chip0: at device 0.0 on pci0 pcib1: at device 1.0 on pci0 pci1: on pcib1 isab0: at device 7.0 on pci0 ide_pci0: at device 15.0 on pci0 fxp0: at device 16.0 on pci0 fxp0: interrupting at irq 12 fxp0: Ethernet address 00:90:27:34:b9:a7 fxp1: at device 18.0 on pci0 fxp1: interrupting at irq 10 fxp1: Ethernet address 00:90:27:34:c0:12 eisa0: on motherboard
Re: -current NFS crash (out of mbuf clusters)
> :I'm sure by now Matt is gonna kill me. :) > : > :-current from 2 days ago. > : > :IdlePTD 3096576 > :initial pcb at 27ea40 > :panicstr: Out of mbuf clusters > :panic messages: > :--- > :panic: Out of mbuf clusters > > This is probably not NFS related unless there is a leak somewhere. > > You may have to mess with the NMBCLUSTERS kernel config to increase > the number of mbuf clusters. FreeBSD tends to not allocate enough > by default in more heavily loaded larger-memory configurations. > > It should be possible to confirm that the problem is not NFS by taking > a general look at the state of the system at the time of the crash. You > can run 'ps' and 'netstat' on the core dump: > > cd /var/crash > ps -axl -M vmcore.XX -N kernel.XX UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT TIME COMMAND 0 0 0 0 -18 0 00 sched DLs ??0:00.00 (swapper) 0 1 0 0 10 0 5000 wait Is??0:00.00 (init) 0 2 0 0 -18 0 00 psleep DL??0:00.00 (pagedaemon) 0 3 0 0 18 0 00 psleep DL??0:00.00 (vmdaemon) 0 4 0 0 -1 0 00 nfsrcv DL??0:00.00 (syncer) 039 1 30 18 0 2040 pause Is??0:00.00 (adjkerntz) 1 233 1 30 2 0 8320 select Is??0:00.00 (portmap) 0 268 1 7 29 0 10240 - Rs??0:00.00 (cron) 10079 384 1 0 -1 0 20920 nfsrcv D ??0:00.00 (eggdrop) 1200 703 1 0 -1 0 17400 nfsrcv D ??0:00.00 (eggdrop) 10039 706 1 0 -1 0 16560 nfsrcv D ??0:00.00 (eggdrop) 10173 711 1 0 2 0 18040 select S ??0:00.00 (eggdrop) 10336 1075 1 0 -1 0 19160 nfsrcv D ??0:00.00 (eggdrop) 10051 1245 1 0 -1 0 23560 nfsrcv D ??0:00.00 (eggdrop-1.3.23) 10467 1686 1 0 -1 0 18200 nfsrcv D ??0:00.00 (eggdrop) 10173 1697 1 0 2 0 17920 select S ??0:00.00 (eggdrop) 10387 1726 1 0 2 0 18000 select S ??0:00.00 (eggdrop) 10387 1727 1 0 2 0 17920 select S ??0:00.00 (eggdrop) 1279 1743 1 0 -1 0 22280 nfsrcv D ??0:00.00 (eggdrop) 10176 1745 1 6 2 0 24600 select S ??0:00.00 (eggdrop-1.3.26) 10051 2128 1 0 2 0 11600 select Ss??0:00.00 (ezbounce) 0 2200 268 0 -6 0 10560 piperd I ??0:00.00 (cron) 10002 2206 2200 1 28 0 00 - Z ??0:00.00 (sh) 0 2548 2200 5 
-6 0 13280 piperd I ??0:00.00 (sendmail) 1292 2602 1 11 10 0 5000 wait Is??0:00.00 (sh) 10002 2655 1 0 2 0 8280 select Is??0:00.00 (bnc) 1392 2657 1 0 2 0 8600 select Is??0:00.00 (bnc) 10218 2658 1 0 2 0 8600 select Is??0:00.00 (bnc) 10177 2664 1 0 2 0 8600 select Is??0:00.00 (bnc) 10033 2666 1 0 2 0 8600 select Is??0:00.00 (bnc) 1294 2667 1 0 2 0 8760 select Is??0:00.00 (bnc) 10452 2673 1 0 -1 0 9560 nfsrcv D ??0:00.00 (mech) 1292 2688 2602 0 10 0 5040 wait I ??0:00.00 (sh) 10427 2726 1 0 -1 0 19880 nfsrcv D ??0:00.00 (eggdrop) 1292 2755 2688 0 -6 0 17400 piperd I ??0:00.00 (eggdrop) 1339 2762 1 0 -1 0 18520 nfsrcv D ??0:00.00 (BitchX) 1339 2772 1 0 -1 0 18400 nfsrcv D ??0:00.00 (bnc) 10391 2854 1 0 -1 0 17440 nfsrcv D ??0:00.00 (eggdrop) 10027 2858 1 0 -1 0 16960 nfsrcv D ??0:00.00 (eggdrop) 10027 2859 1 0 -1 0 16960 nfsrcv D ??0:00.00 (eggdrop) 10027 2860 1 0 -1 0 16960 nfsrcv D ??0:00.00 (eggdrop) 1272 2870 1 0 -1 0 20560 nfsrcv D ??0:00.00 (eggdrop) 10237 2871 1 0 2 0 16880 select S ??0:00.00 (eggdrop) 10169 2872 1 0 -1 0 18360 nfsrcv D ??0:00.00 (eggdrop) 1405 2874 1 0 -1 0 17000 nfsrcv D ??0:00.00 (eggdrop) 1285 2875 1 0 -1 0 22920 nfsrcv D ??0:00.00 (eggdrop) 10099 2877 1 0 -1 0 18320 nfsrcv D ??0:00.00 (eggdrop) 10112 2878 1 0 -1 0 19920 nfsrcv D ??0:00.00 (eggdrop) 10239 2879 1 0 -1 0 19080 nfsrcv D ??0:00.00 (eggdrop) 10385 2880 1 0 -1 0 15400 nfsrcv D ??0:00.00 (eggdrop) 1079 2891 1 0 -1 0 19000 nfsrcv D ??0:00.00 (eggdrop) 10002 2892 1 0 -1 0 21280 nfsrcv D ??0:00.00 (eggdrop) 10428 2900 1 0 -1 0 17800 nfsrcv D ??0:00.00 (eggdrop) 10428 2901 1 0 -1 0 1764
Re: -current NFS crash (out of mbuf clusters)
Erm, sorry guys, that huge message wasn't intended to go back to -current, just Matt. My apologies. :) Kevin > > :I'm sure by now Matt is gonna kill me. :) > > : > > :-current from 2 days ago. > > : > > :IdlePTD 3096576 > > :initial pcb at 27ea40 > > :panicstr: Out of mbuf clusters > > :panic messages: > > :--- > > :panic: Out of mbuf clusters > > > > This is probably not NFS related unless there is a leak somewhere. > > > > You may have to mess with the NMBCLUSTERS kernel config to increase > > the number of mbuf clusters. FreeBSD tends to not allocate enough > > by default in more heavily loaded larger-memory configurations. > > > > It should be possible to confirm that the problem is not NFS by taking > > a general look at the state of the system at the time of the crash. You > > can run 'ps' and 'netstat' on the core dump: > > > > cd /var/crash > > ps -axl -M vmcore.XX -N kernel.XX > > UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT TIME COMMAND > 0 0 0 0 -18 0 00 sched DLs ??0:00.00 (swapper) > 0 1 0 0 10 0 5000 wait Is??0:00.00 (init) To Unsubscribe: send mail to majord...@freebsd.org with "unsubscribe freebsd-current" in the body of the message
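For a listing as long as the one quoted above, a quick tally of the WCHAN column shows at a glance how many processes are parked in each wait channel (nfsrcv, select, and so on). The following is a minimal sketch, assuming well-formed `ps -axl` output with the column order shown in the header, so that WCHAN is the ninth field; `wchan_summary` is a hypothetical name. The archived copy of the listing has lost some column spacing, so it could not be fed in as-is.

```shell
#!/bin/sh
# wchan_summary: read `ps -axl` style output on stdin and print how many
# processes are sleeping on each wait channel, most common first.
# Assumes the column order UID PID PPID CPU PRI NI VSZ RSS WCHAN ...,
# so WCHAN is field 9; the first (header) line is skipped.
wchan_summary() {
    awk 'NR > 1 { count[$9]++ }
         END { for (w in count) print count[w], w }' |
        sort -k1,1nr -k2,2
}
```

Against a crash dump this would be run as `ps -axl -M vmcore.XX -N kernel.XX | wchan_summary`, using the same options as the command quoted above.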
Re: -current NFS crash (out of mbuf clusters)
> :> netstat -m -M vmcore.XX -N kernel.XX > :> > : > :1014/2144 mbufs in use: > : 714 mbufs allocated to data > : 300 mbufs allocated to packet headers > :638/1324/1536 mbuf clusters in use (current/peak/max) > :2916 Kbytes allocated to network (48% in use) > :0 requests for memory denied > :0 requests for memory delayed > :0 calls to protocol drain routines > : > :What does this tell you? > : > :Kevin > > It tells me your userbase is out of control :-) From the looks > of it, hundreds of cron jobs are starting up simultaniously > and overloading some system resource. > Yeah, I wrote a patch to cron for a while that wouldn't allow that to happen, but it didn't apply cleanly to 4.0's cron, so I'm going to go check why. :) (It staggered the requests, only allow x to run per quantum) > I would also recommend: > > vmstat -m -M vmcore.XX -N kernel.XX > Memory statistics by bucket size Size In Use Free Requests HighWater Couldfree 16 883141 820528161280 0 32 7569 9711 561955 640253 6427568 1744 87910913 320253 128 1526202 80132397 1601021415 25620096 3616 244895 80151 512 78 2 1911 40 0 1K 277135 13673 20314 2K 32 12173 10 67 4K6 1 4585 5 0 8K0 1 2 5 0 16K3 0 3 5 0 32K4 0 4 5 0 64K5 0 5 5 0 128K1 0 1 5 0 256K1 0 1 5 0 512K0 0 2 5 0 Memory usage type by bucket size Size Type(s) 16 devbuf, temp, proc, sysctl, rman, soname, pcb, vnodes, ether_multi, routetbl, isa_devlist, atkbddev, devbuf, temp, proc, sysctl, rman, soname, pcb, vnodes, ether_multi, routetbl, isa_devlist, atkbddev, devbuf, temp, proc, sysctl, rman, soname, pcb, vnodes, ether_multi, routetbl 32 kld, sigio, devbuf, temp, pgrp, subproc, sysctl, SWAP, soname, pcb, cluster_save buffer, vnodes, ifaddr, ether_multi, routetbl, in_multi, NFS req, kld, sigio, devbuf, temp, pgrp, subproc, sysctl, SWAP, soname, pcb, cluster_save buffer, vnodes, ifaddr, ether_multi, routetbl, in_multi, NFS req, kld, sigio, devbuf, temp, pgrp, subproc, sysctl, SWAP, soname, pcb, cluster_save buffer, vnodes, ifaddr, ether_multi, routetbl, 
in_multi, NFS req 64 file, lockf, namecache, devbuf, temp, session, rman, soname, pcb, cluster_save buffer, vnodes, ifaddr, ether_multi, routetbl, NFS req, file, lockf, namecache, devbuf, temp, session, rman, soname, pcb, cluster_save buffer, vnodes, ifaddr, ether_multi, routetbl, NFS req, file, lockf, namecache, devbuf, temp, session, rman, soname, pcb, cluster_save buffer, vnodes, ifaddr, ether_multi, routetbl, NFS req 128 isadev, kld, timecounter, file desc, zombie, namecache, devbuf, temp, cred, ttys, soname, vnodes, ifaddr, routetbl, ZONE, isadev, kld, timecounter, file desc, zombie, namecache, devbuf, temp, cred, ttys, soname, vnodes, ifaddr, routetbl, ZONE, isadev, kld, timecounter, file desc, zombie, namecache, devbuf, temp, cred, ttys, soname, vnodes, ifaddr, routetbl, ZONE 256 file desc, devbuf, temp, proc, subproc, vnodes, ifaddr, routetbl, NFS srvsock, NFS daemon, FFS node, file desc, devbuf, temp, proc, subproc, vnodes, ifaddr, routetbl, NFS srvsock, NFS daemon, FFS node, file desc, devbuf, temp, proc, subproc, vnodes, ifaddr, routetbl, NFS srvsock, NFS daemon, FFS node 512 file desc, devbuf, temp, ioctlops, BIO buffer, mount, NFSV3 diroff, UFS mount, isa_devlist, file desc, devbuf, temp, ioctlops, BIO buffer, mount, NFSV3 diroff, UFS mount, isa_devlist, file desc, devbuf, temp, ioctlops, BIO buffer, mount, NFSV3 diroff, UFS mount 1K devbuf, temp, proc, BIO buffer, NQNFS Lease, devbuf, temp, proc, BIO buffer, NQNFS Lease, devbuf, temp, proc, BIO buffer, NQNFS Lease 2K devbuf, temp, pcb, BIO buffer, UFS mount, mbuf, isa_devlist, devbuf, temp, pcb, BIO buffer, UFS mount, mbuf, isa_devlist, devbuf, temp, pcb, BIO buffer, UFS mount 4K devbuf, temp, UFS mount, devbuf, temp, UFS mount, devbuf, temp, UFS mount 8K temp, temp, temp 16K devbuf, devbuf, devbuf 32K devbuf, temp, MSDOSFS mount, devbuf, temp, MSDOSFS mount, devbuf, temp, MSDOSFS mount 64K ISOFS mount, NFS hash, UFS ihash, UFS quota, VM pgdata, ISOFS mount, NFS hash, UFS ihash, UFS quota, VM pgdata, 
ISOFS mount, NFS hash, UFS ihash, UFS quota, VM pgdata 128K name
Incorrect memory sizes reported
I'm not sure if this is related to the bug I found in 3.1, regarding mmap'ing devices and then forking, but on my -current NFS server:

  PID USERNAME PRI NICE  SIZE   RES STATE   TIME   WCPU    CPU COMMAND
  139 root       2    0  257M  452K select  0:00  0.00%  0.00% rpc.statd

257M? :) ps shows similar info...

Kevin
Re: Incorrect memory sizes reported
>
> This is normal. It's using a lot of virtual memory. Fortunately, virtual
> memory is cheap.
>
> DS
>
> > I'm not sure if this is related to the bug I found in 3.1,
> > regarding mmaping
> > devices, then forking, but with my -current NFS server:
> >
> > PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
> > 139 root 2 0 257M 452K select 0:00 0.00% 0.00% rpc.statd
> >
> > 257M? :) ps shows similar info...
> >
> > Kevin
>

OK, I stand corrected then. I hadn't seen this before...

2.2.8:
root 14127 0.0 0.1 176 492 ?? Ss 5:14PM 0:00.00 rpc.statd

3.1:
root 853 0.0 0.7 172 416 ?? Ss 7:18AM 0:00.00 rpc.statd

There is still the issue I described a while back that makes children show negative numbers in 'size', though, and in that case I can confirm the process isn't really sucking up that much VM.

Kevin
-current page fault at 0xdeadc0de
I had two systems reboot at nearly the same time. (30 seconds apart), and are completely unrelated. One system was running 2.2.8, and my core file presents me with this: su-2.02# gdb -k GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.16 (i386-unknown-freebsd), Copyright 1996 Free Software Foundation, Inc. (kgdb) exec-file kernel.0 (kgdb) symbol-file kernel.0.debug Reading symbols from kernel.0.debug...done. (kgdb) core-file vmcore.0 IdlePTD 24a000 current pcb at 202bfc #0 0x14 in ?? () (kgdb) bt #0 0x14 in ?? () #1 0x3404 in ?? () Cannot access memory at address 0x7205c76a. Were things just trashed, or am I doing something wrong? The other system was running -current, and gives me: su-2.02# gdb -k GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.16 (i386-unknown-freebsd), Copyright 1996 Free Software Foundation, Inc. (kgdb) exec-file kernel.2 (kgdb) symbol-file kernel.2.debug Reading symbols from kernel.2.debug...done. (kgdb) core-file vmcore.2 IdlePTD 3096576 initial pcb at 27ea40 panicstr: page fault panic messages: --- Fatal trap 12: page fault while in kernel mode fault virtual address = 0xdeadc0de fault code = supervisor read, page not present instruction pointer = 0x8:0xdeadc0de stack pointer = 0x10:0xcb4adec0 frame pointer = 0x10:0xcb4adefc code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 40969 (eggdrop) interrupt mask = trap number = 12 panic: page fault syncing disks... 
Fatal trap 12: page fault while in kernel mode fault virtual address = 0xdeadc126 fault code = supervisor read, page not present instruction pointer = 0x8:0xc018e3d8 stack pointer = 0x10:0xcb4ad91c frame pointer = 0x10:0xcb4ad93c code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 40969 (eggdrop) interrupt mask = trap number = 12 panic: page fault dumping to dev 20001, offset 467137 dump 255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 --- #0 boot (howto=260) at ../../kern/kern_shutdown.c:288 288 dumppcb.pcb_cr3 = rcr3(); (kgdb) bt #0 boot (howto=260) at ../../kern/kern_shutdown.c:288 #1 0xc0145755 in panic () at ../../kern/kern_shutdown.c:450 #2 0xc020e9e2 in trap_fatal (frame=0xcb4ad8dc, eva=3735929126) at ../../i386/i386/trap.c:917 #3 0xc020e695 in trap_pfault (frame=0xcb4ad8dc, usermode=0, eva=3735929126) at ../../i386/i386/trap.c:810 #4 0xc020e2d7 in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16, tf_edi = 0, tf_esi = 0, tf_ebp = -884287172, tf_isp = -884287224, tf_ebx = 16384, tf_edx = -559038242, tf_ecx = 
-1059309536, tf_eax = -1053816960, tf_trapno = 12, tf_err = 0, tf_eip = -1072110632, tf_cs = 8, tf_eflags = 66182, tf_esp = -1062703744, tf_ss = -911937724}) at ../../i386/i386/trap.c:436 (kgdb) Not exactly a lot to go on... Mean anything to anyone? Any more info I can provide? Kevin To Unsubscribe: send mail to majord...@freebsd.org with "unsubscribe freebsd-current" in the body of the message
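kgdb prints the trap frame registers and `eva` as decimal values (signed for the registers), which hides the pattern from the panic message. Converting them to unsigned 32-bit hex makes the connection visible. This is a small sketch, assuming a shell whose arithmetic is wider than 32 bits; `to_hex32` is a hypothetical helper name.

```shell
#!/bin/sh
# to_hex32 N: print a (possibly negative) decimal value as unsigned
# 32-bit hex. Masking with 0xffffffff turns a negative value into its
# two's-complement 32-bit form. Assumes shell arithmetic wider than 32 bits.
to_hex32() {
    printf '0x%08x\n' "$(( $1 & 0xffffffff ))"
}
```

With the values from the backtrace above, `to_hex32 -559038242` (tf_edx) gives 0xdeadc0de, `to_hex32 3735929126` (eva) gives 0xdeadc126, and `to_hex32 -1072110632` (tf_eip) gives 0xc018e3d8, matching the fault address and instruction pointer reported in the second trap.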
Aladdin IDE slow?
I'm using an Aladdin chipset in a -current machine...

CPU: AMD-K6(tm) 3D processor (337.19-MHz 586-class CPU)
Origin = "AuthenticAMD" Id = 0x580 Stepping=0
Features=0x8001bf
real memory = 134217728 (131072K bytes)
avail memory = 126808064 (123836K bytes)
chip0: at device 0.0 on pci0
pcib1: at device 1.0 on pci0
pci1: on pcib1
isab0: at device 7.0 on pci0
ata-pci0: at device 15.0 on pci0
ata-pci0: Busmastering DMA supported
ata0 at 0x01f0 irq 14 on ata-pci0
ata1 at 0x0170 irq 15 on ata-pci0
ata0: master: settting up UDMA2 mode on Aladdin chip OK
ad0: ATA-3 disk at ata0 as master
ad0: 3079MB (6306048 sectors), 6256 cyls, 16 heads, 63 S/T, 512 B/S
ad0: piomode=4, dmamode=2, udmamode=2
ad0: 16 secs/int, 0 depth queue, DMA mode
ata0: slave: settting up UDMA2 mode on Aladdin chip OK
ad1: ATA-3 disk at ata0 as slave
ad1: 3079MB (6306048 sectors), 6256 cyls, 16 heads, 63 S/T, 512 B/S
ad1: piomode=4, dmamode=2, udmamode=2
ad1: 16 secs/int, 0 depth queue, DMA mode
ata1: master: settting up UDMA2 mode on Aladdin chip OK
ad2: ATA-3 disk at ata1 as master
ad2: 3079MB (6306048 sectors), 6256 cyls, 16 heads, 63 S/T, 512 B/S
ad2: piomode=4, dmamode=2, udmamode=2
ata1: slave: settting up UDMA2 mode on Aladdin chip OK
ad3: ATA-4 disk at ata1 as slave
ad3: 16479MB (33750864 sectors), 33483 cyls, 16 heads, 63 S/T, 512 B/S
ad3: piomode=4, dmamode=2, udmamode=2
ad3: 16 secs/int, 0 depth queue, DMA mode

ad3 is the one getting the heaviest use from me... However, I've noticed a few things since I switched to the ata driver from a 3.1 kernel using the wd0 driver.

The drive is now much slower... While I don't have numbers either way, this system acts as an NFS server. Not only are the NFS clients acting slower after my switch, but nearly all my nfsd's are sitting in biord or biowr now, where before they were usually idle.

Also, the IDE LED on the case/motherboard is now acting kind of erratic. I can hear the HD doing accesses when the light is off, and at times the light seems to stay on for 2-3 seconds when there's no activity. (This didn't happen under wd0)...

Is this a case of DMA just not working well for me, or is there a magic flag I'm missing? This is -current from about a week ago.

Kevin
-Current still leaking mbuf's
I've got two systems that panic about every 48 hours, saying they're out of mbufs. I've tried raising maxusers. (It's at 128 now, but I've gone up to 256 and still seen the same thing.)

I believe it's a leak, since it's pretty consistent how long a system will stay up before it runs out.

I've tried raising NMBCLUSTERS, but this just seems to prolong things before it finally panics.

The only unusual thing about these two machines is that they're very heavy NFS client users.

Is there anything any of you would like to see, if someone's willing to try to debug this? vmstat -m doesn't show anything too out of the ordinary, but I've got several coredumps waiting. :)

Kevin
Re: -Current still leaking mbuf's
> On Thu, 27 May 1999, Kevin Day wrote:
>
> > I've got two systems that panic about every 48 hours, saying they're out of
> > mbuf's. I've tried raising maxusers. (It's at 128 now, but i've gone up to
> > 256 and still seen the same thing).
> >
> > I believe it's a leak, since it's pretty consistent how long it will stay up
> > before it runs out.
> >
> > I've tried raising NMBCLUSTERS, but this just seems to prolong it before it
> > finally panics.
>
> How high do you have it set?
>

I tried doubling whatever it was that putting maxusers at 256 set it at. (I can get the exact number later.) I'm running with no NMBCLUSTERS setting, just with maxusers at 128, at the moment.

> You might want to collect some netstat -m stats as time goes on. In
> addition to being easier to read, it may give you some hints as to how
> high you want to go with mbuf clusters.

I added a cron job to run netstat -m every half hour... Right now, after 10 hours of being up:

494/2624 mbufs in use:
        160 mbufs allocated to data
        334 mbufs allocated to packet headers
130/1686/2560 mbuf clusters in use (current/peak/max)
3700 Kbytes allocated to network (8% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

> > The only unusual thing about these two machines are that they're very heavy
> > NFS client users.
>
> That might do it by itself irrespective of any bugs.
>

This didn't happen in 2.2.8 or 3.1, so I'm trying to figure out what's causing it. :)

Here's a typical panic:

IdlePTD 3096576
initial pcb at 27ea40
panicstr: Out of mbuf clusters
panic messages:
---
panic: Out of mbuf clusters
syncing disks...
panic: Out of mbuf clusters dumping to dev 20001, offset 467137 dump 255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 --- #0 boot (howto=260) at ../../kern/kern_shutdown.c:288 288 dumppcb.pcb_cr3 = rcr3(); (kgdb) bt #0 boot (howto=260) at ../../kern/kern_shutdown.c:288 #1 0xc0145755 in panic () at ../../kern/kern_shutdown.c:450 #2 0xc015c2ca in m_retry (i=0, t=1) at ../../kern/uipc_mbuf.c:269 #3 0xc01b2c77 in nfsm_reqh (vp=0xcb6bddc0, procid=21, hsiz=68, bposp=0xcb2cecdc) at ../../nfs/nfs_subs.c:599 #4 0xc01c8e13 in nfs_commit (vp=0xcb6bddc0, offset=0, cnt=8192, cred=0xc13b0200, procp=0xc02955a0) at ../../nfs/nfs_vnops.c:2580 #5 0xc01c9620 in nfs_flush (vp=0xcb6bddc0, cred=0xc0a5f900, waitfor=2, p=0xc02955a0, commit=1) at ../../nfs/nfs_vnops.c:2846 #6 0xc01c9389 in nfs_fsync (ap=0xcb2cedfc) at ../../nfs/nfs_vnops.c:2710 #7 0xc01b9489 in nfs_sync (mp=0xc113bc00, waitfor=2, cred=0xc0a5f900, p=0xc02955a0) at vnode_if.h:499 #8 0xc016ceaf in sync (p=0xc02955a0, uap=0x0) at ../../kern/vfs_syscalls.c:543 #9 0xc014535a in boot (howto=256) at ../../kern/kern_shutdown.c:205 #10 0xc0145755 in panic () at 
../../kern/kern_shutdown.c:450 #11 0xc015c382 in m_retryhdr (i=0, t=1) at ../../kern/uipc_mbuf.c:297 #12 0xc015de2b in sosend (so=0xc9a55000, addr=0x0, uio=0xcb2cef00, top=0x0, control=0x0, flags=0, p=0xcb2692e0) at ../../kern/uipc_socket.c:499 #13 0xc016093f in sendit (p=0xcb2692e0, s=5, mp=0xcb2cef40, flags=0) at ../../kern/uipc_syscalls.c:514 #14 0xc0160a2d in sendto (p=0xcb2692e0, uap=0xcb2cef90) at ../../kern/uipc_syscalls.c:564 #15 0xc020ec26 in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = -1077951456, tf_esi = 0, tf_ebp = -1077951564, tf_isp = -886247452, tf_ebx = 538075232, tf_edx = 682064, tf_ecx = 0, tf_eax = 133, tf_trapno = 7, tf_err = 7, tf_eip = 537941473, tf_cs = 31, tf_eflags = 534, tf_esp = -1077951596, tf_ss = 47}) at ../../i386/i386/trap.c:1066 #16 0x7 in ?? () (kgdb) #10 0xc0145755 in panic () at ../../kern/kern_shutdown.c:450 450 boot(bootopt); (kgdb) #11 0xc015c382 in m_retryhdr (i=0, t=1) at ../../kern/uipc_mbuf.c:297 297 panic("Out of mbuf clusters"); Look
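The half-hourly `netstat -m` cron job mentioned in this message is easier to read over days of uptime if each sample is reduced to one timestamped line. Below is a minimal sketch, assuming `netstat -m` prints the cluster line in the "current/peak/max" form shown in this message (other releases may format it differently); `mbuf_clusters` and `log_mbuf_sample` are hypothetical names, and the log path is up to the caller.

```shell
#!/bin/sh
# mbuf_clusters: read `netstat -m` output on stdin and print the three
# numbers from the "N/N/N mbuf clusters in use (current/peak/max)" line
# as "current peak max". Prints nothing if no such line is present.
mbuf_clusters() {
    sed -n 's|^ *\([0-9][0-9]*\)/\([0-9][0-9]*\)/\([0-9][0-9]*\) mbuf clusters in use.*|\1 \2 \3|p'
}

# log_mbuf_sample LOGFILE: append one timestamped sample to LOGFILE.
log_mbuf_sample() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" \
        "$(netstat -m | mbuf_clusters)" >> "$1"
}
```

Calling `log_mbuf_sample /var/log/mbuf.log` from cron (the path here is only an example) would leave a file in which a steadily climbing first number, rather than a high peak alone, is what a leak should look like.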
More NFS woes
Grabbed another -current, and I'm still seeing a few problems that Matt and others haven't solved yet. I'm not pushing anyone, just reminding everyone that these are still here, and still problems.

1) The 'inode/vmopar' lockup that Matt is aware of, and has apparently tracked down.

2) Processes starting to run away, doing this:

nfs_getpages: error 70
vm_fault: pager read error, pid 1251 (eggdrop)
nfs_getpages: error 70
vm_fault: pager read error, pid 1251 (eggdrop)
nfs_getpages: error 70
vm_fault: pager read error, pid 1251 (eggdrop)
nfs_getpages: error 70
vm_fault: pager read error, pid 1251 (eggdrop)
nfs_getpages: error 70

No, I don't know what the user in question did to make this happen, if anything. The process was eating about 70% CPU when I killed it, and syslogd was eating the other 30% logging all this. :)

3) Weirdly high load averages.

I have two systems with similar hardware, running similar jobs. System A runs 2.2.8, and has about 300 processes running. System B runs -current, and has about 250 processes running. The processes are doing virtually the same things, and both systems are heavy NFS clients. System A's load average is about .10; System B's load average hovers around 3.0-5.0.

System A:
last pid: 1933; load averages: 0.31, 0.10, 0.11
CPU states: 6.2% user, 0.0% nice, 1.9% system, 1.6% interrupt, 90.3% idle
302 processes: 1 running, 294 sleeping, 7 zombie

System B:
last pid: 77084; load averages: 3.64, 3.70, 4.00
CPU states: 7.0% user, 0.0% nice, 2.7% system, 0.4% interrupt, 89.9% idle
256 processes: 1 running, 254 sleeping, 1 zombie

Has something changed in the load average calculation between 2.x and -current, or is there something actually different going on here?

The only hardware differences are that system A uses two de0 cards, and system B uses two fxp0 cards. System A is a PII, and system B is a K6-2 (similar speeds).

Kevin
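One hedged way to compare the two systems is to tally process states on each. The assumption here, which should be checked against the kernel source for each release, is that the BSD load average counts not only runnable processes but also those in short uninterruptible waits (state D), so a crowd of processes blocked on NFS could raise the figure while the CPU stays idle. A minimal sketch, with `state_summary` as a hypothetical name:

```shell
#!/bin/sh
# state_summary: read one process state per line, as printed by
# `ps -axo state` (header line first), and count processes by the first
# state letter (R = runnable, D = uninterruptible wait, S/I = sleeping,
# Z = zombie). Trailing flag characters such as "s" or "+" are ignored.
state_summary() {
    awk 'NR > 1 { count[substr($1, 1, 1)]++ }
         END { for (s in count) print count[s], s }' |
        sort -k1,1nr -k2,2
}
```

Running `ps -axo state | state_summary` on both machines a few times would show whether system B really has several processes in R or D at any given moment, which would point at a real difference in behavior and not a change in the calculation.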
qt30 build under -CURRENT fails in rtld
I'm not sure if this is a known problem, but I sent this to the maintainer of the qt30 port, who suggested I post it here. I couldn't find anything related to this problem in the archives.

I'm attempting to build qt30 (for kde3) under -CURRENT (ports and kernel/userland from yesterday). It's dying in:

gmake[3]: Entering directory `/usr/ports/x11-toolkits/qt30/work/qt-x11-free-3.0.3/tools/designer/designer'
/usr/ports/x11-toolkits/qt30/work/qt-x11-free-3.0.3/bin/uic dbconnections.ui -o dbconnections.h
gmake[3]: *** [dbconnections.h] Bus error (core dumped)
gmake[3]: Leaving directory `/usr/ports/x11-toolkits/qt30/work/qt-x11-free-3.0.3/tools/designer/designer'
gmake[2]: *** [sub-designer] Error 2
gmake[2]: Leaving directory `/usr/ports/x11-toolkits/qt30/work/qt-x11-free-3.0.3/tools/designer'
gmake[1]: *** [sub-designer] Error 2
gmake[1]: Leaving directory `/usr/ports/x11-toolkits/qt30/work/qt-x11-free-3.0.3/tools'
gmake: *** [sub-tools] Error 2
*** Error code 2

Stop in /usr/ports/x11-toolkits/qt30.

# gdb
GNU gdb 5.2.0 (FreeBSD) 20020627
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-undermydesk-freebsd".
(gdb) exec-file ../../../bin/uic
(gdb) core-file uic.core
Core was generated by `uic'.
Program terminated with signal 10, Bus error.
Reading symbols from /usr/lib/libz.so.2...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libz.so.2
Reading symbols from /usr/ports/x11-toolkits/qt30/work/qt-x11-free-3.0.3/lib/libqt-mt.so.3...(no debugging symbols found)...done.
Loaded symbols for /usr/ports/x11-toolkits/qt30/work/qt-x11-free-3.0.3/lib/libqt-mt.so.3
Reading symbols from /usr/X11R6/lib/libICE.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/X11R6/lib/libICE.so.6 Reading symbols from /usr/X11R6/lib/libSM.so.6...(no debugging symbols found)...done. Loaded symbols for /usr/X11R6/lib/libSM.so.6 Reading symbols from /usr/X11R6/lib/libXext.so.6...(no debugging symbols found)...done. Loaded symbols for /usr/X11R6/lib/libXext.so.6 Reading symbols from /usr/X11R6/lib/libX11.so.6...(no debugging symbols found)...done. Loaded symbols for /usr/X11R6/lib/libX11.so.6 Reading symbols from /usr/X11R6/lib/libXrender.so.1...(no debugging symbols found)...done. Loaded symbols for /usr/X11R6/lib/libXrender.so.1 Reading symbols from /usr/X11R6/lib/libXft.so.1...(no debugging symbols found)...done. Loaded symbols for /usr/X11R6/lib/libXft.so.1 Reading symbols from /usr/local/lib/libfreetype.so.9...(no debugging symbols found)...done. Loaded symbols for /usr/local/lib/libfreetype.so.9 Reading symbols from /usr/lib/libstdc++.so.4...(no debugging symbols found)...done. Loaded symbols for /usr/lib/libstdc++.so.4 Reading symbols from /usr/lib/libm.so.2...(no debugging symbols found)...done. Loaded symbols for /usr/lib/libm.so.2 Reading symbols from /usr/lib/libc_r.so.5...(no debugging symbols found)...done. Loaded symbols for /usr/lib/libc_r.so.5 Reading symbols from /usr/lib/libc.so.5...(no debugging symbols found)...done. Loaded symbols for /usr/lib/libc.so.5 Reading symbols from /usr/local/lib/libmng.so.1...(no debugging symbols found)...done. Loaded symbols for /usr/local/lib/libmng.so.1 Reading symbols from /usr/local/lib/libjpeg.so.9...(no debugging symbols found)...done. Loaded symbols for /usr/local/lib/libjpeg.so.9 Reading symbols from /usr/local/lib/libpng.so.5...(no debugging symbols found)...done. Loaded symbols for /usr/local/lib/libpng.so.5 Reading symbols from /usr/X11R6/lib/libXThrStub.so.6...(no debugging symbols found)...done. Loaded symbols for /usr/X11R6/lib/libXThrStub.so.6 Reading symbols from /usr/local/lib/liblcms.so.1...(no debugging symbols found)...done. 
Loaded symbols for /usr/local/lib/liblcms.so.1 Reading symbols from /usr/libexec/ld-elf.so.1...(no debugging symbols found)...done. Loaded symbols for /usr/libexec/ld-elf.so.1 #0 0x28099094 in reloc_non_plt () from /usr/libexec/ld-elf.so.1 (gdb) bt #0 0x28099094 in reloc_non_plt () from /usr/libexec/ld-elf.so.1 #1 0x28096a4e in find_symdef () from /usr/libexec/ld-elf.so.1 #2 0x28095602 in _rtld () from /usr/libexec/ld-elf.so.1 I'm building this on an extremely slow system, which took a better part of today to get this far, so I haven't rebuilt everything with -g yet. Is this a known problem? If not, I can attempt to rebuild with -g to get a full backtrace and symbols if needed. -- Kevin To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)
> On Apr 18, 2018, at 1:42 PM, John Baldwin wrote:
>>
>> The change made for it was
>>
>> Index: sys/x86/x86/nexus.c
>> ===
>> --- sys/x86/x86/nexus.c (revision 332663)
>> +++ sys/x86/x86/nexus.c (working copy)
>> @@ -698,7 +698,7 @@
>>  {
>>
>>  	if (rman_manage_region(&irq_rman, irq, irq) != 0)
>> -		panic("%s: failed", __func__);
>> +		panic("%s: failed irq is: %lu", __func__, irq);
>>  }
>
> Oh, this is a different issue. Sorry. As a hack, try changing
> 'FIRST_MSI_INT' to 512 in sys/amd64/include/intr_machdep.h. The issue
> is that some systems now include more than 256 interrupt pins on I/O
> APICs, so IRQ 256 is already reserved for use by one of those
> interrupt pins. The real fix is that I need to make FIRST_MSI_INT
> dynamic instead of a constant and just define it as the first free IRQ
> after the I/O APICs have probed.

I'm testing a very large AMD Epyc system, and I had to raise FIRST_MSI_INT to 768, but that fixed this issue for me.

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
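
For anyone following along, here is a rough sketch of what "first free IRQ after the I/O APICs have probed" could look like. It is not the actual FreeBSD change: struct ioapic_range and first_free_msi_irq() are hypothetical names made up for illustration, and it assumes each I/O APIC serves a contiguous block of IRQs starting at some base.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical description of one probed I/O APIC: the first IRQ it
 * serves and how many interrupt pins it has. This is an illustration
 * only, not a FreeBSD kernel structure.
 */
struct ioapic_range {
	unsigned int base_irq;
	unsigned int num_pins;
};

/*
 * Return the lowest IRQ number that lies above every I/O APIC pin.
 * MSI vectors could then be numbered from this value instead of from
 * a compile-time constant. With no I/O APICs the result is 0.
 */
unsigned int
first_free_msi_irq(const struct ioapic_range *apics, size_t count)
{
	unsigned int first = 0;
	size_t i;

	assert(apics != NULL || count == 0);
	for (i = 0; i < count; i++) {
		/* One past the last pin of this I/O APIC. */
		unsigned int end = apics[i].base_irq + apics[i].num_pins;

		if (end > first)
			first = end;
	}
	return (first);
}
```

With made-up ranges of {0, 24}, {24, 256} and {280, 256}, the function returns 536, so a constant of 256 or 512 would collide with an I/O APIC pin on such a layout while a computed value would not.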