Re: FreeBSD DNS Resolver Issues?
On Mon, 23 Apr 2007, Howard Leadmon wrote:
> OK, now I am a bit stumped, so wanted to post here in hopes someone might
> have an idea. First off the FBSD machine in question is an x86 server running
> 6.2-STABLE from a supped from a few weeks ago, so is fairly current.
>
> I use said machine to handle all of my eMail and things in general seem to
> work great, though I have this one mystery.
>
> I we try and send mail to [EMAIL PROTECTED] the mail will just set in the
> queue forever, until it's returned as a failure. Talking with the admins at
> wtplaw they are swearing their configs are correct, and it's something on our
> side. Looking at the mailq, I see:
>
> l3NEqolY01112428697 Mon Apr 23 10:52 <[EMAIL PROTECTED]>
> (Deferred: Name server: mail.wtplaw.com.: host name lookup
> fa)
> <[EMAIL PROTECTED]>
>
> So as it's quick an easy I used dig and did a lookup:
>
> $ host wtplaw.com
> wtplaw.com has address 69.20.43.246
> wtplaw.com mail is handled by 10 mail.wtplaw.com.
>
>
> Then on mail.wtplaw.com:
>
> $ host mail.wtplaw.com
> mail.wtplaw.com has address 65.111.69.228
> mail.wtplaw.com has address 66.166.181.163
> Host mail.wtplaw.com not found: 2(SERVFAIL)
> ;; connection timed out; no servers could be reached

I'm getting the same results here, using dig rather than host (FWIW).
I'm also seeing inconsistent results re the listed NS for that domain:

===
smithi on paqi% dig wtplaw.com any

; <<>> DiG 9.3.4 <<>> wtplaw.com any
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15237
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;wtplaw.com.            IN      ANY

;; ANSWER SECTION:
wtplaw.com.             86363   IN      NS      ns1.airband.net.
wtplaw.com.             86363   IN      NS      ns2.airband.net.
wtplaw.com.             86363   IN      A       69.20.43.246

;; AUTHORITY SECTION:
wtplaw.com.             86363   IN      NS      ns2.airband.net.
wtplaw.com.             86363   IN      NS      ns1.airband.net.

;; ADDITIONAL SECTION:
ns1.airband.net.        5687    IN      A       216.138.97.246
ns2.airband.net.        5687    IN      A       216.138.119.6

;; Query time: 236 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Tue Apr 24 15:54:01 2007
;; MSG SIZE rcvd: 123
===

but:

===
smithi on paqi% dig mail.wtplaw.com

; <<>> DiG 9.3.4 <<>> mail.wtplaw.com
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33923
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 2, ADDITIONAL: 0

;; QUESTION SECTION:
;mail.wtplaw.com.       IN      A

;; ANSWER SECTION:
mail.wtplaw.com.        3       IN      A       65.111.69.228
mail.wtplaw.com.        3       IN      A       66.166.181.163

;; AUTHORITY SECTION:
mail.wtplaw.com.        86399   IN      NS      lp1.wtplaw.com.
mail.wtplaw.com.        86399   IN      NS      lp2.wtplaw.com.

;; Query time: 466 msec
===

Note different NS, with the As for mail.wtplaw.com returned. Further:

===
smithi on paqi% dig wtplaw.com mx

; <<>> DiG 9.3.4 <<>> wtplaw.com mx
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21494
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;wtplaw.com.            IN      MX

;; ANSWER SECTION:
wtplaw.com.             86400   IN      MX      10 mail.wtplaw.com.

;; AUTHORITY SECTION:
wtplaw.com.             62547   IN      NS      ns1.airband.net.
wtplaw.com.             62547   IN      NS      ns2.airband.net.

;; ADDITIONAL SECTION:
ns1.airband.net.        5671    IN      A       216.138.97.246
ns2.airband.net.        5671    IN      A       216.138.119.6

;; Query time: 1021 msec
===

Here no A is returned for mail.wtplaw.com, and note the airband.net NS.
Pretty sure sendmail does an MX request, so that's what it'll get then,
which explains your mailq response.

At (one set of) the listed NServers:

===
; <<>> DiG 9.3.4 <<>> @lp1.wtplaw.com. mail.wtplaw.com.
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24202
;; flags: qr aa; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;mail.wtplaw.com.       IN      A

;; ANSWER SECTION:
mail.wtplaw.com.        3       IN      A       66.166.181.163
mail.wtplaw.com.        3       IN      A       65.111.69.228

;; Query time: 268 msec
;; SERVER: 65.111.69.226#53(65.111.69.226)
;; WHEN: Tue Apr 24 15:57:00 2007
;; MSG SIZE rcvd: 65
===

Note no A record provided for mail.wtplaw.com; same digging
@lp2.wtplaw.com. So trying the 'other' listed NServers above:

===
smithi on paqi% dig @ns1.airband.net. wtplaw.com. any

; <<>> DiG 9.3.4 <<>> @ns1.airband.net. wtplaw.com. any
; (1 server found)
question: +swap_pager_getswapspace(16): failed
Hello,

I have a little understanding problem:

My box has 128MB memory, far enough for the task. After a few days I
always see some processes dying because:

+swap_pager_getswapspace(2): failed
+pid 48211 (perl5.8.8), uid 58, was killed: out of swap space

Why won't for example the 21MB Buf get freed before more swap space gets
requested than available (swap is very low, it's FlashDisk!)?

Is there a way to find out what process is swapped?

Thanks for any hints. My only way to circumvent this problem is to
reboot the machine daily.

-Harry

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: FreeBSD DNS Resolver Issues?
Sorry, following up on my own post with a correction and some further info:

On Tue, 24 Apr 2007, Ian Smith wrote:
[..]
> At (one set of) the listed NServers:
>
> ===
> ; <<>> DiG 9.3.4 <<>> @lp1.wtplaw.com. mail.wtplaw.com.
> ; (1 server found)
> ;; global options: printcmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24202
> ;; flags: qr aa; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
>
> ;; QUESTION SECTION:
> ;mail.wtplaw.com.       IN      A
>
> ;; ANSWER SECTION:
> mail.wtplaw.com.        3       IN      A       66.166.181.163
> mail.wtplaw.com.        3       IN      A       65.111.69.228
>
> ;; Query time: 268 msec
> ;; SERVER: 65.111.69.226#53(65.111.69.226)
> ;; WHEN: Tue Apr 24 15:57:00 2007
> ;; MSG SIZE rcvd: 65
> ===
>
> Note no A record provided for mail.wtplaw.com; same digging
> @lp2.wtplaw.com. So trying the 'other' listed NServers above:

That's wrong of course; it is returning two A RRs for mail.wtplaw.com.
However:
a) they always show 3 (three!) seconds TTL on those records, and
b) these two NS, lp1.wtplaw.com. and lp2.wtplaw.com., aren't shown as
   authoritative, and
c) aren't even auth. / don't work for themselves!

===
smithi on paqi% dig @lp1.wtplaw.com. lp1.wtplaw.com.

; <<>> DiG 9.3.4 <<>> @lp1.wtplaw.com. lp1.wtplaw.com.
; (1 server found)
;; global options: printcmd
;; connection timed out; no servers could be reached

smithi on paqi% dig @lp2.wtplaw.com. lp2.wtplaw.com.

; <<>> DiG 9.3.4 <<>> @lp2.wtplaw.com. lp2.wtplaw.com.
; (1 server found)
;; global options: printcmd
;; connection timed out; no servers could be reached
===

Cheers, Ian
Re: FreeBSD DNS Resolver Issues?
On Tue, 24 Apr 2007, Mark Andrews wrote:
> It's a broken load balancer which is returning the
> parent's SOA record for queries for mail.wtplaw.com.
> Named correctly rejects this as a garbage response.
>
> It also appears to only responds to A/ queries.
>
> As for Solaris' host it may/may not be making queries.

Ah, out of my depth again. Thanks Mark. Sorry for the noise then.

Cheers, Ian
Re: FreeBSD DNS Resolver Issues?
It's a broken load balancer which is returning the parent's SOA record
for queries for mail.wtplaw.com. Named correctly rejects this as a
garbage response.

It also appears to only respond to A/ queries.

As for Solaris' host it may/may not be making queries.

Mark
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: [EMAIL PROTECTED]
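
One way to check a diagnosis like this from the command line is to ask the delegated server directly, without recursion, for several record types and compare what comes back. The sketch below is only an illustration: `probe_types` is a made-up helper name, the server and host names in the example are the ones quoted in this thread, and the timeout and retry values are arbitrary.

```shell
# probe_types SERVER NAME TYPE...
# Ask SERVER directly (no recursion) for each record TYPE of NAME and print
# only the header comments plus the answer and authority sections.
# probe_types is a hypothetical helper, not part of BIND.
probe_types() {
    server=$1
    name=$2
    shift 2
    for type in "$@"; do
        printf '== %s %s @%s\n' "$name" "$type" "$server"
        dig +norecurse +time=3 +tries=1 +noall +comments +answer +authority \
            "@$server" "$name" "$type"
    done
}

# Example (needs network access, so it is left commented out):
# probe_types lp1.wtplaw.com mail.wtplaw.com A MX SOA
```

If the A query returns addresses while the other types time out, or come back carrying an SOA for wtplaw.com rather than for mail.wtplaw.com, that would be consistent with the behaviour described above. It is not proof on its own, since a resolver in between may see something different.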
Re: 6.2-STABLE deadlock?
Kostik Belousov wrote:
> On Mon, Apr 23, 2007 at 03:56:32AM +0100, Adrian Wontroba wrote:
>> On Tue, Mar 13, 2007 at 02:08:48PM +, Adrian Wontroba wrote:
>>> At work, amoungst my stable of old computers running FreeBSD, I have a
>>> Fujitsu M800 - a 4 Zeon SMP processor with 4 GB of memory. This
>>> primarily runs Nagios and a small and lightly used MySQL database, along
>>> with a few inbound FTP transfers per minute. It has a Mylex card based
>>> disc subsystem, ruling out crash dumps.
>>>
>>> At some point during 5.5-STABLE this machine started to occasionally hang
>>> ...
>> Another 6-STABLE (cvsupped on 27/03/07) example, with diagnostics taken
>> rather sooner after the hang. Processes with wmesg=ufs feature often in
>> the ps output.
>>
>> http://www.stade.co.uk/crash1/
>
> I would suspect the mlx controller. There is several processes (for instance,
> 988, 50918) waiting for completion of block read, and processes in the "ufs"
> states are the result of the lock cascade, IMHO.

I'm not sure this is specific to one disk controller. I have had
occasional reports of similar hangs on amd64 6.2-RELEASE (a slightly
patched version) where most processes were stuck in the 'ufs' state
under very light load; that box had an amr(4) RAID controller.

I was not able to reproduce the problem in my lab, though, and it's
still unknown how to trigger the livelock :-( It still needs some
investigation on their production system.

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net/
FreeBSD - The Power to Serve!
[bsdcan-announce] BSDCan - less than four weeks! (fwd)
Dear FreeBSD users and developers:

The BSD Canada Conference (BSDCan) is just a few weeks away -- May 18-19
in Ottawa, Canada. This is a great opportunity to meet up with other
FreeBSD developers and users, learn about exciting work taking place in
FreeBSD, and it's also a chance to talk about your own work. You'll hear
talks on a broad range of FreeBSD-related topics, including Autofs,
FreeBSD/PPC, FreeBSD SD/MMC support, FreeBSD security features, FreeNAS,
network stack virtualization, PC-BSD, FreeBSD interrupt handling, ZFS,
portsnap, FreeBSD clustering, IPv6 security, the FreeBSD security
officer, and much, much more. It's also a great social event :-).

You can learn more about the conference at the conference website:

http://www.bsdcan.org/

If you're not already considering attending (and registered), please
consider doing so. See you in Ottawa!

Robert N M Watson
Computer Laboratory
University of Cambridge

-- Forwarded message --
Date: Tue, 24 Apr 2007 06:58:00 -0400
From: Dan Langille <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: [bsdcan-announce] BSDCan - less than four weeks!

Gidday,

BSDCan 2007 is now less than four weeks away. We have another strong
lineup of talks. I hope you've finished your travel plans. It is not
too late to book now.

New this year: lunches on SITE. Yes. Really. Less money for you to
spend. More time spent schmoozing. And for those staying in residence:
breakfast is included with your accommodation. See
http://www.uottawa.ca/services/matmgmt/hospitality/food.html

As always, registration will start in the Royal Oak. You can pick your
registration pack up between 3:30 and 7pm. The Royal Oak is very close
to residence. See http://tinyurl.com/jxelk

We have not picked a spot for mass gatherings on Friday and Saturday
night. There are many to choose from.

As always, BSDCan is both a social and a learning event. :) Sometimes
the two are concurrent.

See you at BSDCan 2007!

--
Dan Langille
two conferences, one trip, great value: May 2007
BSDCan - The BSD Conference - http://www.bsdcan.org/
PGCon - The PostgreSQL Conference - http://www.pgcon.org/
___
bsdcan-announce mailing list
[EMAIL PROTECTED]
http://lists.bsdcan.org/mailman/listinfo/bsdcan-announce
Re: [kde-freebsd] problem hal - k3b ?
> This problem appear in my system after updating system and ports on
> April, 06.
> K3b hangs either after loading splash screen or after eject wrote media
> from device.

Aside from the new atapi-cam.c being proven to work, I'd like to know
whether the command line works or not. K3b needs cdrtools in the
background. What happens if you make an ISO file using mkisofs and burn
it with cdrecord?

Zoran
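
The command-line test suggested here could look roughly like the sketch below. It is only an illustration: `run` and `burn_test` are made-up helper names, the device address `1,0,0` and the paths are placeholders, and the real device address has to be looked up first with `cdrecord -scanbus`.

```shell
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# burn_test SRC_DIR DEVICE [ISO_PATH]
# Build an ISO image (Joliet + Rock Ridge) from SRC_DIR, then burn it.
# burn_test is a hypothetical wrapper; DEVICE is whatever address
# `cdrecord -scanbus` reports for the burner on your system.
burn_test() {
    src=$1
    dev=$2
    iso=${3:-/tmp/test.iso}
    run mkisofs -J -R -o "$iso" "$src" &&
        run cdrecord -v dev="$dev" "$iso"
}

# Example with placeholder values, printing the commands only:
# DRY_RUN=1 burn_test "$HOME/docs" 1,0,0
```

If both steps work from the shell while K3b still hangs, the problem is more likely in K3b or HAL than in the ATAPI/CAM layer; if cdrecord hangs as well, the driver is the better suspect.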
mesg errors
My dmesg has errors, I guess. I am sort of new at this, so here is the
dmesg output:

Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 6.1-RELEASE #0: Sun May 7 04:32:43 UTC 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) M processor 1.73GHz (600.02-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x6d8 Stepping = 8
Features=0xafe9fbff
Features2=0x180
AMD Features=0x10
real memory = 536084480 (511 MB)
avail memory = 515223552 (491 MB)
kbd1 at kbdmux0
acpi0: on motherboard
ACPI-0356: *** Error: Region EmbeddedControl(3) has no handler
ACPI-1304: *** Error: Method execution failed [\\_SB_.PCI0.SBRG.EC0_.ACS_] (Node 0xc33998c0), AE_NOT_EXIST
ACPI-1304: *** Error: Method execution failed [\\_SB_.AC__._INI] (Node 0xc33993e0), AE_NOT_EXIST
ACPI-0356: *** Error: Region EmbeddedControl(3) has no handler
ACPI-1304: *** Error: Method execution failed [\\_SB_.PCI0.SBRG.EC0_.BATS] (Node 0xc33998a0), AE_NOT_EXIST
ACPI-1304: *** Error: Method execution failed [\\_SB_.BAT0._STA] (Node 0xc339d720), AE_NOT_EXIST
ACPI-0239: *** Error: Method execution failed [\\_SB_.BAT0._STA] (Node 0xc339d720), AE_NOT_EXIST
acpi0: Power Button (fixed)
acpi_ec0: port 0x62,0x66 on acpi0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
cpu0: on acpi0
acpi_throttle0: on cpu0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
agp0: mem 0xe000-0xefff at device 0.0 on pci0
pcib1: at device 1.0 on pci0
pci1: on pcib1
pci1: at device 0.0 (no driver attached)
uhci0: port 0xe800-0xe81f irq 11 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0xe880-0xe89f irq 5 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: port 0xec00-0xec1f irq 10 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
ehci0: mem 0xffaffc00-0xffaf irq 10 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
usb3: EHCI version 1.0
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3: on ehci0
usb3: USB revision 2.0
uhub3: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub3: 6 ports with 6 removable, self powered
pcib2: at device 30.0 on pci0
pci2: on pcib2
bge0: mem 0xff9f-0xff9f irq 4 at device 0.0 on pci2
miibus0: on bge0
brgphy0: on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:11:d8:22:c9:91
cbb0: at device 1.0 on pci2
cardbus0: on cbb0
pccard0: <16-bit PCCard bus> on cbb0
cbb1: at device 1.1 on pci2
cardbus1: on cbb1
pccard1: <16-bit PCCard bus> on cbb1
fwohci0: mem 0xff9ef800-0xff9e irq 10 at device 1.2 on pci2
fwohci0: OHCI version 1.0 (ROM=1)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 00:e0:18:00:03:26:4c:e9
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: on fwohci0
fwe0: on firewire0
if_fwe0: Fake Ethernet address: 02:e0:18:26:4c:e9
fwe0: Ethernet address: 02:e0:18:26:4c:e9
fwe0: if_start running deferred for Giant
sbp0: on firewire0
fwohci0: Initiate bus reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
firewire0: bus manager 0 (me)
pci2: at device 2.0 (no driver attached)
isab0: at device 31.0 on pci0
isa0: on isab0
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
ata0: on atapci0
ata1: on atapci0
pci0: at device 31.5 (no driver attached)
pci0: at device 31.6 (no driver attached)
acpi_lid0: on acpi0
acpi_button0: on acpi0
acpi_acad0: on acpi0
battery0: on acpi0
battery1: on acpi0
acpi_button1: on acpi0
acpi_tz0: on acpi0
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model Generic PS/2 mouse, device ID 0
sio0: configured irq 3 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0 port 0x2f8-0x2ff irq 3 drq 1 flags 0x10 on acpi0
sio0: type 16550A
ppc0: port 0x378-0x37f,0x778-0x77f irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppbus0: on ppc0
plip0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
pmtimer0 on isa0
orm0: at iomem 0xc000
Re: 6.2-STABLE deadlock?
Quoting LI Xin <[EMAIL PROTECTED]>:

> Kostik Belousov wrote:
> > On Mon, Apr 23, 2007 at 03:56:32AM +0100, Adrian Wontroba wrote:
> >> On Tue, Mar 13, 2007 at 02:08:48PM +, Adrian Wontroba wrote:
> >>> At work, amoungst my stable of old computers running FreeBSD, I have a
> >>> Fujitsu M800 - a 4 Zeon SMP processor with 4 GB of memory. This
> >>> primarily runs Nagios and a small and lightly used MySQL database, along
> >>> with a few inbound FTP transfers per minute. It has a Mylex card based
> >>> disc subsystem, ruling out crash dumps.
> >>>
> >>> At some point during 5.5-STABLE this machine started to occasionally hang ...
> >> Another 6-STABLE (cvsupped on 27/03/07) example, with diagnostics taken
> >> rather sooner after the hang. Processes with wmesg=ufs feature often in
> >> the ps output.
> >>
> >> http://www.stade.co.uk/crash1/
> >
> > I would suspect the mlx controller. There is several processes (for instance,
> > 988, 50918) waiting for completion of block read, and processes in the "ufs"
> > states are the result of the lock cascade, IMHO.
>
> I'm not very sure if this is specific to one disk controller. Actually
> I got some occasional reports about similar hangs on amd64 6.2-RELEASE
> (slightly patched version) that most of processes stuck in the 'ufs'
> state, under very light load, the box was equipped with amr(4) RAID.
>
> I was not able to reproduce the problem at my lab, though, it's still
> unknown that how to trigger the livelock :-( Still need some
> investigate on their production system.

I reported a similar issue for FreeBSD 6.2 in the audit trail for
kern/104406:

http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=

and there should be a thread related to this. Briefly, I suspect that
this is related to the nullfs filesystems on my server: when I cvsupped
to FreeBSD 6.2-STABLE with Daichi's unionfs-related patches and replaced
the nullfs-mounted filesystems with unionfs-mounted ones (that was done
10.03.07), the problem went away (or seems to have, at least).
Re: [kde-freebsd] problem hal - k3b ?
As a data point, I was seeing the same problems, but reverting to
atapi-cam.c rev 1.42.2.2 works here too.

Eric
Re: Dell SAS5 Performance Issue
Matthew Jacob wrote:

Is there any news on the performance of this card?

I personally have not been able to reproduce the problem. It seems to
occur whether in Integrated Raid or not. It seems to be related to
specific backplanes and drives. It's an important problem to solve I
agree.

We have a HP Proliant DL140 g3 that exhibits this (or a somewhat
related) problem, to which we can give you remote access (including
remote KVM) if that helps?

Cheers, Martin
Re: question: +swap_pager_getswapspace(16): failed
On 2007-Apr-24 09:32:06 +0200, Harald Schmalzbauer <[EMAIL PROTECTED]> wrote:
>My box has 128MB memory, far enough for the task.

If you are regularly running out of space, then maybe not - at least
without tuning some parameters. How much swap space do you actually
have and what is your box trying to do? My firewall also has 128MB and
it's only paged out 575 pages in the last 9.7 days.

>After a few days I always see some processes dying because:
>
>+swap_pager_getswapspace(2): failed
>+pid 48211 (perl5.8.8), uid 58, was killed: out of swap space
>
>Why won't for example the 21MB Buf get freed before more swap space gets
>requested than available (swap is very low, it's FlashDisk!)?

vfs.bufspace is inside a feedback loop that tries to keep it between
vfs.lobufspace and vfs.hibufspace - which are tuned based on memory
size by default (for 128MB RAM, hibufspace should be ~22MB). You could
try setting kern.nbuf (in /boot/loader.conf) to reduce the buffer space
allocated (each buffer is 16KB).

>Is there a way to find out what process is swapped?

The ps output will include 'W'. Note that 'swapped' is a special state
and normally processes are just paged. In top and ps, the difference
between 'size' and 'res' reflects memory space that the process has
allocated to it but is not resident. Unfortunately, this includes both
text area (which is vnode backed) and space that has never been touched
(and therefore doesn't exist anywhere) as well as swap space.

Offhand, I don't know of any tool to report the swap utilisation by
process on FreeBSD. (Though I have written such a tool for Tru64).

--
Peter Jeremy
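
The two suggestions above can be turned into a small sketch. Everything here is illustrative: `nbuf_for_mb` and `swapped_only` are made-up helper names, the 8 MB target is arbitrary, and the arithmetic assumes that buffer space scales as roughly kern.nbuf times 16 KB, which is a simplification of the figure quoted in the message.

```shell
# nbuf_for_mb MB: number of 16 KB buffers that add up to MB megabytes.
# Rough arithmetic only; the kernel applies its own limits on top.
nbuf_for_mb() {
    echo $(( $1 * 1024 / 16 ))
}

# swapped_only: filter `ps` output on stdin, keeping the header line and any
# process whose state column (second field) contains the 'W' flag.
swapped_only() {
    awk 'NR == 1 || $2 ~ /W/'
}

nbuf_for_mb 8    # prints 512

# On the FreeBSD box itself (left commented out here):
# swapinfo
# sysctl vfs.bufspace vfs.lobufspace vfs.hibufspace
# ps -axo pid,state,vsz,rss,command | swapped_only
# and, in /boot/loader.conf, something like:  kern.nbuf="512"
```

By the same arithmetic the ~22 MB default mentioned above corresponds to about 1408 buffers, so a value of 512 would cut the buffer space to a bit over a third. Whether that helps with the swap exhaustion depends on what is actually consuming memory.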
RE: 6.2-STABLE deadlock?
LI Xin wrote:
> Kostik Belousov wrote:
> > On Mon, Apr 23, 2007 at 03:56:32AM +0100, Adrian Wontroba wrote:
> >> On Tue, Mar 13, 2007 at 02:08:48PM +, Adrian Wontroba wrote:
> >>> At work, amoungst my stable of old computers running FreeBSD, I have a
> >>> Fujitsu M800 - a 4 Zeon SMP processor with 4 GB of memory. This
> >>> primarily runs Nagios and a small and lightly used MySQL database, along
> >>> with a few inbound FTP transfers per minute. It has a Mylex card based
> >>> disc subsystem, ruling out crash dumps.
> >>>
> >>> At some point during 5.5-STABLE this machine started to occasionally hang ...
> >> Another 6-STABLE (cvsupped on 27/03/07) example, with diagnostics taken
> >> rather sooner after the hang. Processes with wmesg=ufs feature often in
> >> the ps output.
> >>
> >> http://www.stade.co.uk/crash1/
> >
> > I would suspect the mlx controller. There is several processes (for instance,
> > 988, 50918) waiting for completion of block read, and processes in the "ufs"
> > states are the result of the lock cascade, IMHO.
>
> I'm not very sure if this is specific to one disk controller. Actually
> I got some occasional reports about similar hangs on amd64 6.2-RELEASE
> (slightly patched version) that most of processes stuck in the 'ufs'
> state, under very light load, the box was equipped with amr(4) RAID.
>
> I was not able to reproduce the problem at my lab, though, it's still
> unknown that how to trigger the livelock :-( Still need some
> investigate on their production system.

I have seen something similar once, on a machine with an Areca (arcmsr)
controller, running 6.2-RELEASE (with unionfs patches). Processes stuck
in "ufs", and the machine needed physical intervention to reboot. I
haven't seen it since. From memory, it happened during startup of the
applications and jails on the machine.

Jan.
RE: 6.2-STABLE deadlock?
Oleg Derevenetz wrote:
> [ ... ]
> I reported simular issue for FreeBSD 6.2 in audit-trail for
> kern/104406:
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=
>
> and there should be a thread related to this. Briefly, I suspects that
> this is related to nullfs filesystems on my server and when I cvsuped
> to FreeBSD 6.2-STABLE with Daichi's unionfs-related patches and replaced
> nullfs-mounted fs with unionfs-mounted (that was done 10.03.07) problem
> is gone (seems to be so, at least).

Interesting. In the instance I saw, there were also nullfs filesystems
mounted.

Regards,

Jan.
Re: 6.2-STABLE deadlock?
Kostik Belousov wrote:
> I would suspect the mlx controller. There is several processes (for instance,
> 988, 50918) waiting for completion of block read, and processes in the "ufs"
> states are the result of the lock cascade, IMHO.

It is possible that the controller is not at fault. You can easily
reproduce a lock in the "ufs" state with the commands from the
"How-To-Repeat" section of:

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=kern/107439

The PR is closed but the problem still exists in recent 6.2-STABLE.
GENERIC has the problem too; GENERIC+INVARIANTS panics at once instead
of producing locked processes.

Eugene Grosbein.
Re: 6.2-STABLE deadlock?
Hi, Oleg,

Oleg Derevenetz wrote:
> Quoting LI Xin <[EMAIL PROTECTED]>:
[...]
>> I'm not very sure if this is specific to one disk controller. Actually
>> I got some occasional reports about similar hangs on amd64 6.2-RELEASE
>> (slightly patched version) that most of processes stuck in the 'ufs'
>> state, under very light load, the box was equipped with amr(4) RAID.
>>
>> I was not able to reproduce the problem at my lab, though, it's still
>> unknown that how to trigger the livelock :-( Still need some
>> investigate on their production system.
>
> I reported simular issue for FreeBSD 6.2 in audit-trail for kern/104406:
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=
>
> and there should be a thread related to this. Briefly, I suspects that this is
> related to nullfs filesystems on my server and when I cvsuped to FreeBSD 6.2-
> STABLE with Daichi's unionfs-related patches and replaced nullfs-mounted fs
> with unionfs-mounted (that was done 10.03.07) problem is gone (seems to be so,
> at least).

Hmm... These seem to be different issues. The report I received was
about a pgsql server (no nullfs/unionfs involved), and the hang always
happens when it is not heavily loaded (usually in the morning, for
instance; there is no special configuration such as scheduled tasks
that could generate disk load, only the entropy harvesting), so this is
quite confusing.

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net/
FreeBSD - The Power to Serve!
Re: 6.2-STABLE deadlock?
On Wed, Apr 25, 2007 at 11:53:32AM +1000, Jan Mikkelsen wrote:
> LI Xin wrote:
> > Kostik Belousov wrote:
> > > On Mon, Apr 23, 2007 at 03:56:32AM +0100, Adrian Wontroba wrote:
> > >> On Tue, Mar 13, 2007 at 02:08:48PM +, Adrian Wontroba wrote:
> > >>> At work, amoungst my stable of old computers running FreeBSD, I have a
> > >>> Fujitsu M800 - a 4 Zeon SMP processor with 4 GB of memory. This
> > >>> primarily runs Nagios and a small and lightly used MySQL database, along
> > >>> with a few inbound FTP transfers per minute. It has a Mylex card based
> > >>> disc subsystem, ruling out crash dumps.
> > >>>
> > >>> At some point during 5.5-STABLE this machine started to occasionally hang ...
> > >> Another 6-STABLE (cvsupped on 27/03/07) example, with diagnostics taken
> > >> rather sooner after the hang. Processes with wmesg=ufs feature often in
> > >> the ps output.
> > >>
> > >> http://www.stade.co.uk/crash1/
> > >
> > > I would suspect the mlx controller. There is several processes (for instance,
> > > 988, 50918) waiting for completion of block read, and processes in the "ufs"
> > > states are the result of the lock cascade, IMHO.
> >
> > I'm not very sure if this is specific to one disk controller. Actually
> > I got some occasional reports about similar hangs on amd64 6.2-RELEASE
> > (slightly patched version) that most of processes stuck in the 'ufs'
> > state, under very light load, the box was equipped with amr(4) RAID.
> >
> > I was not able to reproduce the problem at my lab, though, it's still
> > unknown that how to trigger the livelock :-( Still need some
> > investigate on their production system.
>
> I have seen something similar once, on a machine with an Areca (arcmsr)
> controller, running 6.2-RELEASE (with unionfs patches). Processes stuck in
> "ufs", and the machine needed physical intervention to reboot. I haven't
> seen it since. From memory, it happened during startup of the applications
> and jails on the machine.

Sounds like one of the known unionfs bugs.

Kris
How to report bugs (Re: 6.2-STABLE deadlock?)
On Wed, Apr 25, 2007 at 10:53:08AM +0800, LI Xin wrote:
> Hi, Oleg,
>
> Oleg Derevenetz wrote:
> > Quoting LI Xin <[EMAIL PROTECTED]>:
> [...]
> >> I'm not very sure if this is specific to one disk controller. Actually
> >> I got some occasional reports about similar hangs on amd64 6.2-RELEASE
> >> (slightly patched version) that most of processes stuck in the 'ufs'
> >> state, under very light load, the box was equipped with amr(4) RAID.
> >>
> >> I was not able to reproduce the problem at my lab, though, it's still
> >> unknown that how to trigger the livelock :-( Still need some
> >> investigate on their production system.
> >
> > I reported simular issue for FreeBSD 6.2 in audit-trail for kern/104406:
> >
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=
> >
> > and there should be a thread related to this. Briefly, I suspects that this is
> > related to nullfs filesystems on my server and when I cvsuped to FreeBSD 6.2-
> > STABLE with Daichi's unionfs-related patches and replaced nullfs-mounted fs
> > with unionfs-mounted (that was done 10.03.07) problem is gone (seems to be so,
> > at least).
>
> Hmm... Seems to be different issues. The problem I have received was a
> pgsql server (no nullfs/unionfs involved), and the hang always happen
> when it is not being heavily loaded (usually in the morning, for
> instance, and there is no special configuration, like scheduled tasks
> which can generate disk load, etc., only the entropy harvesting), so
> this is quite confusing.

Yes, a large part of the confusion is the unfortunate tendency of people
to do the following:

  my system hangs/panics/etc
  my system hangs/panics/etc too; it must be the same problem!

What we really need is for every FreeBSD user who encounters a
hang/panic/etc to avoid jumping to conclusions -- no matter how many
superficial similarities there may seem to you -- and instead go through
the relevant steps described here:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html

Until you (or a developer) have analyzed the resulting information, you
cannot definitively determine whether or not your problem is the same as
a given random other problem, and you may just confuse the issue by
making claims of similarity when you are really reporting a completely
separate problem.

Thanks,
Kris
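
As a rough idea of what those steps involve on a 6.x system, the sketch below appends a set of debugging options to a copy of a kernel configuration file. None of this comes from the thread itself: `add_debug_options` and `MYDEBUG` are made-up names, and the option list is an assumption to be checked against the handbook chapter linked above, which remains the authority.

```shell
# add_debug_options FILE
# Append commonly used kernel debugging options to a copy of a kernel
# configuration file. The option list is an assumption, not taken from this
# thread; some options may already be present in the file you start from.
add_debug_options() {
    cat >> "$1" <<'EOF'
options         KDB
options         DDB
options         INVARIANTS
options         INVARIANT_SUPPORT
options         WITNESS
options         WITNESS_SKIPSPIN
options         DEBUG_LOCKS
options         DEBUG_VFS_LOCKS
options         DIAGNOSTIC
EOF
}

# Example: work on a copy, never on GENERIC itself (MYDEBUG is a made-up name).
# cp /usr/src/sys/i386/conf/GENERIC /usr/src/sys/i386/conf/MYDEBUG
# add_debug_options /usr/src/sys/i386/conf/MYDEBUG
```

After rebuilding and reproducing the hang, the output of debugger commands such as `ps`, `show lockedvnods` and `show alllocks` at the `db>` prompt is the kind of information that lets a developer tell two "ufs" hangs apart, rather than the symptoms alone.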