Re: FreeBSD DNS Resolver Issues?

2007-04-24 Thread Ian Smith
On Mon, 23 Apr 2007, Howard Leadmon wrote:

 >  OK, now I am a bit stumped, so wanted to post here in hopes someone might
 > have an idea. First off, the FBSD machine in question is an x86 server running
 > 6.2-STABLE cvsupped a few weeks ago, so is fairly current.
 > 
 >  I use said machine to handle all of my eMail and things in general seem to
 > work great, though I have this one mystery.  
 > 
 >  If we try to send mail to [EMAIL PROTECTED] the mail will just sit in the
 > queue forever, until it's returned as a failure.  Talking with the admins at
 > wtplaw, they swear their configs are correct, and that it's something on our
 > side. Looking at the mailq, I see:
 > 
 > l3NEqolY01112428697 Mon Apr 23 10:52 <[EMAIL PROTECTED]>
 >  (Deferred: Name server: mail.wtplaw.com.: host name lookup
 > fa)
 >  <[EMAIL PROTECTED]>
 > 
 > So as it's quick and easy I used host and did a lookup:
 > 
 > $ host wtplaw.com
 > wtplaw.com has address 69.20.43.246
 > wtplaw.com mail is handled by 10 mail.wtplaw.com.
 > 
 > 
 > Then on mail.wtplaw.com:
 > 
 > $ host mail.wtplaw.com   
 > mail.wtplaw.com has address 65.111.69.228
 > mail.wtplaw.com has address 66.166.181.163
 > Host mail.wtplaw.com not found: 2(SERVFAIL)
 > ;; connection timed out; no servers could be reached

I'm getting the same results here, using dig rather than host (FWIW).

I'm also seeing inconsistent results re the listed NS for that domain:

===
smithi on paqi% dig wtplaw.com any

; <<>> DiG 9.3.4 <<>> wtplaw.com any
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15237
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;wtplaw.com.    IN  ANY

;; ANSWER SECTION:
wtplaw.com. 86363   IN  NS  ns1.airband.net.
wtplaw.com. 86363   IN  NS  ns2.airband.net.
wtplaw.com. 86363   IN  A   69.20.43.246

;; AUTHORITY SECTION:
wtplaw.com. 86363   IN  NS  ns2.airband.net.
wtplaw.com. 86363   IN  NS  ns1.airband.net.

;; ADDITIONAL SECTION:
ns1.airband.net. 5687   IN  A   216.138.97.246
ns2.airband.net. 5687   IN  A   216.138.119.6

;; Query time: 236 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Tue Apr 24 15:54:01 2007
;; MSG SIZE  rcvd: 123
===

but:

===
smithi on paqi% dig mail.wtplaw.com

; <<>> DiG 9.3.4 <<>> mail.wtplaw.com
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33923
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 2, ADDITIONAL: 0

;; QUESTION SECTION:
;mail.wtplaw.com.   IN  A

;; ANSWER SECTION:
mail.wtplaw.com. 3      IN  A   65.111.69.228
mail.wtplaw.com. 3      IN  A   66.166.181.163

;; AUTHORITY SECTION:
mail.wtplaw.com. 86399  IN  NS  lp1.wtplaw.com.
mail.wtplaw.com. 86399  IN  NS  lp2.wtplaw.com.

;; Query time: 466 msec
===

Note different NS, with the As for mail.wtplaw.com returned.  Further: 

===
smithi on paqi% dig wtplaw.com mx

; <<>> DiG 9.3.4 <<>> wtplaw.com mx
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21494
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;wtplaw.com.    IN  MX

;; ANSWER SECTION:
wtplaw.com. 86400   IN  MX  10 mail.wtplaw.com.

;; AUTHORITY SECTION:
wtplaw.com. 62547   IN  NS  ns1.airband.net.
wtplaw.com. 62547   IN  NS  ns2.airband.net.

;; ADDITIONAL SECTION:
ns1.airband.net. 5671   IN  A   216.138.97.246
ns2.airband.net. 5671   IN  A   216.138.119.6

;; Query time: 1021 msec
===

Here no A is returned for mail.wtplaw.com, and note the airband.net NS.
I'm pretty sure sendmail does an MX request, so that's what it'll get,
which explains your mailq response.

At (one set of) the listed NServers:

===
; <<>> DiG 9.3.4 <<>> @lp1.wtplaw.com. mail.wtplaw.com.
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24202
;; flags: qr aa; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;mail.wtplaw.com.   IN  A

;; ANSWER SECTION:
mail.wtplaw.com. 3      IN  A   66.166.181.163
mail.wtplaw.com. 3      IN  A   65.111.69.228

;; Query time: 268 msec
;; SERVER: 65.111.69.226#53(65.111.69.226)
;; WHEN: Tue Apr 24 15:57:00 2007
;; MSG SIZE  rcvd: 65
===

Note no A record provided for mail.wtplaw.com; same digging
@lp2.wtplaw.com. So trying the 'other' listed NServers above:

===
smithi on paqi% dig @ns1.airband.net. wtplaw.com. any

; <<>> DiG 9.3.4 <<>> @ns1.airband.net. wtplaw.com. any
; (1 server found)

question: +swap_pager_getswapspace(16): failed

2007-04-24 Thread Harald Schmalzbauer
Hello,

I have a little understanding problem:
My box has 128MB memory, far enough for the task.
After a few days I always see some processes dying because:

+swap_pager_getswapspace(2): failed
+pid 48211 (perl5.8.8), uid 58, was killed: out of swap space

Why won't for example the 21MB Buf get freed before more swap space gets 
requested than available (swap is very low, it's FlashDisk!)?
Is there a way to find out what process is swapped?

Thanks for any hints. My only way to circumvent this problem is to reboot the 
machine daily.

-Harry
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: FreeBSD DNS Resolver Issues?

2007-04-24 Thread Ian Smith
Sorry, following up on my own post with a correction and some further info:

On Tue, 24 Apr 2007, Ian Smith wrote:
[..]
 > At (one set of) the listed NServers:
 > 
 > ===
 > ; <<>> DiG 9.3.4 <<>> @lp1.wtplaw.com. mail.wtplaw.com.
 > ; (1 server found)
 > ;; global options:  printcmd
 > ;; Got answer:
 > ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24202
 > ;; flags: qr aa; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
 > 
 > ;; QUESTION SECTION:
 > ;mail.wtplaw.com.   IN  A
 > 
 > ;; ANSWER SECTION:
 > mail.wtplaw.com. 3      IN  A   66.166.181.163
 > mail.wtplaw.com. 3      IN  A   65.111.69.228
 > 
 > ;; Query time: 268 msec
 > ;; SERVER: 65.111.69.226#53(65.111.69.226)
 > ;; WHEN: Tue Apr 24 15:57:00 2007
 > ;; MSG SIZE  rcvd: 65
 > ===
 > 
 > Note no A record provided for mail.wtplaw.com; same digging
 > @lp2.wtplaw.com. So trying the 'other' listed NServers above:

That's wrong of course; it is returning two A RRs for mail.wtplaw.com.
However, a) they always show 3 (three!) seconds TTL on those records,
b) these two NS, lp1.wtplaw.com. and lp2.wtplaw.com., aren't shown as
authoritative, and c) they aren't even auth. / don't work for themselves!

===
smithi on paqi% dig @lp1.wtplaw.com. lp1.wtplaw.com.

; <<>> DiG 9.3.4 <<>> @lp1.wtplaw.com. lp1.wtplaw.com.
; (1 server found)
;; global options:  printcmd
;; connection timed out; no servers could be reached

smithi on paqi% dig @lp2.wtplaw.com. lp2.wtplaw.com.

; <<>> DiG 9.3.4 <<>> @lp2.wtplaw.com. lp2.wtplaw.com.
; (1 server found)
;; global options:  printcmd
;; connection timed out; no servers could be reached
===
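
For anyone repeating these checks, the two things being read off each dig response are the `aa` bit in the flags line and the TTL column of the answer section. A minimal Python sketch of pulling those out of saved dig text follows. The function name `summarize_dig` and the trimmed sample are illustrative assumptions, and the parser only understands the plain text layout shown in this thread; it says nothing about which servers the parent zone lists as NS.

```python
import re

# Trimmed copy of the lp1.wtplaw.com response quoted above (illustrative sample).
SAMPLE = """\
;; flags: qr aa; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;mail.wtplaw.com.   IN  A

;; ANSWER SECTION:
mail.wtplaw.com. 3   IN  A   66.166.181.163
mail.wtplaw.com. 3   IN  A   65.111.69.228

;; Query time: 268 msec
"""


def summarize_dig(output):
    """Return the header flags and the smallest answer-section TTL."""
    flags = []
    ttls = []
    in_answer = False
    for raw in output.splitlines():
        line = raw.strip()
        match = re.match(r";; flags: ([a-z ]+);", line)
        if match:
            flags = match.group(1).split()
        elif line.startswith(";; ANSWER SECTION"):
            in_answer = True
        elif not line or line.startswith(";"):
            # A blank line or any other comment line ends the answer section.
            in_answer = False
        elif in_answer:
            fields = line.split()  # owner, TTL, class, type, rdata
            if len(fields) >= 5 and fields[1].isdigit():
                ttls.append(int(fields[1]))
    return {
        "flags": flags,
        "authoritative": "aa" in flags,
        "min_ttl": min(ttls) if ttls else None,
    }


if __name__ == "__main__":
    print(summarize_dig(SAMPLE))
```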

Cheers, Ian



Re: FreeBSD DNS Resolver Issues?

2007-04-24 Thread Ian Smith
On Tue, 24 Apr 2007, Mark Andrews wrote:

 >  It's a broken load balancer which is returning the
 >  parent's SOA record for AAAA queries for mail.wtplaw.com.
 >  Named correctly rejects this as a garbage response.
 > 
 >  It also appears to only respond to A/AAAA queries.
 > 
 >  As for Solaris' host it may/may not be making AAAA queries.

Ah, out of my depth again.  Thanks Mark.  Sorry for the noise then.

Cheers, Ian



Re: FreeBSD DNS Resolver Issues?

2007-04-24 Thread Mark Andrews

It's a broken load balancer which is returning the
parent's SOA record for AAAA queries for mail.wtplaw.com.
Named correctly rejects this as a garbage response.

It also appears to only respond to A/AAAA queries.

As for Solaris' host it may/may not be making AAAA queries.
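
To illustrate why a parent's SOA looks like garbage here: when a server is asked about a name in a zone delegated to it, an SOA in its negative answer would be expected to sit at or below the delegation point and at or above the name queried. The Python sketch below is a simplified illustration of that idea only, under the assumption that mail.wtplaw.com is the delegation point (as the NS records earlier in the thread suggest). The function names are hypothetical and this is not named's actual validation code.

```python
def labels(name):
    """Split a domain name into lowercase labels, ignoring the trailing dot."""
    return [label for label in name.lower().rstrip(".").split(".") if label]


def is_at_or_below(name, ancestor):
    """True if name equals ancestor or is a subdomain of it."""
    n, a = labels(name), labels(ancestor)
    return len(n) >= len(a) and n[len(n) - len(a):] == a


def soa_owner_plausible(zone_cut, qname, soa_owner):
    """Simplified check: the SOA owner lies between the zone cut and the qname."""
    return is_at_or_below(soa_owner, zone_cut) and is_at_or_below(qname, soa_owner)


if __name__ == "__main__":
    # Parent's SOA returned by a server for the delegated name: not plausible.
    print(soa_owner_plausible("mail.wtplaw.com.", "mail.wtplaw.com.", "wtplaw.com."))
    # SOA owned by the delegated zone itself: plausible.
    print(soa_owner_plausible("mail.wtplaw.com.", "mail.wtplaw.com.", "mail.wtplaw.com."))
```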

Mark
-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: [EMAIL PROTECTED]


Re: 6.2-STABLE deadlock?

2007-04-24 Thread LI Xin
Kostik Belousov wrote:
> On Mon, Apr 23, 2007 at 03:56:32AM +0100, Adrian Wontroba wrote:
>> On Tue, Mar 13, 2007 at 02:08:48PM +0000, Adrian Wontroba wrote:
>>> At work, amongst my stable of old computers running FreeBSD, I have a
>>> Fujitsu M800 - a 4-processor Xeon SMP machine with 4 GB of memory. This
>>> primarily runs Nagios and a small and lightly used MySQL database, along
>>> with a few inbound FTP transfers per minute. It has a Mylex card based
>>> disc subsystem, ruling out crash dumps.
>>>
>>> At some point during 5.5-STABLE this machine started to occasionally hang 
>>> ...
>> Another 6-STABLE (cvsupped on 27/03/07) example, with diagnostics taken
>> rather sooner after the hang.  Processes with wmesg=ufs feature often in
>> the ps output.
>>
>> http://www.stade.co.uk/crash1/
> 
> I would suspect the mlx controller. There are several processes (for instance,
> 988, 50918) waiting for completion of a block read, and the processes in the
> "ufs" state are the result of the lock cascade, IMHO.

I'm not sure this is specific to one disk controller.  I have had
occasional reports of similar hangs on amd64 6.2-RELEASE (a slightly
patched version) in which most processes were stuck in the 'ufs' state
under very light load; that box was equipped with amr(4) RAID.

I was not able to reproduce the problem in my lab, though, so it's still
unknown how to trigger the livelock :-(  I still need to investigate on
their production system.

Cheers,
-- 
Xin LI <[EMAIL PROTECTED]>  http://www.delphij.net/
FreeBSD - The Power to Serve!





[bsdcan-announce] BSDCan - less than four weeks! (fwd)

2007-04-24 Thread Robert Watson


Dear FreeBSD users and developers:

The BSD Canada Conference (BSDCan) is just a few weeks away -- May 18-19 in 
Ottawa, Canada.  This is a great opportunity to meet up with other FreeBSD 
developers and users, learn about exciting work taking place in FreeBSD, and 
it's also a chance to talk about your own work.  You'll hear talks on a broad 
range of FreeBSD-related topics, including Autofs, FreeBSD/PPC, FreeBSD SD/MMC 
support, FreeBSD security features, FreeNAS, network stack virtualization, 
PC-BSD, FreeBSD interrupt handling, ZFS, portsnap, FreeBSD clustering, IPv6 
security, the FreeBSD security officer, and much, much more.  It's also a 
great social event :-).


You can learn more about the conference at the conference website:

  http://www.bsdcan.org/

If you're not already considering attending (and registered), please consider 
doing so.  See you in Ottawa!


Robert N M Watson
Computer Laboratory
University of Cambridge

-- Forwarded message --
Date: Tue, 24 Apr 2007 06:58:00 -0400
From: Dan Langille <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: [bsdcan-announce] BSDCan - less than four weeks!

Gidday,

BSDCan 2007 is now less than four weeks away.  We have another strong
lineup of talks.  I hope you've finished your travel plans.  It is
not too late to book now.

New this year: lunches on SITE.  Yes.  Really.  Less money for you to
spend.  More time spent schmoozing. And for those staying in
residence: breakfast is included with your accommodation.   See
http://www.uottawa.ca/services/matmgmt/hospitality/food.html

As always, registration will start in the Royal Oak.  You can pick
your registration pack up between 3:30 and 7pm.  The Royal Oak is
very close to residence.  See http://tinyurl.com/jxelk

We have not picked a spot for mass gatherings on Friday and Saturday
night.  There are many to choose from.  As always, BSDCan is both a
social and a learning event.  :)  Sometimes the two are concurrent.

See you at BSDCan 2007!

--
Dan Langille
two conferences, one trip, great value: May 2007
BSDCan - The BSD Conference - http://www.bsdcan.org/
PGCon - The PostgreSQL Conference - http://www.pgcon.org/


___
bsdcan-announce mailing list
[EMAIL PROTECTED]
http://lists.bsdcan.org/mailman/listinfo/bsdcan-announce


Re: [kde-freebsd] problem hal - k3b ?

2007-04-24 Thread Zoran Kolic
> This problem appeared on my system after updating system and ports on
> April, 06.
> K3b hangs either after loading the splash screen or after ejecting
> written media from the device.

Aside from the new atapi-cam.c being proven to work, I'd like to know
whether the command line works. K3b needs cdrtools in the background. What
if you make an ISO file using mkisofs and burn it with cdrecord?

Zoran



dmesg errors

2007-04-24 Thread KAYVEN RIESE


my dmesg has errors i guess.  i am sorta new @ this so umm..


here is a dmesg output:

Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 6.1-RELEASE #0: Sun May  7 04:32:43 UTC 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) M processor 1.73GHz (600.02-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x6d8  Stepping = 8
  Features=0xafe9fbff
  Features2=0x180
  AMD Features=0x10
real memory  = 536084480 (511 MB)
avail memory = 515223552 (491 MB)
kbd1 at kbdmux0
acpi0:  on motherboard
ACPI-0356: *** Error: Region EmbeddedControl(3) has no handler
ACPI-1304: *** Error: Method execution failed [\\_SB_.PCI0.SBRG.EC0_.ACS_] (Node 0xc33998c0), AE_NOT_EXIST
ACPI-1304: *** Error: Method execution failed [\\_SB_.AC__._INI] (Node 0xc33993e0), AE_NOT_EXIST
ACPI-0356: *** Error: Region EmbeddedControl(3) has no handler
ACPI-1304: *** Error: Method execution failed [\\_SB_.PCI0.SBRG.EC0_.BATS] (Node 0xc33998a0), AE_NOT_EXIST
ACPI-1304: *** Error: Method execution failed [\\_SB_.BAT0._STA] (Node 0xc339d720), AE_NOT_EXIST
ACPI-0239: *** Error: Method execution failed [\\_SB_.BAT0._STA] (Node 0xc339d720), AE_NOT_EXIST
acpi0: Power Button (fixed)
acpi_ec0:  port 0x62,0x66 on acpi0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
cpu0:  on acpi0
acpi_throttle0:  on cpu0
pcib0:  port 0xcf8-0xcff on acpi0
pci0:  on pcib0
agp0:  mem 0xe000-0xefff at device 0.0 on pci0
pcib1:  at device 1.0 on pci0
pci1:  on pcib1
pci1:  at device 0.0 (no driver attached)
uhci0:  port 0xe800-0xe81f irq 11 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0:  on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1:  port 0xe880-0xe89f irq 5 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1:  on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2:  port 0xec00-0xec1f irq 10 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2:  on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
ehci0:  mem 0xffaffc00-0xffaf irq 10 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
usb3: EHCI version 1.0
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3:  on ehci0
usb3: USB revision 2.0
uhub3: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub3: 6 ports with 6 removable, self powered
pcib2:  at device 30.0 on pci0
pci2:  on pcib2
bge0:  mem 0xff9f-0xff9f irq 4 at device 0.0 on pci2
miibus0:  on bge0
brgphy0:  on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:11:d8:22:c9:91
cbb0:  at device 1.0 on pci2
cardbus0:  on cbb0
pccard0: <16-bit PCCard bus> on cbb0
cbb1:  at device 1.1 on pci2
cardbus1:  on cbb1
pccard1: <16-bit PCCard bus> on cbb1
fwohci0:  mem 0xff9ef800-0xff9e irq 10 at device 1.2 on pci2
fwohci0: OHCI version 1.0 (ROM=1)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 00:e0:18:00:03:26:4c:e9
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0:  on fwohci0
fwe0:  on firewire0
if_fwe0: Fake Ethernet address: 02:e0:18:26:4c:e9
fwe0: Ethernet address: 02:e0:18:26:4c:e9
fwe0: if_start running deferred for Giant
sbp0:  on firewire0
fwohci0: Initiate bus reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
firewire0: bus manager 0 (me)
pci2:  at device 2.0 (no driver attached)
isab0:  at device 31.0 on pci0
isa0:  on isab0
atapci0:  port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
ata0:  on atapci0
ata1:  on atapci0
pci0:  at device 31.5 (no driver attached)
pci0:  at device 31.6 (no driver attached)
acpi_lid0:  on acpi0
acpi_button0:  on acpi0
acpi_acad0:  on acpi0
battery0:  on acpi0
battery1:  on acpi0
acpi_button1:  on acpi0
acpi_tz0:  on acpi0
atkbdc0:  port 0x60,0x64 irq 1 on acpi0
atkbd0:  irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0:  irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model Generic PS/2 mouse, device ID 0
sio0: configured irq 3 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0 port 0x2f8-0x2ff irq 3 drq 1 flags 0x10 on acpi0
sio0: type 16550A
ppc0:  port 0x378-0x37f,0x778-0x77f irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppbus0:  on ppc0
plip0:  on ppbus0
lpt0:  on ppbus0
lpt0: Interrupt-driven port
ppi0:  on ppbus0
pmtimer0 on isa0
orm0:  at iomem 0xc000

Re: 6.2-STABLE deadlock?

2007-04-24 Thread Oleg Derevenetz
Quoting LI Xin <[EMAIL PROTECTED]>:

> Kostik Belousov wrote:
> > On Mon, Apr 23, 2007 at 03:56:32AM +0100, Adrian Wontroba wrote:
> >> On Tue, Mar 13, 2007 at 02:08:48PM +0000, Adrian Wontroba wrote:
> >>> At work, amongst my stable of old computers running FreeBSD, I have a
> >>> Fujitsu M800 - a 4-processor Xeon SMP machine with 4 GB of memory. This
> >>> primarily runs Nagios and a small and lightly used MySQL database, along
> >>> with a few inbound FTP transfers per minute. It has a Mylex card based
> >>> disc subsystem, ruling out crash dumps.
> >>>
> >>> At some point during 5.5-STABLE this machine started to occasionally hang ...
> >> Another 6-STABLE (cvsupped on 27/03/07) example, with diagnostics taken
> >> rather sooner after the hang.  Processes with wmesg=ufs feature often in
> >> the ps output.
> >>
> >> http://www.stade.co.uk/crash1/
> > 
> > I would suspect the mlx controller. There are several processes (for instance,
> > 988, 50918) waiting for completion of a block read, and the processes in the
> > "ufs" state are the result of the lock cascade, IMHO.
> 
> I'm not sure this is specific to one disk controller.  I have had
> occasional reports of similar hangs on amd64 6.2-RELEASE (a slightly
> patched version) in which most processes were stuck in the 'ufs' state
> under very light load; that box was equipped with amr(4) RAID.
> 
> I was not able to reproduce the problem in my lab, though, so it's still
> unknown how to trigger the livelock :-(  I still need to investigate on
> their production system.

I reported a similar issue for FreeBSD 6.2 in the audit trail for kern/104406:

http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=

and there should be a thread related to this. Briefly, I suspect that this is
related to the nullfs filesystems on my server: when I cvsupped to FreeBSD
6.2-STABLE with Daichi's unionfs-related patches and replaced the nullfs-mounted
filesystems with unionfs-mounted ones (that was done 10.03.07), the problem went
away (or seems to have, at least).


Re: [kde-freebsd] problem hal - k3b ?

2007-04-24 Thread ejc

As a data point, I was seeing the same problems, but reverting to
atapi-cam.c rev 1.42.2.2 works here too.

Eric


Re: Dell SAS5 Performance Issue

2007-04-24 Thread J. Martin Petersen

Matthew Jacob wrote:
>> Is there any news on the performance of this card?
>
> I personally have not been able to reproduce the problem. It seems to
> occur whether in Integrated Raid or not. It seems to be related to
> specific backplanes and drives. It's an important problem to solve, I
> agree.

We have a HP Proliant DL140 g3 that exhibits this (or a somewhat 
related) problem, to which we can give you remote access (including 
remote KVM) if that helps?


Cheers, Martin


Re: question: +swap_pager_getswapspace(16): failed

2007-04-24 Thread Peter Jeremy
On 2007-Apr-24 09:32:06 +0200, Harald Schmalzbauer <[EMAIL PROTECTED]> wrote:
>My box has 128MB memory, far enough for the task.

If you are regularly running out of space, then maybe not - at least
without tuning some parameters.

How much swap space do you actually have and what is your box trying
to do?  My firewall also has 128MB and it's only paged out 575 pages
in the last 9.7 days.

>After a few days I always see some processes dying because:
>
>+swap_pager_getswapspace(2): failed
>+pid 48211 (perl5.8.8), uid 58, was killed: out of swap space
>
>Why won't for example the 21MB Buf get freed before more swap space gets 
>requested than available (swap is very low, it's FlashDisk!)?

vfs.bufspace is inside a feedback loop that tries to keep it between
vfs.lobufspace and vfs.hibufspace - which are tuned based on memory
size by default (for 128MB RAM, hibufspace should be ~22MB).  You
could try setting kern.nbuf (in /boot/loader.conf) to reduce the buffer
space allocated (each buffer is 16KB).
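
As a rough worked example of that arithmetic, assuming only the 16KB-per-buffer figure given in this message, the Python sketch below converts a target amount of buffer space into a kern.nbuf value. The helper names and the 8MB target are illustrative assumptions, not a recommendation, and the kernel's real derivation of hibufspace from nbuf may differ in detail.

```python
BUF_SIZE_KB = 16  # per-buffer size, taken from the message above


def nbuf_for_target(target_mb, buf_size_kb=BUF_SIZE_KB):
    """Number of buffers whose combined size is at most target_mb megabytes."""
    return (target_mb * 1024) // buf_size_kb


def loader_conf_line(target_mb):
    """Format the tunable in the name="value" style used by /boot/loader.conf."""
    return 'kern.nbuf="%d"' % nbuf_for_target(target_mb)


if __name__ == "__main__":
    # The ~22MB default quoted above corresponds to roughly this many buffers.
    print(nbuf_for_target(22))
    # A hypothetical smaller target of 8MB.
    print(loader_conf_line(8))
```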

>Is there a way to find out what process is swapped?

The ps output will include 'W'.  Note that 'swapped' is a special
state and normally processes are just paged.

In top and ps, the difference between 'size' and 'res' reflects memory
space that the process has allocated to it but is not resident.
Unfortunately, this includes both text area (which is vnode backed)
and space that has never been touched (and therefore doesn't exist
anywhere) as well as swap space.
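
A small Python sketch of reading both indicators from ps output follows. It assumes output captured from something like `ps -axo pid,state,vsz,rss,comm`, with VSZ and RSS in kilobytes. The sample rows are made up for illustration, and as noted above the size minus resident figure is only an upper bound on swap use, since it also counts vnode-backed text and untouched pages.

```python
# Hypothetical sample in the layout of `ps -axo pid,state,vsz,rss,comm`.
SAMPLE = """\
  PID STAT    VSZ   RSS COMMAND
  412 Is     3140   812 sshd
  655 IWs    1480     0 cron
48211 S     21244  9800 perl5.8.8
"""


def parse_ps(text):
    """Parse ps rows and flag swapped-out processes ('W' in the state column)."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        if not line.strip():
            continue
        pid, state, vsz, rss, comm = line.split(None, 4)
        rows.append({
            "pid": int(pid),
            "comm": comm,
            "swapped": "W" in state,
            # Allocated but not resident: text, untouched pages and swap combined.
            "non_resident_kb": int(vsz) - int(rss),
        })
    return rows


if __name__ == "__main__":
    for row in parse_ps(SAMPLE):
        print(row)
```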

Offhand, I don't know of any tool to report the swap utilisation by
process on FreeBSD.  (Though I have written such a tool for Tru64).

-- 
Peter Jeremy




RE: 6.2-STABLE deadlock?

2007-04-24 Thread Jan Mikkelsen
LI Xin wrote:
> Kostik Belousov wrote:
> > On Mon, Apr 23, 2007 at 03:56:32AM +0100, Adrian Wontroba wrote:
> >> On Tue, Mar 13, 2007 at 02:08:48PM +0000, Adrian Wontroba wrote:
> >>> At work, amongst my stable of old computers running FreeBSD, I have a
> >>> Fujitsu M800 - a 4-processor Xeon SMP machine with 4 GB of memory. This
> >>> primarily runs Nagios and a small and lightly used MySQL database, along
> >>> with a few inbound FTP transfers per minute. It has a Mylex card based
> >>> disc subsystem, ruling out crash dumps.
> >>>
> >>> At some point during 5.5-STABLE this machine started to occasionally hang ...
> >> Another 6-STABLE (cvsupped on 27/03/07) example, with diagnostics taken
> >> rather sooner after the hang.  Processes with wmesg=ufs feature often in
> >> the ps output.
> >>
> >> http://www.stade.co.uk/crash1/
> > 
> > I would suspect the mlx controller. There are several processes (for instance,
> > 988, 50918) waiting for completion of a block read, and the processes in the
> > "ufs" state are the result of the lock cascade, IMHO.
> 
> I'm not sure this is specific to one disk controller.  I have had
> occasional reports of similar hangs on amd64 6.2-RELEASE (a slightly
> patched version) in which most processes were stuck in the 'ufs' state
> under very light load; that box was equipped with amr(4) RAID.
> 
> I was not able to reproduce the problem in my lab, though, so it's still
> unknown how to trigger the livelock :-(  I still need to investigate on
> their production system.

I have seen something similar once, on a machine with an Areca (arcmsr)
controller, running 6.2-RELEASE (with unionfs patches).  Processes stuck in
"ufs", and the machine needed physical intervention to reboot.  I haven't
seen it since.  From memory, it happened during startup of the applications
and jails on the machine.

Jan.



RE: 6.2-STABLE deadlock?

2007-04-24 Thread Jan Mikkelsen
Oleg Derevenetz wrote:
> [ ... ]
> I reported a similar issue for FreeBSD 6.2 in the audit trail for kern/104406:
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=
> 
> and there should be a thread related to this. Briefly, I suspect that this is
> related to the nullfs filesystems on my server: when I cvsupped to FreeBSD
> 6.2-STABLE with Daichi's unionfs-related patches and replaced the nullfs-mounted
> filesystems with unionfs-mounted ones (that was done 10.03.07), the problem went
> away (or seems to have, at least).

Interesting.  In the instance I saw, there were also nullfs filesystems
mounted.

Regards,

Jan.



Re: 6.2-STABLE deadlock?

2007-04-24 Thread Eugene Grosbein
Kostik Belousov wrote:

> I would suspect the mlx controller. There are several processes (for instance,
> 988, 50918) waiting for completion of a block read, and the processes in the
> "ufs" state are the result of the lock cascade, IMHO.

It may be that the controller is not at fault.

You can easily reproduce a lock in the "ufs" state with the commands from
the "How-To-Repeat" section of:
http://www.FreeBSD.org/cgi/query-pr.cgi?pr=kern/107439

The PR is closed but the problem still exists in recent 6.2-STABLE.
GENERIC has the problem too; GENERIC+INVARIANTS panics at once
instead of producing locked processes.

Eugene Grosbein.


Re: 6.2-STABLE deadlock?

2007-04-24 Thread LI Xin
Hi, Oleg,

Oleg Derevenetz wrote:
> Quoting LI Xin <[EMAIL PROTECTED]>:
[...]
>> I'm not sure this is specific to one disk controller.  I have had
>> occasional reports of similar hangs on amd64 6.2-RELEASE (a slightly
>> patched version) in which most processes were stuck in the 'ufs' state
>> under very light load; that box was equipped with amr(4) RAID.
>>
>> I was not able to reproduce the problem in my lab, though, so it's still
>> unknown how to trigger the livelock :-(  I still need to investigate on
>> their production system.
> 
> I reported a similar issue for FreeBSD 6.2 in the audit trail for kern/104406:
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=
> 
> and there should be a thread related to this. Briefly, I suspect that this is
> related to the nullfs filesystems on my server: when I cvsupped to FreeBSD
> 6.2-STABLE with Daichi's unionfs-related patches and replaced the nullfs-mounted
> filesystems with unionfs-mounted ones (that was done 10.03.07), the problem went
> away (or seems to have, at least).

Hmm...  These seem to be different issues.  The problem reported to me was
on a pgsql server (no nullfs/unionfs involved), and the hang always happens
when it is not heavily loaded (usually in the morning, for instance;
there is no special configuration such as scheduled tasks that could
generate disk load, only the entropy harvesting), so this is quite
confusing.

Cheers,
-- 
Xin LI <[EMAIL PROTECTED]>  http://www.delphij.net/
FreeBSD - The Power to Serve!





Re: 6.2-STABLE deadlock?

2007-04-24 Thread Kris Kennaway
On Wed, Apr 25, 2007 at 11:53:32AM +1000, Jan Mikkelsen wrote:
> LI Xin wrote:
> > Kostik Belousov wrote:
> > > On Mon, Apr 23, 2007 at 03:56:32AM +0100, Adrian Wontroba wrote:
> > >> On Tue, Mar 13, 2007 at 02:08:48PM +0000, Adrian Wontroba wrote:
> > >>> At work, amongst my stable of old computers running FreeBSD, I have a
> > >>> Fujitsu M800 - a 4-processor Xeon SMP machine with 4 GB of memory. This
> > >>> primarily runs Nagios and a small and lightly used MySQL database, along
> > >>> with a few inbound FTP transfers per minute. It has a Mylex card based
> > >>> disc subsystem, ruling out crash dumps.
> > >>>
> > >>> At some point during 5.5-STABLE this machine started to occasionally hang ...
> > >> Another 6-STABLE (cvsupped on 27/03/07) example, with diagnostics taken
> > >> rather sooner after the hang.  Processes with wmesg=ufs feature often in
> > >> the ps output.
> > >>
> > >> http://www.stade.co.uk/crash1/
> > > 
> > > I would suspect the mlx controller. There are several processes (for instance,
> > > 988, 50918) waiting for completion of a block read, and the processes in the
> > > "ufs" state are the result of the lock cascade, IMHO.
> > 
> > I'm not sure this is specific to one disk controller.  I have had
> > occasional reports of similar hangs on amd64 6.2-RELEASE (a slightly
> > patched version) in which most processes were stuck in the 'ufs' state
> > under very light load; that box was equipped with amr(4) RAID.
> > 
> > I was not able to reproduce the problem in my lab, though, so it's still
> > unknown how to trigger the livelock :-(  I still need to investigate on
> > their production system.
> 
> I have seen something similar once, on a machine with an Areca (arcmsr)
> controller, running 6.2-RELEASE (with unionfs patches).  Processes stuck in
> "ufs", and the machine needed physical intervention to reboot.  I haven't
> seen it since.  From memory, it happened during startup of the applications
> and jails on the machine.

Sounds like one of the known unionfs bugs.

Kris


How to report bugs (Re: 6.2-STABLE deadlock?)

2007-04-24 Thread Kris Kennaway
On Wed, Apr 25, 2007 at 10:53:08AM +0800, LI Xin wrote:
> Hi, Oleg,
> 
> Oleg Derevenetz wrote:
> > Quoting LI Xin <[EMAIL PROTECTED]>:
> [...]
> >> I'm not sure this is specific to one disk controller.  I have had
> >> occasional reports of similar hangs on amd64 6.2-RELEASE (a slightly
> >> patched version) in which most processes were stuck in the 'ufs' state
> >> under very light load; that box was equipped with amr(4) RAID.
> >>
> >> I was not able to reproduce the problem in my lab, though, so it's still
> >> unknown how to trigger the livelock :-(  I still need to investigate on
> >> their production system.
> > 
> > I reported a similar issue for FreeBSD 6.2 in the audit trail for kern/104406:
> > 
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=
> > 
> > and there should be a thread related to this. Briefly, I suspect that this is
> > related to the nullfs filesystems on my server: when I cvsupped to FreeBSD
> > 6.2-STABLE with Daichi's unionfs-related patches and replaced the nullfs-mounted
> > filesystems with unionfs-mounted ones (that was done 10.03.07), the problem went
> > away (or seems to have, at least).
> 
> Hmm...  These seem to be different issues.  The problem reported to me was
> on a pgsql server (no nullfs/unionfs involved), and the hang always happens
> when it is not heavily loaded (usually in the morning, for instance;
> there is no special configuration such as scheduled tasks that could
> generate disk load, only the entropy harvesting), so this is quite
> confusing.

Yes, a large part of the confusion is the unfortunate tendency of
people to do the following:

 my system hangs/panics/etc
 my system hangs/panics/etc too; it must be the same problem!

What we really need is for every FreeBSD user who encounters a
hang/panic/etc to avoid jumping to conclusions -- no matter how many
superficial similarities there may seem to be -- and instead go
through the relevant steps described here:

  http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html

Until you (or a developer) have analyzed the resulting information,
you cannot definitively determine whether or not your problem is the
same as a given random other problem, and you may just confuse the
issue by making claims of similarity when you are really reporting a
completely separate problem.

Thanks,
Kris
