.MAKEFLAGS confuses buildworld

2006-07-14 Thread [LoN]Kamikaze
# make -j 5 buildworld

works fine on my Releng_6 system, but

# make buildworld

with

.MAKEFLAGS= -j 5

in my make.conf stops when buildworld arrives at the legacy target. According 
to the man page of make, it should be exactly the same.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: "swiN: clock sio" process taking 75% CPU

2006-07-14 Thread Gareth McCaughan
I wrote, inter alia,

> About 6 minutes after booting (on two occasions; I don't
> guarantee that this doesn't vary), a process that appears
> in the output of "ps" as "[swi4: clock sio]" begins to
> use about 3/4 of the machine's CPU. I think it does so
> more or less instantaneously. It continues to do so
> indefinitely, so far as I can tell.

David Wolfskill e-mailed me off-list to suggest looking at
the output of "vmstat -i". Answer: the interrupt rates all
appear to be normal, or at least similar to those he observes
on his machines which don't exhibit my problem. More specifically ...

-- excerpt from my reply to David begins --
I get this:

  | interrupt       total   rate
  | irq1: atkbd0        3      0
  | irq6: fdc0         10      0
  | irq14: ata0      2913      1
  | irq15: ata1        47      0
  | irq17: xl0       7342      4
  | cpu0: timer    302649    199
  | Total          312964    206

(so the rate of timer interrupts doesn't appear to be
insane)

and

  |  7:56PM  up 26 mins, 1 user, load averages: 1.87, 1.45, 1.08

(so the cost in CPU cycles of servicing them -- if that's what
the rogue process is doing, which seems somewhat plausible --
*does* appear to be insane).
-- excerpt from my reply to David ends --
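
As a quick sanity check on the table above, the per-source counts can be summed and compared with the Total line. The sketch below assumes the run-together columns read as total/rate pairs (for example 10 and 0 for fdc0, 302649 and 199 for cpu0: timer); the variable and function names are made up for illustration.

```sh
#!/bin/sh
# Rows of the quoted `vmstat -i` excerpt, without the header and Total lines.
# The split into total/rate columns is an assumption about the garbled paste.
vmstat_excerpt='irq1: atkbd0 3 0
irq6: fdc0 10 0
irq14: ata0 2913 1
irq15: ata1 47 0
irq17: xl0 7342 4
cpu0: timer 302649 199'

sum_totals() {
    # The last two fields of each row are the interrupt total and the rate.
    printf '%s\n' "$vmstat_excerpt" |
        awk '{ total += $(NF-1); rate += $NF } END { print total, rate }'
}

sum_totals
```

The totals add up to the reported 312964; the summed rates come to 204 rather than 206, presumably because each per-source rate is rounded down.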

-- 
g



Problem in src/lib/libpam/modules/pam_ssh/Makefile

2006-07-14 Thread Nenad Gavrilovic
The new src/lib/libpam/modules/pam_ssh/Makefile, revision 1.20.2.1, contains
changes that are not OK.

Because of that, the compile fails!!!


Re: Problem in src/lib/libpam/modules/pam_ssh/Makefile

2006-07-14 Thread Nenad Gavrilovic

Nenad Gavrilovic wrote:

> The new src/lib/libpam/modules/pam_ssh/Makefile, revision 1.20.2.1, contains
> changes that are not OK.
>
> Because of that, the compile fails!!!



Looking at other Makefiles, I think the fix is to change:

CFLAGS+= -I${SSHSRC} -include ssh_namespace.

to:

CFLAGS+= -I${SSHDIR} -include ssh_namespace.

bye


Re: Problem in src/lib/libpam/modules/pam_ssh/Makefile

2006-07-14 Thread David Wolfskill
[-current removed from recipient list, since rev. 1.20.2.1 of the file
in question is on the RELENG_6 branch, which is not CURRENT. -- dhw]

On Fri, Jul 14, 2006 at 03:27:14PM +0200, Nenad Gavrilovic wrote:
> ...
> Looking in other Makefiles I think that correction:
> 
> CFLAGS+= -I${SSHSRC} -include ssh_namespace.
> 
> to:
> 
> CFLAGS+= -I${SSHDIR} -include ssh_namespace.

That change allowed my buildworld to complete; my laptop is now running
the newly-built & -installed system:

g1-18(6.1-S)[1] uname -a
FreeBSD g1-18.catwhisker.org. 6.1-STABLE FreeBSD 6.1-STABLE #121: Fri Jul 14 08:56:01 PDT 2006 [EMAIL PROTECTED]:/common/S1/obj/usr/src/sys/LAPTOP_30W  i386
g1-18(6.1-S)[2] 

Peace,
david
-- 
David H. Wolfskill  [EMAIL PROTECTED]
Doing business with spammers only encourages them.  Please boycott spammers.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: Problem in src/lib/libpam/modules/pam_ssh/Makefile

2006-07-14 Thread Ruslan Ermilov
On Fri, Jul 14, 2006 at 09:27:15AM -0700, David Wolfskill wrote:
> [-current removed from recipient list, since rev. 1.20.2.1 of the file
> in question is on the RELENG_6 branch, which is not CURRENT. -- dhw]
> 
> On Fri, Jul 14, 2006 at 03:27:14PM +0200, Nenad Gavrilovic wrote:
> > ...
> > Looking in other Makefiles I think that correction:
> > 
> > CFLAGS+= -I${SSHSRC} -include ssh_namespace.
> > 
> > to:
> > 
> > CFLAGS+= -I${SSHDIR} -include ssh_namespace.
> 
> That change allowed my buildworld to complete; my laptop is now running
> the newly-built & -installed system:
> 
I've just committed a more complete fix for this.


Cheers,
-- 
Ruslan Ermilov
[EMAIL PROTECTED]
FreeBSD committer




Re: ATA problems again ...

2006-07-14 Thread Miroslav Lachman

Robert Watson wrote:
> I don't have a whole lot to add to this thread, but have changed the
> subject to make sure that the right people are reading this.  This is
> likely either a hardware problem (motherboard/cable/drive) or driver
> problem.  GEOM and the mirror driver seems to be behaving as desired (it
> detaches a drive reported by the driver as being bad).  Could you post
> the dmesg -v output for the probing of the ata controller and driver?


Same problem here: the first (ad4) or second (ad8) disk disappears from the
system about once a day, independent of disk or CPU load. Sometimes it
happens without any load; today, while I was stress testing the disks by
copying /usr/ports to another slice in a loop, after 3 hours I got:


Jul 14 19:05:45 track kernel: ad8: FAILURE - device detached
Jul 14 19:05:45 track kernel: subdisk8: detached
Jul 14 19:05:45 track kernel: ad8: detached
Jul 14 19:05:45 track kernel: GEOM_MIRROR: Device gm0: provider ad8 disconnected.
Jul 14 19:05:45 track kernel: g_vfs_done():mirror/gm0s1h[READ(offset=6345932800, length=65536)]error = 6

Jul 14 19:05:45 track kernel: vnode_pager_getpages: I/O read error
Jul 14 19:05:45 track kernel: vm_fault: pager read error, pid 5108 (cp)


After a reboot (using the reboot command), the system boots up with both
disks attached and starts autosynchronization. I do not know whether this is
a hardware or a software error. I have two identical machines with almost
the same software setup and exactly the same hardware setup, but these
errors occur on only one of them.



dmesg.boot before ad8 failure (rebuilding ad4 from previous failure):

Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 6.1-RELEASE #0: Sun May  7 04:42:56 UTC 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz (3000.12-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf43  Stepping = 3

Features=0xbfebfbff
  Features2=0x649d>
  AMD Features=0x2010
  Logical CPUs per core: 2
real memory  = 1073610752 (1023 MB)
avail memory = 1041489920 (993 MB)
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0: Changing APIC ID to 2
ioapic1: Changing APIC ID to 3
ioapic0  irqs 0-23 on motherboard
ioapic1  irqs 24-47 on motherboard
kbd1 at kbdmux0
acpi0:  on motherboard
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi0: Power Button (fixed)
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0:  on acpi0
acpi_throttle0:  on cpu0
cpu1:  on acpi0
pcib0:  port 0xcf8-0xcff on acpi0
pci0:  on pcib0
pcib1:  irq 16 at device 28.0 on pci0
pci1:  on pcib1
pcib2:  at device 0.0 on pci1
pci2:  on pcib2
pcib3:  irq 16 at device 28.4 on pci0
pci3:  on pcib3
bge0:  mem 
0xfc8f-0xfc8f irq 16 at device 0.0 on pci3

miibus0:  on bge0
brgphy0:  on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 
1000baseTX-FDX, auto

bge0: Ethernet address: 00:15:f2:ec:43:69
pcib4:  irq 17 at device 28.5 on pci0
pci4:  on pcib4
bge1:  mem 
0xfc9f-0xfc9f irq 17 at device 0.0 on pci4

miibus1:  on bge1
brgphy1:  on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 
1000baseTX-FDX, auto

bge1: Ethernet address: 00:15:f2:ec:43:6a
uhci0:  port 0xec00-0xec1f irq 23 at 
device 29.0 on pci0

uhci0: [GIANT-LOCKED]
usb0:  on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1:  port 0xe880-0xe89f irq 19 at 
device 29.1 on pci0

uhci1: [GIANT-LOCKED]
usb1:  on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
ehci0:  mem 
0xfebffc00-0xfebf irq 23 at device 29.7 on pci0

ehci0: [GIANT-LOCKED]
usb2: EHCI version 1.0
usb2: companion controllers, 2 ports each: usb0 usb1
usb2:  on ehci0
usb2: USB revision 2.0
uhub2: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub2: 4 ports with 4 removable, self powered
pcib5:  at device 30.0 on pci0
pci5:  on pcib5
pci5:  at device 2.0 (no driver attached)
isab0:  at device 31.0 on pci0
isa0:  on isab0
atapci0:  port 
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0

ata0:  on atapci0
ata1:  on atapci0
atapci1:  port 
0xe800-0xe807,0xe480-0xe483,0xe400-0xe407,0xe080-0xe083,0xe000-0xe00f 
mem 0xfebff800-0xfebffbff irq 19 at device 31.2 on pci0

ata2:  on atapci1
ata3:  on atapci1
ata4:  on atapci1
ata5:  on atapci1
pci0:  at device 31.3 (no driver attached)
acpi_button0:  on acpi0
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enable

Intel ICH7R RAID controller working on 6.1/STABLE?

2006-07-14 Thread H. Wade Minter
I'm considering ordering some of these servers to run FreeBSD 6.1 or 
6-STABLE on, but they have Intel ICH7R RAID controllers on them.  Googling 
around, I'm seeing conflicting information as to whether or not they work, or 
work well enough to use in production.


The server specs are here:

http://store.ebizpc.com/su5018.html

Can anyone confirm or deny its support under FreeBSD?

Thanks,
Wade


Re: Intel ICH7R RAID controller working on 6.1/STABLE?

2006-07-14 Thread Andras Got

Hi,

I found these:

http://www.freebsd.org/cgi/man.cgi?query=ata&apropos=0&sektion=0&manpath=FreeBSD+6.1-RELEASE&format=html
http://www.freebsd.org/cgi/man.cgi?query=ataraid&apropos=0&sektion=0&manpath=FreeBSD+6.1-RELEASE&format=html

These look promising, IMHO. Of course, real-world experience would be appreciated.

Andras


H. Wade Minter wrote:
> I'm considering ordering some of these servers to run FreeBSD 6.1 or
> 6-STABLE on, but they have Intel ICH7R RAID controllers on them.
> Googling around, I'm seeing conflicting information as to whether or
> not they work, or work well enough to use in production.
>
> The server specs are here:
>
> http://store.ebizpc.com/su5018.html
>
> Can anyone confirm or deny its support under FreeBSD?
>
> Thanks,
> Wade




Re: Intel ICH7R RAID controller working on 6.1/STABLE?

2006-07-14 Thread Mike Jakubik

H. Wade Minter wrote:
> I'm considering ordering some of these servers to run FreeBSD 6.1 or
> 6-STABLE on, but they have Intel ICH7R RAID controllers on them.
> Googling around, I'm seeing conflicting information as to whether or
> not they work, or work well enough to use in production.

The chipset is supported, but I wouldn't recommend onboard RAID for any 
production server. Get a real RAID controller, or use gmirror if you 
plan to mirror. I use several of these boards in production with gmirror.





Re: Intel ICH7R RAID controller working on 6.1/STABLE?

2006-07-14 Thread H. Wade Minter

On Fri, 14 Jul 2006, Mike Jakubik wrote:

> H. Wade Minter wrote:
> > I'm considering ordering some of these servers to run FreeBSD 6.1 or
> > 6-STABLE on, but they have Intel ICH7R RAID controllers on them.  Googling
> > around, I'm seeing conflicting information as to whether or not they work,
> > or work well enough to use in production.
>
> The chipset is supported, but I wouldn't recommend onboard RAID for any
> production server. Get a real RAID controller, or use gmirror if you plan to
> mirror. I use several of these boards in production with gmirror.


So if I run the disks in non-RAID mode, and/or use software RAID, they 
should work?


--Wade


Re: 6.1 quota issues

2006-07-14 Thread Charles Sprickman

On Mon, 10 Jul 2006, Kostik Belousov wrote:

> On Mon, Jul 10, 2006 at 01:39:01AM -0400, Charles Sprickman wrote:
> > On Mon, 10 Jul 2006, Kostik Belousov wrote:
> > > On Mon, Jul 10, 2006 at 12:41:07AM -0400, Charles Sprickman wrote:
> > > > On Sat, 8 Jul 2006, Matthew D. Fuller wrote:
> > > > > On Fri, Jul 07, 2006 at 10:56:47PM -0400 I heard the voice of
> > > > > Charles Sprickman, and lo! it spake thus:
> > > > > > Trying again, it reported the same inconsistencies then sat there
> > > > > > for more than an hour taking up all the available CPU on the box
> > > > > > until I killed it.  The mtime on quota.user had not changed during
> > > > > > the run.
> > > > >
> > > > > FWIW, I saw this on a box I setup running a late November -CURRENT
> > > > > last year; I could never get the quotas setup and running right
> > > > > because the check always just looped itself up.  The partition they're
> > > > > on has about 3 gig used out of ~45, with maybe a dozen users.  I never
> > > > > spent much time on it, since it's just a personal box, and the quotas
> > > > > are mostly just to provide a handy measure of who's using what (no
> > > > > limits set).  I just gave it up and decided to worry about it later.
> > > >
> > > > What should I do here?  It's consistently failing.  What information
> > > > should I gather to formulate a PR that won't burden the assignee with
> > > > lots of troubleshooting mess?  The machine is not in production, but there
> > > > is user data on it.  I could allow a trusted developer access to it, or
> > > > even create another jail to illustrate the problem.
> > >
> > > It is not clear from your report whether you ran fsck on the problem
> > > partition. I think (and my view is backed by the "unexpected inconsistencies"
> > > message) that this is a must.
> >
> > Sorry about that, I did not mention it, but thinking the same thing you
> > did, I unmounted the partition and fsck'd twice for good measure.  Both
> > runs came back clean.  I think it's quotacheck complaining about the
> > quota.user file...
>
> Ok, please, show me uname -a, dmesg, /etc/fstab, mount -v.


Rather than clutter the thread with all that, I'll link it up:

http://www.bway.net/~spork/quota-info.html

There's also a link to the bzipped quota.user file there, as I'm fairly 
certain that holds some secrets.  Are there any utilities to poke around 
that file with?


Thanks,

Charles



Re: Problem in src/lib/libpam/modules/pam_ssh/Makefile

2006-07-14 Thread Dag-Erling Smørgrav
Ruslan Ermilov <[EMAIL PROTECTED]> writes:
> I've just committed a more complete fix for this.

Thanks.  I fat-fingered 'ncvs diff -r1' and thought what I committed
was identical to what was in HEAD.  Anyone got a towel to wipe the egg
off my face?

DES
-- 
Dag-Erling Smørgrav - [EMAIL PROTECTED]


Re: Intel ICH7R RAID controller working on 6.1/STABLE?

2006-07-14 Thread Javier Henderson
* Mike Jakubik <[EMAIL PROTECTED]> [060714 17:15]:
> H. Wade Minter wrote:
> >I'm considering ordering some of these servers to run FreeBSD 6.1 or 
> >6-STABLE on, but they have Intel ICH7R RAID controllers on them.  
> >Googling around, I'm seeing conflicting information as to whether or 
> >not they work, or work well enough to use in production.
> 
> The chipset is supported, but I wouldn't recommend onboard RAID for any 
> production server. Get a real RAID controller, or use gmirror if you 
> plan to mirror. I use several of these boards in production with gmirror.

Why do you recommend against on-board RAID controllers?

-jav


Re: Intel ICH7R RAID controller working on 6.1/STABLE?

2006-07-14 Thread Javier Henderson
* H. Wade Minter <[EMAIL PROTECTED]> [060714 17:01]:
> I'm considering ordering some of these servers to run FreeBSD 6.1 or 
> 6-STABLE on, but they have Intel ICH7R RAID controllers on them.  Googling 
> around, I'm seeing conflicting information as to whether or not they work, or 
> work well enough to use in production.
> 
> The server specs are here:
> 
> http://store.ebizpc.com/su5018.html
> 
> Can anyone confirm or deny its support under FreeBSD?

I've 6.1-RELEASE on an Intel D945PVS, which has the ICH7R with four
320 GB SATA drives in a RAID 5 configuration. It works fine, and I did
put it through its paces before I started storing valuable data.

-jav


vm_map.c lock up (Was: Re: NFS Locking Issue)

2006-07-14 Thread User Freebsd



On Wed, 5 Jul 2006, Robert Watson wrote:

> If you can get into DDB when the hang has occurred, output via serial console
> for the following commands would be very helpful:
>
> show pcpu
> show allpcpu
> ps
> trace
> traceall
> show locks
> show alllocks
> show uma
> show malloc
> show lockedvnods

'k, after 16 days uptime, the server that I got all the debugging turned 
on for finally hung up solid ... I was able to break into DDB over the 
serial link, and have run all of the above on it ... and the output is 
attached ...


One thing to note is that the ps listing is not complete ... there are >6k 
processes running at the time, and I don't know how to get rid of the 
'--more--' prompt :(  After 1k processes, I just hit 'q' and went onto the 
other commands ...


Also, traceall gave me a 'No such command' error ... now that I think 
about it, my luck, it was supposed to be 'trace all'?


If this doesn't provide enough information, please let me know what else I 
should do the next time through, besides the above commands ...


Oh, and how do you get DDB to 'dump core' in 6.x?  Back in 4.x days, I'd 
just do 'panic' (maybe twice) at the DDB prompt, but that didn't work with 
6.x ... it just gave me a stacktrace and then the DDB> prompt both times 
...



Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

typescript.gz
Description: Binary data

Re: vm_map.c lock up (Was: Re: NFS Locking Issue)

2006-07-14 Thread User Freebsd

On Sat, 15 Jul 2006, User Freebsd wrote:

> On Wed, 5 Jul 2006, Robert Watson wrote:
>
> > If you can get into DDB when the hang has occurred, output via serial
> > console for the following commands would be very helpful:
> >
> > show pcpu
> > show allpcpu
> > ps
> > trace
> > traceall
> > show locks
> > show alllocks
> > show uma
> > show malloc
> > show lockedvnods
>
> 'k, after 16 days uptime, the server that I got all the debugging turned on
> for finally hung up solid ... I was able to break into DDB over the serial
> link, and have run all of the above on it ... and the output is attached ...
>
> One thing to note is that the ps listing is not complete ... there are >6k
> processes running at the time, and I don't know how to get rid of the
> '--more--' prompt :(  After 1k processes, I just hit 'q' and went onto the
> other commands ...
>
> Also, traceall gave me a 'No such command' error ... now that I think about
> it, my luck, it was supposed to be 'trace all'?
>
> If this doesn't provide enough information, please let me know what else I
> should do the next time through, besides the above commands ...
>
> Oh, and how do you get DDB to 'dump core' in 6.x?  Back in 4.x days, I'd just
> do 'panic' (maybe twice) at the DDB prompt, but that didn't work with 6.x ...
> it just gave me a stacktrace and then the DDB> prompt both times ...


Quick addendum ... the kernel on this server is from June 28th of this 
year ...



Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664


Re: vm_map.c lock up (Was: Re: NFS Locking Issue)

2006-07-14 Thread Kostik Belousov
On Sat, Jul 15, 2006 at 12:10:29AM -0300, User Freebsd wrote:
> 
> 
> On Wed, 5 Jul 2006, Robert Watson wrote:
> 
> >If you can get into DDB when the hang has occurred, output via serial 
> >console for the following commands would be very helpful:
> >
> >show pcpu
> >show allpcpu
> >ps
> >trace
> >traceall
> >show locks
> >show alllocks
> >show uma
> >show malloc
> >show lockedvnods
> 
> 'k, after 16 days uptime, the server that I got all the debugging turned 
> on for finally hung up solid ... I was able to break into DDB over the 
> serial link, and have run all of the above on it ... and the output is 
> attached ...
> 
> One thing to note is that the ps listing is not complete ... there are >6k 
> processes running at the time, and I don't know how to get rid of the 
> '--more--' prompt :(  After 1k processes, I just hit 'q' and went onto the 
> other commands ...
set lines=0
> 
> Also, traceall gave me a 'No such command' error ... now that I think 
> about it, my luck, it was supposed to be 'trace all'?
It is alltrace.
> 
> If this doesn't provide enough information, please let me know what else I 
> should do the next time through, besides the above commands ...
The missing alltrace output seems to be critical. If getting it is not
feasible, please provide at least the output of bt <pid> for each pid
shown in "show lockedvnods" and "show alllocks". In your case,
bt 64880 was the most interesting. It is a pity that you reset the
machine.

Just in case, do you use mlocked mappings? Also, why does such a huge number
of cron processes exist in the system? They are all forking now. It may be
(I cannot say definitely without further investigation) just a fork bomb.




Re: vm_map.c lock up (Was: Re: NFS Locking Issue)

2006-07-14 Thread User Freebsd

On Sat, 15 Jul 2006, Kostik Belousov wrote:

> On Sat, Jul 15, 2006 at 12:10:29AM -0300, User Freebsd wrote:
> >
> > On Wed, 5 Jul 2006, Robert Watson wrote:
> >
> > > If you can get into DDB when the hang has occurred, output via serial
> > > console for the following commands would be very helpful:
> > >
> > > show pcpu
> > > show allpcpu
> > > ps
> > > trace
> > > traceall
> > > show locks
> > > show alllocks
> > > show uma
> > > show malloc
> > > show lockedvnods
> >
> > 'k, after 16 days uptime, the server that I got all the debugging turned
> > on for finally hung up solid ... I was able to break into DDB over the
> > serial link, and have run all of the above on it ... and the output is
> > attached ...
> >
> > One thing to note is that the ps listing is not complete ... there are >6k
> > processes running at the time, and I don't know how to get rid of the
> > '--more--' prompt :(  After 1k processes, I just hit 'q' and went onto the
> > other commands ...
>
> set lines=0
>
> > Also, traceall gave me a 'No such command' error ... now that I think
> > about it, my luck, it was supposed to be 'trace all'?
>
> It is alltrace.
>
> > If this doesn't provide enough information, please let me know what else I
> > should do the next time through, besides the above commands ...
>
> The missing alltrace output seems to be critical. If getting it is not
> feasible, please provide at least the output of bt <pid> for each pid
> shown in "show lockedvnods" and "show alllocks". In your case,
> bt 64880 was the most interesting. It is a pity that you reset the
> machine.


Was down for too long as it was ... it, of course, happened while I was 
out with the family :(


Will keep all of this in mind next time I get a chance to run through 
things ...


Any idea why 'panic' doesn't produce core like it used to?

> Just in case, do you use mlocked mappings? Also, why does such a huge number
> of cron processes exist in the system? They are all forking now. It may be
> (I cannot say definitely without further investigation) just a fork bomb.


mlocked mappings?  What are they? :)

re: crons ... this, I'm not sure of, but my suspicion was that the crons 
weren't able to complete, since the file system was locked up, but the 
next one was being attempted to run ... *shrug*
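
On the core dump question: one approach commonly used on 6.x is to configure a dump device ahead of time and then request the dump explicitly from the debugger with `call doadump` instead of `panic`. This is offered as an assumption to check against ddb(4) and rc.conf(5) for the installed version, not as an answer confirmed in this thread. A minimal /etc/rc.conf sketch:

```sh
# /etc/rc.conf fragment (sketch, not taken from the thread).
# dumpdev names the device that receives the kernel dump; "AUTO" asks the
# rc scripts to pick a suitable swap device. An explicit device can be
# given instead.
dumpdev="AUTO"
# Directory where savecore(8) stores the dump on the next boot.
dumpdir="/var/crash"
```

With that in place, typing `call doadump` at the db> prompt, followed by `reset`, should write the dump and reboot; savecore(8) then copies it into dumpdir during startup.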



Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664


Re: vm_map.c lock up (Was: Re: NFS Locking Issue)

2006-07-14 Thread Antony Mawer

On 14/07/2006 6:08 PM, User Freebsd wrote:
> > Just in case, do you use mlocked mappings? Also, why does such a huge
> > number of cron processes exist in the system? They are all forking now.
> > It may be (I cannot say definitely without further investigation) just a
> > fork bomb.
>
> re: crons ... this, I'm not sure of, but my suspicion was that the crons
> weren't able to complete, since the file system was locked up, but the
> next one was being attempted to run ... *shrug*


This seems consistent with behaviour I've seen on several 6.0-RELEASE 
machines. From the limited information I've been able to get from the 
machines, there appeared to be multiple tasks from cron all piled up 
upon one another. In particular, the daily periodic tasks that run the 
various 'find' commands were one of the things I noticed (although we run 
numerous tasks out of cron)...


If something is blocking the filesystem and causing find (and possibly 
other processes) to become stuck, these would just keep mounting up 
until it all falls over (with numerous maxproc exceeded etc errors).


These are on machines without NFS, but the symptoms are very very 
similar.. NWFS and SMBFS are commonly used on a number of the machines 
I've seen the problem on, which may be relevant -- perhaps it affects 
more than just NFS?


I may experiment with building up a test server locally and trying to 
reproduce similar loads to see if I can trigger the problem in-house.. 
at least that way I can hook up a serial console and get some more 
detailed information...


Regards
Antony



Re: Intel ICH7R RAID controller working on 6.1/STABLE?

2006-07-14 Thread Matthew Seaman
Javier Henderson wrote:
> * Mike Jakubik <[EMAIL PROTECTED]> [060714 17:15]:

>> The chipset is supported, but I wouldn't recommend onboard RAID for any 
>> production server. Get a real RAID controller, or use gmirror if you 
>> plan to mirror. I use several of these boards in production with gmirror.
 
> Why do you recommend against on-board RAID controllers?

Think about what happens if one of your disks dies.  Sure, the machine will
carry on running.  With an on-board controller there are two problems:

   i) How do you get notified that a disk has died?
  ii) How do you replace the drive?

(i) you'd likely only find out about at reboot time, or by noticing a
change in the pattern of blinken-lights on the machine.  (Don't laugh --
it happens.)

(ii) is not just about having to power off the machine and swap out the
hardware: it's not uncommon for on-board RAID-1 setups to be unable to
rebuild a mirror by duplicating the good disk onto the replacement one.  That
means blowing everything away and recovering from backup.  By which time
you've had so much downtime that you might as well not have bothered with
RAID in the first place.

The advantage of a good RAID controller -- like one of the 3ware cards
-- or of gmirror is that, combined with hot-swap disks (and pretty much all
SATA drives nowadays have hot-swap capability; you just need to find a
chassis with the right sort of drive bays), you can take out the dead
disk, replace it with a good one and rebuild the array *without taking the
machine down*.

gmirror will alert you to failures in the nightly e-mail if you enable
the 406.status-gmirror periodic script.  Similarly a good hardware RAID
controller will have a system level control application to let you interface
with the card from the OS level, and it will have some mechanism for alerting
the admin to problems.
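
The periodic script mentioned above is switched on from /etc/periodic.conf. A minimal sketch, assuming the stock 406.status-gmirror script and its usual knob name:

```sh
# /etc/periodic.conf fragment: include gmirror status in the daily report.
daily_status_gmirror_enable="YES"
```

`gmirror status` prints the same information on demand, which is handy for checking on a rebuild after a disk swap.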

Cheers,

Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.   7 Priory Courtyard
  Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey Ramsgate
  Kent, CT11 9PW


