cluster type
I fixed this a while back in my local sources. A 12 core Supermicro
MB system I'm building here was hitting the bug 100% of the time during
startup. Patch attached.
-DG
Dr. David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399
ssing. It should
be possible to scale to 1+ members if kernel UDP processing had optimal
concurrency.
Anyway, thumbs up (and not for the middle-eastern meaning :-)) - I'm
looking forward to the MFC.
-DG
Dr. David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech
ntpd, powerd, sshd,
> sendmail, cron, moused and xdm. That is all.
You might want to try disabling powerd and see if that mitigates the
problem. powerd is going to be messing with the CPU clock when it is near
idle. Your system would be less idle with lock profiling enab
ready uses a marker vnode. It is hidden and obfuscated in
the MNT_VNODE_FOREACH macro, further hidden in the __mnt_vnode_first/next
functions, so it should be safe from vnode reclamation/free problems.
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (
y
when you have files modified through the mmap interface (kind of rare
on most systems). Obviously I have mixed feelings about vfs_msync, but
I'm not suggesting here that we should get rid of it as any sort of
solution.
-DG
David G. Lawrence
President
Download Technologies,
truct,
that has on it any vnodes that need to be synced. Unfortunately, such a
change would be extensive, scattered throughout much of the ufs/ffs code.
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The Fre
cond version is that uio_yield doesn't lower the
priority enough for the other threads to run. Forcing it to msleep for
a tick will eliminate the priority from the consideration.
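The idea of pausing for a tick inside a long loop can be sketched in userland C. This is only an analogy, not the kernel change under discussion: `sleep_one_tick`, `walk_with_pauses`, the 1 ms tick length, and the `batch` parameter are all hypothetical names and values chosen for illustration, and `nanosleep` stands in for the kernel's msleep.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical userland stand-in for a one-tick sleep in the kernel. */
static void sleep_one_tick(void)
{
    struct timespec ts = { 0, 1000000 }; /* 1 ms: an assumed tick length */
    nanosleep(&ts, NULL);
}

/*
 * Walk `total` items and sleep briefly after every `batch` items so
 * that other threads get a chance to run regardless of priority.
 * Returns the number of pauses taken.
 */
static int walk_with_pauses(int total, int batch)
{
    int since_pause = 0;
    int pauses = 0;

    assert(batch > 0);
    for (int i = 0; i < total; i++) {
        /* ... per-item work would go here ... */
        if (++since_pause >= batch) {
            since_pause = 0;
            sleep_one_tick();
            pauses++;
        }
    }
    return pauses;
}
```

Calling `walk_with_pauses(2000, 500)` takes four pauses. Unlike a plain yield, the sleep gives up the CPU even when the waiting threads have a lower priority.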
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
T
ificantly increase the overhead of the loop.
The solution provided by Kostik Belousov that uses uio_yield looks like
a fine solution. I intend to try it out on some servers RSN.
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The Fr
might be low enough to
not trigger the problem, but also be high enough to not significantly
affect system I/O performance.
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Project - http://www.freebsd.org
Pave the road of l
in problem is to figure out why PREEMPTION doesn't work. I'm
> not working on this directly since I'm running ~5.2 where nearly-full
> kernel preemption doesn't work due to Giant locking.
I don't understand how PREEMPTION is supposed to work (I mean
to any sig
1 and then
setting it back to whatever it was previously (probably 6-10).
If the problem then goes away for a while, that would be another good
indicator.
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Project - ht
ittle bit). I guess
that got removed when the size of the vnode pool was dramatically
increased.
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Project - http://www.freebsd.org
Pave the road of life
> On Tue, 18 Dec 2007, David G Lawrence wrote:
>
> >>>I got an almost identical delay (with 64000 vnodes).
> >>>
> >>>Now, 17ms isn't much.
> >>
> >> Says you. On modern systems, trying to run a pseudo real-time
> >>
runs at 150MHz when it is in the lowest running CPU power save mode.
At that speed, this bug causes a delay of more than 300ms and is enough
to cause loss of keyboard input. I have to switch into high speed mode
before I try to type anything, else I end up with random typos. Very
annoying.
-DG
Dav
.
I'm going to have to bow out of this discussion now. I just don't have
the time for it.
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Project - http://www.freebsd.org
P
> On Tue, 18 Dec 2007, David G Lawrence wrote:
>
> >>Thanks. Have a kernel building now. It takes about a day of uptime
> >>after reboot before I'll see the problem.
> >
> > You may also wish to try to get the problem to occur sooner after boot
>
instead of /dev/null, if you use GNU tar, to disable its
"optimization"). You can stop it after it has gone through 100K files.
Verify by looking at "sysctl vfs.numvnodes".
Doing this would help to further prove that lots of allocated vnodes
is the prerequisite for the p
order to get the vnodes allocated. As
I mentioned previously, I suspect that either ip->i_flag is not getting
completely cleared in ffs_syncvnode or its children or
v_bufobj.bo_dirty.bv_cnt accounting is broken.
-DG
David G. Lawrence
President
Download Technologies, Inc
MNT_ILOCK(mp);
+ if (flushed_count++ > 500) {
+ flushed_count = 0;
+ msleep(&flushed_count, MNT_MTX(mp), PZERO, "syncw", 1);
+ }
}
MNT_IUNLOCK(mp);
/*
-DG
David G.
continue;
}
...like the i_flag flags aren't ever getting properly cleared (or bv_cnt
is always non-zero).
...but I don't have the time to chase this down.
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399
o production
> quality.
I'm not using SCHED_ULE on any of the machines that I'm seeing the
timeout problem with em and fxp devices. I suspect the problem has to do
with interrupt thread scheduling; maybe SCHED_ULE just somehow makes the
problem worse?
-DG
David G. Lawrence
President
D
shows the problem has one fxp and one em and
the timeouts occur on both interfaces.
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Project - http://www.freebsd.org
Pave the road of life with opportunities.
_
> Are you enabling an option, like IPv6, that puts Giant over the network
> stack?
From dmesg:
WARNING: debug.mpsafenet forced to 0 as ipsec requires Giant
WARNING: MPSAFE network stack disabled, expect reduced performance.
...the kernel has IPSEC.
-DG
David G. Lawrence
to occur first on
the machine to put it into a state that makes it susceptible to the
program.
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Project - http://www.freebsd.org
what the common
denominator is.
> try building a non SMP kernel for this machine if I can.
Do you have any history of seeing the watchdog timeout problem on your
machine?
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Proje
> On Fri, Sep 29, 2006 at 12:27:41AM -0700, David G Lawrence wrote:
> >Attached is a simple user program that will immediately cause pretty much
> > all of the network drivers (at least the ones I own) to stop working and
> > get watchdog timeouts.
> >
> > WA
have here is a production machine that I can't test
this on right now.
If running this on an SMP machine doesn't show the problem, then try
running multiple copies of it (one for each CPU).
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (8
the console and can ctrl-C
it!
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Project - http://www.freebsd.org
Pave the road of life with opportunities.
#include
main()
{
struct pollfd pfd;
pfd.fd = 1
Note that the watchdog timeout for the network drivers is usually 8000ms
(8 seconds), so this is unlikely to be related to that problem.
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Project - htt
e0: link state changed to UP
> Mar 24 19:40:14 worf kernel: nve0: device timeout (1)
The problem is the watchdog timeout itself. I've attached an email that
I sent a few months ago which describes the problem, along with a simple
patch which disables the watchdog timer.
-DG
David G. Lawr
; on the manpage.
> Forcing lighttpd to not use sendfile fixes the problem,
> but i would really like to use it...
>
> Any suggestions?
What version of FreeBSD?
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
TeraSolutio
w this in the tftpd manual page?
BUGS
Files larger than 33488896 octets (65535 blocks) cannot be transferred
without client and server supporting blocksize negotiation (RFC1783).
Many tftp clients will not transfer files over 16744448 octets (32767
blocks).
-DG
David
> On Fri, Feb 25, 2005 at 12:18:51PM -0800, David G. Lawrence wrote:
> > Answer
> > Problem:
> > WD EIDE drives are dropped from an IDE RAID array or system after several
> > days or weeks of error-free operation.
>
> Of course I looked at 3ware, at WD
configuration, download the IDE RAID Compatibility Upgrade Utility for 3Ware
7500-X controller cards.
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
TeraSolutions, Inc. - http://www.terasolutio
completes, re-boot the system.
The update is complete.
Related Resources
Technical information, FAQ, and related answers from the knowledge base
Upload Date 04/01/2004
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
TeraSolu
ed block back out to the new (reassigned) block. This
may seem pretty basic for RAID, but many controllers we've tested
actually don't do this.
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
TeraSolutions, Inc. - http://www.te
gate SATA drives also seem to be reliable, although they don't perform
very well.
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
TeraSolutions, Inc. - http://www.terasolutions.com - (888) 346 7175
The FreeBSD Project - htt
dow
size (bandwidth * delay) would be a significant limiting factor across a
gig-e LAN.
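For reference, the window needed to keep a link full is the bandwidth-delay product. A rough sketch of the arithmetic follows; the function name `bdp_bytes` and the 200 microsecond round-trip time used in the note below are illustrative assumptions, not measurements from this thread.

```c
#include <assert.h>

/*
 * Bandwidth-delay product in bytes.
 * bits_per_sec: link rate in bits per second
 * rtt_usec:     round-trip time in microseconds
 */
static unsigned long long bdp_bytes(unsigned long long bits_per_sec,
                                    unsigned long long rtt_usec)
{
    assert(bits_per_sec > 0);
    /* Convert to bytes per second first, then scale by the RTT. */
    return bits_per_sec / 8ULL * rtt_usec / 1000000ULL;
}
```

With an assumed 200 microsecond RTT on a 1 Gbps link, `bdp_bytes(1000000000ULL, 200)` gives 25000 bytes, while the same link rate with a 100 ms RTT would need 12500000 bytes.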
I too am seeing low NFS performance (both TCP and UDP) with non-SMP
5.3, but on the same systems I can measure raw TCP performance (using
ttcp) of >850Mbps. It looks to me like there is something wrong w
134MB of virtual
memory being consumed, and in most FreeBSD kernel configurations, this would
cause it to run out.
-DG
David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
TeraSolutions, Inc. - http://www.terasolutions.com - (888) 3
uld be promptly removed (and should have been when the first signs of
trouble showed up).
-DG
David G. Lawrence
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
TeraSolutions, Inc. - http://www.terasolutions.com - (888) 346 7175
The FreeBSD Project
operates at the SMTP level. In this scheme, if the mail
system hasn't seen email from your server IP + email address before, then
it defers reception of the email for a few hours. This stops spam from
people doing drive-by spamming since they don't try to re-deliver on
temporary failures.
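The deferral scheme described above can be sketched roughly as follows. This is a minimal illustration, not any particular mail system's implementation: the names `grey_check`, `grey_table`, and `GREY_DELAY` are hypothetical, "a few hours" is assumed to be three, and the email address is assumed to be the sender's.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define GREY_DELAY (3 * 60 * 60) /* assumed: "a few hours" taken as 3 */
#define GREY_MAX   1024          /* assumed table size for the sketch */

struct grey_entry {
    char   ip[46];      /* server IP as text */
    char   addr[256];   /* email address (assumed: sender) */
    time_t first_seen;  /* when this pair was first seen */
};

static struct grey_entry grey_table[GREY_MAX];
static int grey_count;

/*
 * Returns 1 to accept the message, 0 to defer it with a temporary
 * failure. A pair seen for the first time is recorded and deferred;
 * a retry is accepted once GREY_DELAY seconds have passed.
 */
static int grey_check(const char *ip, const char *addr, time_t now)
{
    assert(ip != NULL && addr != NULL);
    for (int i = 0; i < grey_count; i++) {
        struct grey_entry *e = &grey_table[i];
        if (strcmp(e->ip, ip) == 0 && strcmp(e->addr, addr) == 0)
            return now - e->first_seen >= GREY_DELAY;
    }
    /* New pair: remember it if there is room, and defer either way. */
    if (grey_count < GREY_MAX) {
        struct grey_entry *e = &grey_table[grey_count++];
        snprintf(e->ip, sizeof e->ip, "%s", ip);
        snprintf(e->addr, sizeof e->addr, "%s", addr);
        e->first_seen = now;
    }
    return 0;
}
```

A sender that never retries after the temporary failure is never accepted, which is the property the scheme relies on.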
>David G. Lawrence wrote:
>>Michael Sierchio wrote:
>[ ... ]
>>>No, the real issue is that there are scads of virii/worms in the wild
>>>which forge message envelope senders. It is absurd to send
>>>autoresponder messages to a mailing list.
>>
>>
a particular email address. Aside from changing email
addresses or shutting off email entirely, there are few alternative
solutions that are effective. I despise third party blacklists, especially
spews which I've been victimized by several times.
-DG
David G. Lawrence
Download Te