Hi.
Kris Kennaway wrote:
After some time running under high load, disk performance becomes
extremely poor. During those periods 'systat -vm 1' shows something like
this:
This web service is similar to YouTube. This server is a video store. I
have around 200G of *.flv (flash video) files on the server.
I run lighttpd as the web server. Disk load is usually around 50%,
network output is 100Mbit/s, with 100 simultaneous connections. CPU is
mostly idle.
This is very unlikely, because I have five other video storage servers
with the same hardware and software configuration and they work fine.
Clearly something is different about them, though. If you can
characterize exactly what that is then it will help.
I can't see any difference other than the installation date. I really did
compare all parameters and found nothing interesting.
At first glance one could say that the problem is in Dell's x850 series or
amr(4), but we run this hardware on many other projects and they work
well. Linux also works on them.
OK but there is no evidence in what you posted so far that amr is
involved in any way. There is convincing evidence that it is the mbuf
issue.
Why are you sure this is the mbuf issue? For example, if there is a real
problem with amr or VM causing disk slowdown, then when it occurs the
network subsystem will see a different load pattern. Instead of quickly
sending large amounts of data, the system will have to hold a large
number of simultaneous connections waiting for data. Can this cause high
mbuf contention?
A few hours ago I also received feedback from Andrzej Tobola; he has the
same problem on FreeBSD 7 with a Promise ATA software mirror:
Well, he didn't provide any evidence yet that it is the same problem, so
let's not become confused by feelings :)
I think he is talking about the disk being 100% busy while processing ~5
transfers/sec.
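To make that symptom concrete, here is a minimal sketch of how one might
flag such intervals in a series of disk samples. Everything in it is an
assumption made for illustration: the DiskSample type, the
stalled_samples helper, the thresholds and the sample numbers are
hypothetical and are not taken from systat or from any measurement in
this thread.

```python
from typing import List, NamedTuple


class DiskSample(NamedTuple):
    busy_pct: float  # share of the sampling interval the disk was busy, in percent
    tps: float       # transfers completed per second during the interval


def stalled_samples(samples: List[DiskSample],
                    busy_min: float = 95.0,
                    tps_max: float = 10.0) -> List[int]:
    """Return indices of samples where the disk is saturated yet completes
    very few transfers. The default thresholds are assumptions."""
    return [i for i, s in enumerate(samples)
            if s.busy_pct >= busy_min and s.tps <= tps_max]


if __name__ == "__main__":
    # Made-up numbers: a normal interval, a "bad" interval, a busy but healthy one.
    samples = [
        DiskSample(busy_pct=50.0, tps=180.0),
        DiskSample(busy_pct=100.0, tps=5.0),
        DiskSample(busy_pct=98.0, tps=150.0),
    ]
    print(stalled_samples(samples))
```

Only the second sample matches, since it alone combines near-100% busy
with a very low transfer rate.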
So I can conclude that FreeBSD has a long-standing bug in VM that
can be triggered when serving a large amount of static data (much
bigger than memory size) at high rates. Possibly this only applies to
large files like mp3 or video.
It is possible, we have further work to do to conclude this though.
I forgot to mention that I have pmc and kgmon profiling data for both good
and bad times. But I don't have enough knowledge to interpret it correctly
and I'm not sure if it can help.
Also, I now run nginx instead of lighttpd on one of the problematic
servers. It seems to work much better: sometimes there are peaks in
disk load, but the disk does not become very slow and network output does
not change. The difference with nginx is that it runs as multiple
processes, while lighttpd by default has only one process. I have now
configured lighttpd on another server to run with multiple workers. I'll
see if it helps.
What else can I try?
With best regards,
Alexey Popov
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable