On 2010-12-30 14:36, Claudio Jeker wrote:
On Thu, Dec 30, 2010 at 01:51:29PM +0100, RLW wrote:
On 2010-11-19 00:24, Stuart Henderson wrote:
On 2010-11-18, RLW <seran...@o2.pl> wrote:
On 2010-11-18 17:41, Claudio Jeker wrote:
No, the problem is altq. Altq(4) was written when 100Mbps was common and
people shaped traffic in the low megabit range. It seems to hit a wall when
doing hundreds of megabits. Guess someone needs to run a profiling kernel
and see where all that time is spent and then optimize altq(4).
It's nice to hear from an OpenBSD developer on this matter.
I am wondering who that "someone" is going to be ;) and when it might happen?
The "someone" running a profiling kernel to identify the hot spots could be you.
cd /sys/arch/<arch>/conf
config -p <kernelname>
build a kernel from the ../compile/<kernelname>.PROF directory in the usual way
kgmon -b to start profiling
(generate some traffic)
kgmon -h to stop profiling
kgmon -p to dump stats
gprof /bsd gmon.out to read stats...
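For reference, the whole sequence might look roughly like this on amd64
(GENERIC.MP is only an example kernel name, adjust for your machine):

  cd /sys/arch/amd64/conf
  config -p GENERIC.MP
  cd ../compile/GENERIC.MP.PROF
  make clean && make depend && make
  cp /bsd /bsd.old && cp bsd /bsd
  reboot

  # after rebooting into the profiling kernel:
  kgmon -b                            # start collecting profile data
  # ...generate traffic through the box...
  kgmon -h                            # stop collecting
  kgmon -p                            # writes gmon.out in the current directory
  gprof /bsd gmon.out > profile.txt   # human-readable report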
Assuming you're interested in routed traffic (rather than queuing traffic
generated on the box itself), make sure you run the traffic source and sink
on other machines routing through the altq box; don't source/sink traffic
on the altq box itself.
Hello again ;)
I finally had time to do kernel profiling.
So we have:
- default OpenBSD 4.8 install
- em0 NIC (in a PCI Express slot)
- default sysctl
- definition of queue in pf.conf:
altq on em0 cbq bandwidth 1Gb queue { q_lan }
queue q_lan bandwidth 950Mb cbq (default)
- low throughput between a Debian Linux box (as iperf server) and the OpenBSD
box (as iperf client); see the note after this list for a typical iperf
invocation:
[ ID] Interval       Transfer     Bandwidth
[  3] 41.0-42.0 sec  17.1 MBytes  144 Mbits/sec
[  3] 42.0-43.0 sec  17.2 MBytes  144 Mbits/sec
[  3] 43.0-44.0 sec  17.1 MBytes  144 Mbits/sec
[  3] 44.0-45.0 sec  17.2 MBytes  144 Mbits/sec
[  3] 45.0-46.0 sec  17.2 MBytes  144 Mbits/sec
[  3] 46.0-47.0 sec  17.1 MBytes  143 Mbits/sec
[  3] 47.0-48.0 sec  17.2 MBytes  144 Mbits/sec
[  3] 48.0-49.0 sec  17.1 MBytes  144 Mbits/sec
[  3] 49.0-50.0 sec  17.1 MBytes  144 Mbits/sec
[  3]  0.0-50.0 sec   858 MBytes  144 Mbits/sec
- stats from kernel profiling at:
http://erydium.pl/upload/20101230_profiling.txt
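(The iperf command line is not shown above; for a test with 1-second reporting
over 50 seconds the usual pattern is "iperf -s" on the server side and
"iperf -c <server-ip> -i 1 -t 50" on the client side. The exact flags are only
a guess based on the output.)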
From the profile output:
index %time self descendents called+self name index
[1] 83.9 0.00 359.49 sched_idle [1]
You spent > 80% in idle. So while forwarding all that traffic the box was
mostly idle.
Although the kernel profiling stats show the system spending 80% of its time
in idle, top during the iperf test shows:
load averages: 0.40, 0.16, 0.11 15:43:02
27 processes: 1 running, 25 idle, 1 on processor
CPU states: 0.6% user, 0.0% nice, 77.6% system, 21.8% interrupt, 0.0% idle
Memory: Real: 10M/83M act/tot Free: 402M Swap: 0K/759M used/tot
PID USERNAME PRI NICE SIZE RES STATE WAIT TIME CPU COMMAND
1680 root 59 0 476K 1188K run - 0:21 63.28% iperf
The interesting entries are:
[6] 3.3 0.02 14.26 4028087 acpi_get_timecount [6]
[7] 3.2 0.14 13.66 3854116 binuptime [7]
I guess these are so high up in the profile because of altq.
These seem to be altq-related:
[10] 2.9 0.05 12.22 3234667 cbq_pfattach [10]
[13] 2.2 0.03 9.18 2386574 tbr_dequeue [13]
[14] 2.1 0.38 8.71 2386574 rmc_dequeue_next [14]
Now this profile shows one thing: the problem is not CPU-bound; it actually
seems like the TBR (token bucket regulator) is the problem. What seems to
happen is that the TBR runs low and returns NULL, and a timeout then has to
fire before the packets can move.
I don't know what TBR is, but can I do something about it?
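Looking at pf.conf(5), altq has a "tbrsize" option that adjusts the size, in
bytes, of the token bucket regulator (otherwise a heuristic based on the
interface bandwidth picks it). Would forcing a bigger bucket help? Something
like this, where 64000 is just a number to experiment with, not a
recommendation:

  altq on em0 cbq bandwidth 1Gb tbrsize 64000 queue { q_lan }
  queue q_lan bandwidth 950Mb cbq (default)

Then "pfctl -vvsq" while the test runs should show whether the queue behaves
any differently.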
best regards,
RLW