On Sun, Mar 06, 2016 at 08:06:41PM +0300, Cyrill Gorcunov wrote:
> >
> > Well, this looks like LOCKDEP kernel. Are you really running LOCKDEP on
> > production kernels ?
>
Hi Eric, David. Sorry for the delay. I've finally measured the latency
on real hardware. It's an i7-2600 CPU with 16G of memory. Here is the
collected data.

---
Unpatched vanilla
=================

commit 7f02bf6b5f5de90b7a331759b5364e41c0f39bf9
Author: Linus Torvalds <torva...@linux-foundation.org>
Date:   Tue Mar 8 09:41:20 2016 -0800

Creating new addresses
----------------------
    19.26%  [kernel]  [k] check_lifetime
    13.88%  [kernel]  [k] __inet_insert_ifa
    13.01%  [kernel]  [k] inet_rtm_newaddr

Release
-------
    20.96%  [kernel]  [k] _raw_spin_lock
    17.79%  [kernel]  [k] preempt_count_add
    14.79%  [kernel]  [k] __local_bh_enable_ip
    13.08%  [kernel]  [k] preempt_count_sub
     9.21%  [kernel]  [k] nf_ct_iterate_cleanup
     3.15%  [kernel]  [k] _raw_spin_unlock
     2.80%  [kernel]  [k] nf_conntrack_lock
     2.67%  [kernel]  [k] in_lock_functions
     2.63%  [kernel]  [k] get_parent_ip
     2.26%  [kernel]  [k] __inet_del_ifa
     2.17%  [kernel]  [k] fib_del_ifaddr
     1.77%  [kernel]  [k] _cond_resched

[root@s125 ~]# ./exploit.sh
START 4 addresses STOP 1457537580 1457537581
START 2704 addresses STOP 1457537584 1457537589
START 10404 addresses STOP 1457537602 1457537622
START 23104 addresses STOP 1457537657 1457537702
START 40804 addresses STOP 1457537784 1457537867
START 63504 addresses STOP 1457538048 1457538187

Patched (David's two patches)
=============================

Creating new addresses
----------------------
    21.63%  [kernel]  [k] check_lifetime
    14.31%  [kernel]  [k] __inet_insert_ifa
    13.47%  [kernel]  [k] inet_rtm_newaddr
     1.53%  [kernel]  [k] check_preemption_disabled
     1.38%  [kernel]  [k] page_fault
     1.27%  [kernel]  [k] unmap_page_range

Release
-------
    24.26%  [kernel]  [k] _raw_spin_lock
    17.55%  [kernel]  [k] preempt_count_add
    14.81%  [kernel]  [k] __local_bh_enable_ip
    14.17%  [kernel]  [k] preempt_count_sub
    10.10%  [kernel]  [k] nf_ct_iterate_cleanup
     3.00%  [kernel]  [k] _raw_spin_unlock
     2.95%  [kernel]  [k] nf_conntrack_lock
     2.86%  [kernel]  [k] in_lock_functions
     2.73%  [kernel]  [k] get_parent_ip
     1.91%  [kernel]  [k] _cond_resched
     0.39%  [kernel]  [k] task_tick_fair
     0.27%  [kernel]  [k] native_write_msr_safe
     0.22%  [kernel]  [k] rcu_check_callbacks
     0.20%  [kernel]  [k] check_lifetime
     0.18%  [kernel]  [k] check_preemption_disabled
     0.16%  [kernel]  [k] hrtimer_active
     0.13%  [kernel]  [k] __inet_insert_ifa
     0.13%  [kernel]  [k] __memmove
     0.13%  [kernel]  [k] inet_rtm_newaddr

[root@s125 ~]# ./exploit.sh
START 4 addresses STOP 1457539863 1457539864
START 2704 addresses STOP 1457539867 1457539872
START 10404 addresses STOP 1457539885 1457539905
START 23104 addresses STOP 1457539938 1457539980
START 40804 addresses STOP 1457540058 1457540132
START 63504 addresses STOP 1457540305 1457540418
---

Lockdep is turned off. The script itself is

---
[root@s125 ~]# cat ./exploit.sh
#!/bin/sh

if [ -z "$1" ]; then
	# Parent: for each size, fill a fresh netns with addresses, then
	# time "ip r" while the kernel tears that netns down.
	for x in `seq 1 50 255`; do
		echo -n "START "
		(unshare -n /bin/sh exploit.sh $x)
		sleep 1
		for j in `seq 0 100`; do
			ip r > /dev/null
		done
		echo -n " "
		echo `date +%s`
	done
else
	# Child (inside the new netns): add ($1 + 1)^2 addresses to lo.
	for x in `seq 0 $1`; do
		for y in `seq 0 $1`; do
			ip a a 127.1.$x.$y dev lo
		done
	done
	num=`ip a l dev lo | grep -c "inet "`
	echo -n "$num addresses "
	echo -n "STOP "
	echo -n `date +%s`
	exit
fi
---

Note that I run ip r in a loop, with a sleep before it. On an idle
machine this loop takes ~1 second, but when it runs while the kernel is
cleaning up the net namespace it takes much longer.

Here is also a graph of the collected data (blue line: unpatched
version, red line: patched). With the patched version it becomes much
better, but it still hangs.

https://docs.google.com/spreadsheets/d/1eyQDxjuZY2DHKYksGACpHDDcV1Bd92e-ZiY8ywPKshA/edit?usp=sharing

The perf output above is from "perf top", taken while the addresses are
being created and while they are being released.

I think the main problem is still that we allow a user to request as
many inet addresses as free memory permits. The kernel can't release
them all in O(1) time: every resource has to be freed, so there will
always be some lag. So maybe introducing limits that sysadmins can set
would be a good idea.

	Cyrill
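
The result lines above are easier to compare as per-run durations. In each line the first timestamp is printed when the child finishes adding addresses and the second after the sleep and the ip r loop, so their difference is the 1-second sleep plus the time the loop took. Below is a minimal sketch that extracts this, assuming the input keeps the exact "START <n> addresses STOP <t1> <t2>" layout; the function name `durations` is hypothetical and not part of exploit.sh.

```shell
#!/bin/sh
# Hypothetical helper (not part of exploit.sh): for every line of the form
#   START <n> addresses STOP <t1> <t2>
# print "<n> <t2 - t1>", i.e. the address count and the elapsed seconds.
durations() {
	awk '$1 == "START" && $4 == "STOP" { print $2, $6 - $5 }'
}

# Example input: the unpatched run reported above.
durations <<'EOF'
START 4 addresses STOP 1457537580 1457537581
START 2704 addresses STOP 1457537584 1457537589
START 10404 addresses STOP 1457537602 1457537622
START 23104 addresses STOP 1457537657 1457537702
START 40804 addresses STOP 1457537784 1457537867
START 63504 addresses STOP 1457538048 1457538187
EOF
```

Piping the output of ./exploit.sh into `durations` should give the same two columns for a new run, which can then be pasted into a spreadsheet next to the existing data.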