On Mon, Mar 9, 2015 at 12:08 PM, Sebastian Andrzej Siewior
<bige...@linutronix.de> wrote:
> * Brian Silverman | 2015-03-05 00:16:20 [-0500]:
>
>>Beforehand, 000 is spending most of its time in interrupts, but bash
>>is doing something related to memory management on it in between.
>>            bash-14721 [000] ......1  6854.629126: rt_spin_lock 
>> <-free_pcppages_bulk
>>            bash-14721 [000] ....1.1  6854.629126: mm_page_pcpu_drain: 
>> page=ffffea000fa1aa40 pfn=4097705 order=0 migratetype=1
>>            bash-14721 [000] ......1  6854.629127: get_pfnblock_flags_mask 
>> <-free_pcppages_bulk
>>            bash-14721 [000] ......1  6854.629127: __mod_zone_page_state 
>> <-free_pcppages_bulk
>>            bash-14721 [000] ....1.1  6854.629127: mm_page_pcpu_drain: 
>> page=ffffea000f572ac0 pfn=4021419 order=0 migratetype=0
>>            bash-14721 [000] ......1  6854.629128: get_pfnblock_flags_mask 
>> <-free_pcppages_bulk
>>            bash-14721 [000] ......1  6854.629128: __mod_zone_page_state 
>> <-free_pcppages_bulk
>>... # lots more virtually identical repetitions of those last 3 lines
>>            bash-14721 [000] ....1.1  6854.629139: mm_page_pcpu_drain: 
>> page=ffffea000f481a80 pfn=4005994 order=0 migratetype=1
>>            bash-14721 [000] ......1  6854.629139: get_pfnblock_flags_mask 
>> <-free_pcppages_bulk
> You free memory and hold the zone->lock
>
>>            bash-14721 [000] d.....1  6854.629139: do_IRQ <-ret_from_intr
>>            bash-14721 [000] d.....1  6854.629139: irq_enter <-do_IRQ
>>... # wakes up the can1 ISR thread on 001 and the can0 one on 000
>>(same physical IRQ line)
>>            bash-14721 [000] d...3.1  6854.629261: sched_switch: 
>> prev_comm=bash prev_pid=14721 prev_prio=120 prev_state=R+ ==> 
>> next_comm=irq/18-can0 next_pid=2015 next_prio=28
>
> I would assume that this one raises NET_RX softirq. But at the bottom
> you have the irq handler on the other CPU which confuses me…

There wasn't actually any traffic on can0 for this test, so it didn't.
The can0 ISR only makes a few reads/writes to the device, never
calling netif_rx.

The can1 handler (which actually raises a NET_RX softirq) runs on 001
because it's pinned there.

>
>>... # runs the can0 ISR
>>     irq/18-can0-2015  [000] d...3..  6854.629283: sched_switch: 
>> prev_comm=irq/18-can0 prev_pid=2015 prev_prio=28 prev_state=S ==> 
>> next_comm=ksoftirqd/0 next_pid=3 next_prio=98
>>...
>>     ksoftirqd/0-3     [000] ....1.1  6854.629291: softirq_entry: vec=1 
>> [action=TIMER]
>>...
>>     ksoftirqd/0-3     [000] ....1.1  6854.629293: softirq_exit: vec=1 
>> [action=TIMER]
> only the timer since nobody raised NET_RX

Correct. I included that as context for what 000 spent its time doing
with the lock held, not implying that it's incorrect.

>
>>...
>>     ksoftirqd/0-3     [000] .......  6854.629298: schedule 
>> <-smpboot_thread_fn ...
>>     ksoftirqd/0-3     [000] d...3..  6854.629304: sched_switch: 
>> prev_comm=ksoftirqd/0 prev_pid=3 prev_prio=98 prev_state=S ==> 
>> next_comm=bash next_pid=14721 next_prio=28
>>...
>>            bash-14721 [000] d...1.1  6854.629308: 
>> smp_trace_reschedule_interrupt <-trace_reschedule_interrupt
>># Actually unnecessary schedule IPI from 001?
>>            bash-14721 [000] d...1.1  6854.629309: irq_enter 
>> <-smp_trace_reschedule_interrupt
>>...
>>            bash-14721 [000] ....1.1  6854.629316: __tick_nohz_task_switch 
>> <-__schedule
>>            bash-14721 [000] ......1  6854.629316: __mod_zone_page_state 
>> <-free_pcppages_bulk
>>            bash-14721 [000] ....1.1  6854.629317: mm_page_pcpu_drain: 
>> page=ffffea000f57a900 pfn=4021924 order=0 migratetype=0
>>            bash-14721 [000] ......1  6854.629317: get_pfnblock_flags_mask 
>> <-free_pcppages_bulk
>>            bash-14721 [000] ......1  6854.629317: __mod_zone_page_state 
>> <-free_pcppages_bulk
> and it continues cleaning up memory.
>
>>... # more of this like it was doing before
>>I don't see it unlocking the problematic mutex before the trace ends.
>>
>>The relevant part for 001 starts while it's running the can1 ISR.
>>     irq/18-can1-7228  [001] ....1.1  6854.629275: netif_rx: dev=can1 
>> skbaddr=ffff880412d8fc00 len=16
>>     irq/18-can1-7228  [001] ......1  6854.629275: migrate_disable 
>> <-netif_rx_internal
>>     irq/18-can1-7228  [001] ......2  6854.629275: enqueue_to_backlog 
>> <-netif_rx_internal
>
> enqueue_to_backlog() looks like packet reception so this should be
> handled in napi so I assume we run in NET_RX softirq

enqueue_to_backlog is the napi function that actually raises the NET_RX softirq.

>>     irq/18-can1-7228  [001] d.....2  6854.629276: _raw_spin_lock 
>> <-enqueue_to_backlog
>>     irq/18-can1-7228  [001] d...1.2  6854.629276: __raise_softirq_irqoff 
>> <-enqueue_to_backlog
>>     irq/18-can1-7228  [001] d...1.2  6854.629276: do_raise_softirq_irqoff 
>> <-__raise_softirq_irqoff
>>     irq/18-can1-7228  [001] d...2.2  6854.629276: softirq_raise: vec=3 
>> [action=NET_RX]
This is where the softirq gets raised.

>>... # continues handling the can1 interrupt
>>     irq/18-can1-7228  [001] ......6  6854.629286: rt_spin_lock 
>> <-get_page_from_freelist
>>     irq/18-can1-7228  [001] ......6  6854.629287: rt_spin_lock_slowlock 
>> <-get_page_from_freelist
>>     irq/18-can1-7228  [001] ......6  6854.629287: _raw_spin_lock 
>> <-rt_spin_lock_slowlock
>>     irq/18-can1-7228  [001] ....1.6  6854.629287: __try_to_take_rt_mutex 
>> <-rt_spin_lock_slowlock
>>     irq/18-can1-7228  [001] ....1.6  6854.629287: _raw_spin_lock_irq 
>> <-rt_spin_lock_slowlock
>>     irq/18-can1-7228  [001] d...2.6  6854.629288: _raw_spin_unlock_irq 
>> <-rt_spin_lock_slowlock
>>     irq/18-can1-7228  [001] ....1.6  6854.629288: task_blocks_on_rt_mutex 
>> <-rt_spin_lock_slowlock
>
> it might be zone->lock it goes after. It boosts the bash process which
> seems to free memory so it would make sense.
>
>>... # rt_mutex/scheduling stuff
>>     irq/18-can1-7228  [001] d...4.6  6854.629291: sched_pi_setprio: 
>> comm=bash pid=14721 oldprio=120 newprio=28
>>... # more scheduler stuff
>>     irq/18-can1-7228  [001] d...3.6  6854.629299: native_smp_send_reschedule 
>> <-rt_mutex_setprio
>>... # more scheduler stuff
>>     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_fair 
>> <-__schedule
>>     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_stop 
>> <-__schedule
>>     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_dl 
>> <-__schedule
>>     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_rt 
>> <-__schedule
>>     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_fair 
>> <-__schedule
>>     irq/18-can1-7228  [001] d...2.6  6854.629308: pick_next_task_idle 
>> <-__schedule
>>     irq/18-can1-7228  [001] d...3.6  6854.629308: sched_switch: 
>> prev_comm=irq/18-can1 prev_pid=7228 prev_prio=28 prev_state=D ==>
>>next_comm=swapper/1 next_pid=0 next_prio=120
>>...
>>          <idle>-0     [001] d...1..  6854.629319: softirq_check_pending_idle 
>> <-tick_nohz_idle_enter
>>My tracing_off() call is in softirq_check_pending_idle, so that's it.
>
> It looks like your softirq for net_rx is getting a packet and then after
> raising NET_RX (again?) it blocks on a lock. In order to get this lock
> it boosts and schedules bash. It gets runable but on the other CPU. On
> CPU1 there is nothig going is nothing going and the only runable task is
> the idle thread. And this is probably where the warning is written
> because we go to idle while we should process a softirq instead.

That sounds like the issue. Doing the softirq instead of going idle in
this situation seems like it means calling thread_do_softirq() from
__schedule, but I don't know where the right place is. Can anybody
give me some help on where exactly to check for softirqs from?

Thanks,
Brian
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to