In message <4bcf78e5.9020...@linux.vnet.ibm.com> you wrote:
> On 04/21/2010 04:03 PM, Michael Neuling wrote:
> > In message <4bcf029b.1020...@linux.vnet.ibm.com> you wrote:
> >> On 04/21/2010 08:35 AM, Michael Ellerman wrote:
> >>> On Tue, 2010-04-20 at 22:15 -0500, Brian King wrote:
> >>>> On 04/20/2010 09:04 PM, Michael Neuling wrote:
> >>>>> In message <201004210154.o3l1sxar001...@d01av04.pok.ibm.com> you wrote:
> >>>>>>
> >>>>>> Since there is nothing to stop an IPI from occurring to an
> >>>>>> offline CPU, rather than printing a warning to the logs,
> >>>>>> just ignore the IPI. This was seen while stress testing
> >>>>>> SMT enable/disable.
> >>>>>
> >>>>> This seems like a recipe for disaster.  Do we at least need a
> >>>>> WARN_ON_ONCE?
> >>>>
> >>>> Actually we are only seeing it once per offlining of a CPU,
> >>>> and only once in a while.
> >>>>  
> >>>> My guess is that once the CPU is marked offline fewer IPIs
> >>>> get sent to it since it's no longer in the online mask.
> >>>
> >>> Hmm, right. Once it's offline it shouldn't get _any_ IPIs, AFAICS.
> >>>
> >>>> Perhaps we should be disabling IPIs to offline CPUs instead?
> >>>
> >>> You mean not sending them? We do:
> >>>
> >>> void smp_xics_message_pass(int target, int msg)
> >>> {
> >>>         unsigned int i;
> >>>
> >>>         if (target < NR_CPUS) {
> >>>                 smp_xics_do_message(target, msg);
> >>>         } else {
> >>>                 for_each_online_cpu(i) {
> >>>                         if (target == MSG_ALL_BUT_SELF
> >>>                             && i == smp_processor_id())
> >>>                                 continue;
> >>>                         smp_xics_do_message(i, msg);
> >>>                 }
> >>>         }
> >>> }      
> >>>
> >>> So it does sound like the IPI was sent while the cpu was online (i.e.
> >>> before pseries_cpu_disable()), but xics_migrate_irqs_away() has not
> >>> caused the IPI to be cancelled.
> >>>
> >>> Problem is I don't think we can just ignore the IPI. The IPI might have
> >>> been sent for an smp_call_function() which is waiting for the result, in
> >>> which case if we ignore it the caller will block forever.
> >>>
> >>> I don't see how to fix it :/
> >>
> >> Any objections to just removing the warning?
> > 
> > Well someone could be waiting for the result, so it could be a real
> > problem.  
> > 
> > IMHO the warning should stay.
> 
> Looking in arch/powerpc/kernel/smp.c, there are four possible IPIs:
> 
> void smp_message_recv(int msg)
> {
>       switch(msg) {
>       case PPC_MSG_CALL_FUNCTION:
>               generic_smp_call_function_interrupt();
>               break;
>       case PPC_MSG_RESCHEDULE:
>               /* we notice need_resched on exit */
>               break;
>       case PPC_MSG_CALL_FUNC_SINGLE:
>               generic_smp_call_function_single_interrupt();
>               break;
>       case PPC_MSG_DEBUGGER_BREAK:
>               if (crash_ipi_function_ptr) {
>                       crash_ipi_function_ptr(get_irq_regs());
>                       break;
>               }
> #ifdef CONFIG_DEBUGGER
>               debugger_ipi(get_irq_regs());
>               break;
> #endif /* CONFIG_DEBUGGER */
>               /* FALLTHROUGH */
> 
> 
> Both generic_smp_call_function_interrupt and
> generic_smp_call_function_single_interrupt have
> WARN_ON(!cpu_online(cpu)); in them. The debugger IPI appears to
> ignore the IPI if the cpu is offline, which leaves the reschedule
> IPI. This is likely the one I am seeing in testing, since I'm not seeing
> the other WARN_ON's.

I'm not sure what you are suggesting.

If the other handlers already warn when a CPU is offline, surely
we should keep this warning too?  Maybe we need to add one to the
debugger case as well if we want to be consistent.
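
To make the "be consistent" idea concrete, here is a rough userspace
sketch, not a patch. It only borrows the PPC_MSG_* names from the code
quoted above. Everything prefixed mock_ is a made-up stand-in (for
cpu_online(), WARN_ON() and the real handlers), so treat it as an
illustration of the shape: warn for every message type that arrives on
an offline CPU, but still handle the message, since a caller may be
waiting on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Message types, named after the cases in the quoted smp_message_recv(). */
enum ppc_msg {
	PPC_MSG_CALL_FUNCTION,
	PPC_MSG_RESCHEDULE,
	PPC_MSG_CALL_FUNC_SINGLE,
	PPC_MSG_DEBUGGER_BREAK,
};

#define MOCK_NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Hypothetical stand-ins for kernel state; all CPUs start offline. */
static bool mock_cpu_online[MOCK_NR_CPUS];
static int mock_warn_count[MOCK_NR_CPUS];
static int mock_handled[MOCK_NR_CPUS];

/* Stand-in for WARN_ON(): count and report the condition, then return it. */
static bool mock_warn_on(bool cond, int cpu, int msg)
{
	if (cond) {
		mock_warn_count[cpu]++;
		fprintf(stderr, "WARNING: cpu %d got IPI %d while offline\n",
			cpu, msg);
	}
	return cond;
}

/*
 * Model of a receive path that warns uniformly, before dispatching,
 * so the reschedule and debugger cases are covered as well. The
 * message is still handled rather than dropped.
 */
static void mock_message_recv(int cpu, int msg)
{
	assert(cpu >= 0 && cpu < MOCK_NR_CPUS);

	mock_warn_on(!mock_cpu_online[cpu], cpu, msg);

	switch (msg) {
	case PPC_MSG_CALL_FUNCTION:
	case PPC_MSG_RESCHEDULE:
	case PPC_MSG_CALL_FUNC_SINGLE:
	case PPC_MSG_DEBUGGER_BREAK:
		mock_handled[cpu]++;	/* placeholder for the real handler */
		break;
	default:
		break;
	}
}
```

With CPU 0 marked online and CPU 1 left offline, a message delivered to
CPU 0 bumps no warning counter, while one delivered to CPU 1 bumps
mock_warn_count[1] and is still counted in mock_handled[1].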

Mikey