On 4/18/23 02:29, Crystal Wood wrote: > On Tue, 2023-04-11 at 11:09 -0400, Sean Anderson wrote: >> Hi Crystal, >> >> On 4/4/23 12:04, Sean Anderson wrote: >> > On 4/4/23 11:33, Crystal Wood wrote: >> > > On Tue, 2023-04-04 at 10:55 -0400, Sean Anderson wrote: >> > > >> > > > @@ -1456,11 +1456,11 @@ static void tqm_congestion_task(struct >> > > > work_struct >> > > > *work) >> > > > union qm_mc_result *mcr; >> > > > struct qman_cgr *cgr; >> > > > >> > > > - spin_lock_irq(&p->cgr_lock); >> > > > + raw_spin_lock_irq(&p->cgr_lock); >> > > > qm_mc_start(&p->p); >> > > > qm_mc_commit(&p->p, QM_MCC_VERB_QUERYCONGESTION); >> > > > if (!qm_mc_result_timeout(&p->p, &mcr)) { >> > > > - spin_unlock_irq(&p->cgr_lock); >> > > > + raw_spin_unlock_irq(&p->cgr_lock); >> > > >> > > qm_mc_result_timeout() spins with a timeout of 10 ms which is very >> > > inappropriate for a raw lock. What is the actual expected upper bound? >> > >> > Hm, maybe we can move this qm_mc stuff outside cgr_lock? In most other >> > places they're called without cgr_lock, which implies that its usage >> > here is meant to synchronize against some other function. >> >> Do you have any suggestions here? I think this should really be handled >> in a follow-up patch. If you think this code is waiting too long in a raw >> spinlock, the existing code can wait just as long with IRQs disabled. >> This patch doesn't change existing system responsiveness. > > Well, AFAICT it expands the situations in which it happens from configuration > codepaths to stuff like congestion handling. The proper fix would probably be > to use some mechanism other than smp_call_function_single() to run code on the > target cpu so that it can run with irqs enabled (or get confirmation that the > actual worst case is short enough),
Well, we used to use a kthread/wait_for_completion before 96f413f47677 ("soc/fsl/qbman: fix issue in qman_delete_cgr_safe()"). The commit description there isn't very enlightening as to the actual bug, which is why I CC's Madalin earlier. To be honest, I'm not really familiar with the other options in the kernel. > but barring that I guess at least acknowledge the situation in a > comment? Fine by me. --Sean