On 2020-12-21 16:35, Bjorn Andersson wrote:
On Thu 17 Dec 12:49 CST 2020, Alex Elder wrote:

On 12/17/20 12:21 PM, risha...@codeaurora.org wrote:
> On 2020-12-17 08:12, Alex Elder wrote:
> > On 12/15/20 4:55 PM, Bjorn Andersson wrote:
> > > On Sat 12 Dec 14:48 CST 2020, Rishabh Bhatnagar wrote:
> > >
> > > > Create an unbound high priority workqueue for recovery tasks.
> >
> > I have been looking at a different issue that is caused by
> > crash notification.
> >
> > What happened was that the modem crashed while the AP was
> > in system suspend (or possibly even resuming) state.  And
> > there is no guarantee that the system will have called a
> > driver's ->resume callback when the crash notification is
> > delivered.
> >
> > In my case (in the IPA driver), handling a modem crash
> > cannot be done while the driver is suspended; i.e. the
> > activities in its ->resume callback must be completed
> > before we can recover from the crash.
> >
> > For this reason I might like to change the way the
> > crash notification is handled, but what I'd rather see
> > is to have the work queue not run until user space
> > is unfrozen, which would guarantee that all drivers
> > that have registered for a crash notification will
> > be resumed when the notification arrives.
> >
> > I'm not sure how that interacts with what you are
> > looking for here.  I think the workqueue could still
> > be unbound, but its work would be delayed longer before
> > any notification (and recovery) started.
> >
> >                     -Alex
> >
> >
> In that case, maybe adding a "WQ_FREEZABLE" flag might help?

Yes, exactly.  But how does that affect whatever you were
trying to do with your patch?


I don't see any impact on Rishabh's change in particular. Syntactically
it would just be a matter of adding another flag, and the impact would be
separate from his patch.

In other words, creating a separate work queue to get the long running
work off the system_wq and making sure that it doesn't run during
suspend & resume seems very reasonable to me.
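
For illustration only, a minimal sketch of what that flag combination could
look like in kernel code. This is not taken from the patch under review: the
queue name, variable name, and init/exit functions are hypothetical, and
WQ_HIGHPRI is deliberately left out since its value is still an open question
in this thread.

```c
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/workqueue.h>

/* Hypothetical dedicated queue for long-running recovery work. */
static struct workqueue_struct *recovery_wq_example;

static int __init recovery_wq_example_init(void)
{
	/*
	 * WQ_UNBOUND:   work items are not bound to the CPU that queued them.
	 * WQ_FREEZABLE: the queue takes part in the freeze phase of system
	 *               suspend, so no new work item starts executing until
	 *               the queue is thawed on resume.
	 * max_active of 0 selects the default concurrency limit.
	 */
	recovery_wq_example = alloc_workqueue("recovery_wq_example",
					      WQ_UNBOUND | WQ_FREEZABLE, 0);
	if (!recovery_wq_example)
		return -ENOMEM;

	return 0;
}
module_init(recovery_wq_example_init);

static void __exit recovery_wq_example_exit(void)
{
	/* Drains any pending work before freeing the queue. */
	destroy_workqueue(recovery_wq_example);
}
module_exit(recovery_wq_example_exit);

MODULE_LICENSE("GPL");
```

Recovery work would then be submitted with queue_work() on this queue
instead of schedule_work(), which is what keeps it off system_wq.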

The one piece that I'm still contemplating is the HIPRIO, I would like
to better understand the actual impact - or perhaps is this a result of
everyone downstream moving all their work to HIPRIO work queues,
starving the recovery?

Hi Bjorn,
You are right, this is a result of downstream having HIPRIO workqueues
therefore starving recovery. I don't have actual data to support the flag
as of now. If needed for now we can skip this flag and add it later with
sufficient data?
Regards,
Bjorn
