On Mon, 13 Nov 2017, Sagi Grimberg wrote:
> > 3) Affinity override in managed mode
> >
> >    Doable, but there are a couple of things to think about:
>
> I think that it will be good to shoot for (3). Given that there are
> driver requirements I'd say that driver will expose up front if it can
> handle it, and if not we fall back to (1).
>
> >  * How is this enabled?
> >
> >    - Opt-in by driver
> >
> >    - Extra sysfs/procfs knob
> >
> >    We definitely should not enable it per default because that would
> >    surprise users/drivers which work with the current managed devices and
> >    rely on the affinity files to be non writeable in managed mode.
>
> Do you know if any exist? Would it make sense to have a survey to
> understand if anyone relies on it?
>
> From what I've seen so far, drivers that were converted simply worked
> with the non-managed facility and didn't have any special code for it.
> Perhaps Christoph can comment as he converted most of them.
>
> But if there aren't any drivers that absolutely rely on it, maybe it's
> not a bad idea to allow it by default?

Sure, I was just cautious and I have to admit that I have no insight into
the driver side details.

> >  * When and how is the driver informed about the change?
> >
> >    When:
> >
> >      #1 Before the core tries to move the interrupt so it can veto the
> >         move if it cannot allocate new resources or whatever is required
> >         to operate after the move.
>
> What would the core do if a driver vetoes a move?

Return the error code from write_affinity() as it does with any other error
which fails to set the affinity.

> I'm wondering under what conditions a driver will be unable to allocate
> resources for move to cpu X but able to allocate for move to cpu Y.

Node affine memory allocation is the only thing which comes to my mind, or
some decision not to have a gazillion of queues on a single CPU.

> This looks like it can work to me, but I'm probably not familiar enough
> to see the full picture here.

On the interrupt core side this is workable, I just need the input from the
driver^Wsubsystem side if this can be implemented sanely.

Thanks,

	tglx
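
To make the veto flow above concrete, here is a minimal userspace sketch. It
is not kernel code and not a proposed interface: every name in it (sim_irq,
sim_drv, sim_write_affinity(), sim_drv_pre_move(), SIM_NR_CPUS) is
hypothetical, and the error codes are assumptions of this model. It only
shows the ordering discussed: the driver opts in by supplying a callback, the
core calls it before the move, and a veto is handed back to the writer as an
ordinary error while the affinity stays untouched.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model only; none of these names exist in the kernel. */
#define SIM_NR_CPUS 4

struct sim_irq;

/* Pre-move callback: return 0 to accept, a negative errno to veto. */
typedef int (*sim_pre_move_fn)(struct sim_irq *irq, int new_cpu);

struct sim_irq {
	int cpu;                  /* CPU the interrupt currently targets */
	int managed;              /* non-zero: affinity is managed */
	sim_pre_move_fn pre_move; /* NULL: driver did not opt in */
	void *driver_data;        /* opaque pointer for the driver */
};

/* Example driver state: a cap on the number of queues per CPU. */
struct sim_drv {
	int max_queues_per_cpu;
	int queues_on_cpu[SIM_NR_CPUS];
};

/* Models the affinity write path: ask the driver first, then move. */
int sim_write_affinity(struct sim_irq *irq, int new_cpu)
{
	assert(irq != NULL);

	if (new_cpu < 0 || new_cpu >= SIM_NR_CPUS)
		return -EINVAL;

	if (irq->managed) {
		int ret;

		/* No opt-in: managed affinity stays non-writeable.
		 * -EIO is this model's choice of error code. */
		if (!irq->pre_move)
			return -EIO;

		/* Step #1: the driver may veto before anything moves. */
		ret = irq->pre_move(irq, new_cpu);
		if (ret)
			return ret; /* propagated like any other failure */
	}

	irq->cpu = new_cpu;
	return 0;
}

/* Example policy: refuse to pile too many queues onto one CPU.
 * The model's move cannot fail after this point, so the bookkeeping
 * is updated here without a rollback path. */
int sim_drv_pre_move(struct sim_irq *irq, int new_cpu)
{
	struct sim_drv *drv = irq->driver_data;

	if (new_cpu == irq->cpu)
		return 0;
	if (drv->queues_on_cpu[new_cpu] >= drv->max_queues_per_cpu)
		return -ENOSPC;

	drv->queues_on_cpu[irq->cpu]--;
	drv->queues_on_cpu[new_cpu]++;
	return 0;
}
```

With a cap of one queue per CPU, a write that targets an already occupied
CPU fails with the driver's error and the interrupt stays where it was, while
a write to a free CPU succeeds. A real implementation would also have to
cover a failure of the move itself after the driver has accepted it, which
this sketch leaves out.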