On Fri, Jan 13, 2017 at 12:01:03AM -0800, David Carrillo-Cisneros wrote: > On Thu, Jan 12, 2017 at 4:14 AM, Mark Rutland <mark.rutl...@arm.com> wrote: > > On Tue, Jan 10, 2017 at 12:51:58PM -0800, David Carrillo-Cisneros wrote: > >> On Tue, Jan 10, 2017 at 8:38 AM, Mark Rutland <mark.rutl...@arm.com> wrote:
> >> > That's a fair point. Sorting by CPU before runtime we'll get subtrees we > >> > won't get fairness unless we sort the events solely by runtime at > >> > sched_in time. If we sort by with runtime before CPU we'll have to skip > >> > events not targeting the current CPU when scheduling task events in. I > >> > note the latter is true today anyhow. > >> > >> That's were ctx->inactive_groups comes in. That list is sorted by runtime > >> and the rb-tree is used to skip to the part of the list that has the events > >> that matter. > > > > Is the list only sorted by runtime and not {cpu,runtime}? If it's the > > latter, I'm not sure I follow. If it's the former, sorry for the noise! > > The former. List only sorted by runtime. Ah, sorry. I had missed that. > > The case I'm worried about is a set of task-bound events that have CPU > > filters. For example, if the user opens a set of task-bound events for > > any CPU: > > > > perf_event_open(attr1, pid, -1, -1, 0); > > perf_event_open(attr2, pid, -1, -1, 0); > > > > ... and also some for the same task, but limited to a specific CPU: > > > > perf_event_open(attr3, pid, 1, -1, 0); > > perf_event_open(attr4, pid, 1, -1, 0); > > > > ... if CPU is before runtime in the sort, one of these groups will > > always be considered first when scheduling, and may starve the other > > group. > > Yes, that case is the reason to have the sorted inactive_list and > the tree. I tried to explain this in the change log of this patch. Please > see new attempt below. That's mostly a reading comprehension failure on my behalf, the commit log does accurately describe this. It might be a little clearer if we say the inactive list is sorted *solely* by timestamp, but nothing more than that should be necessary. > >> > In Peter's original suggestion we didn't sort by cgroup. IIRC there was > >> > some email thread where the cgroup was considered for the sort (maybe > >> > that was *only* for cpu contexts? I'm not too familiar with cgroups), > >> > though I can't find the relevant mail, if it existed. :/ > >> > >> FWIW, in this approach, we only sort by cgroup in CPU contexts, since > >> cgroup > >> events are only installed in CPU contexts. > > > > Sure. However, I think a similar issue to the above applies when > > scheduling events where some are bound to a specific cgroup, and others > > are not. > > Yes, it's addressed in the same way. I see that now. Many thanks for the explanation, and apologies for the noise. Thanks, Mark.