On Wed, Dec 13, 2017 at 04:36:53PM +0000, Patrick Bellasi wrote:
> On 13-Dec 17:19, Peter Zijlstra wrote:
> > On Tue, Dec 05, 2017 at 05:10:16PM +0000, Patrick Bellasi wrote:
> > > @@ -562,6 +577,12 @@ struct task_struct {
> > >
> > >  	const struct sched_class	*sched_class;
> > >  	struct sched_entity		se;
> > > +	/*
> > > +	 * Since we use se.avg.util_avg to update util_est fields,
> > > +	 * this last can benefit from being close to se which
> > > +	 * also defines se.avg as cache aligned.
> > > +	 */
> > > +	struct util_est			util_est;

The thing is, since sched_entity has a member with cacheline alignment,
the whole structure must have cacheline alignment, and this util_est
_will_ start on a new line.

See also:

$ pahole -EC task_struct defconfig/kernel/sched/core.o

...
		struct sched_avg {
			/* typedef u64 */ long long unsigned int last_update_time; /*   576     8 */
			/* typedef u64 */ long long unsigned int load_sum;         /*   584     8 */
			/* typedef u32 */ unsigned int util_sum;                   /*   592     4 */
			/* typedef u32 */ unsigned int period_contrib;             /*   596     4 */
			long unsigned int load_avg;                                /*   600     8 */
			long unsigned int util_avg;                                /*   608     8 */
		} avg;                                                             /*   576    40 */
		/* --- cacheline 6 boundary (384 bytes) --- */
	} se;                                                                      /*   192   448 */
	/* --- cacheline 8 boundary (512 bytes) was 24 bytes ago --- */
	struct util_est {
		long unsigned int last;                                            /*   640     8 */
		long unsigned int ewma;                                            /*   648     8 */
	} util_est;                                                                /*   640    16 */
...

The thing is somewhat confused on which cacheline is which, but you'll
see sched_avg landing at 576 (cacheline #9) and util_est at 640 (line
#10).

> > >  	struct sched_rt_entity		rt;
> > >  #ifdef CONFIG_CGROUP_SCHED
> > >  	struct task_group		*sched_task_group;

> One goal was to keep util_est variables close to the util_avg used to
> load the filter, for cache affinity's sake.
>
> The other goal was to have util_est data only for Tasks and CPU's
> RQ, thus avoiding unused data for TG's RQ and SE.
>
> Unfortunately the first goal does not allow to achieve completely the
> second and, you're right, the solution looks a bit inconsistent.
>
> Do you think we should better disregard cache proximity and move
> util_est_runnable to rq?

Proximity is likely important; I'd suggest moving util_est into
sched_entity.