On 7/8/20 2:37 PM, Eric Dumazet wrote: > > > On 7/8/20 9:38 AM, YU, Xiangning wrote: >> Lockless Token Bucket (LTB) is a qdisc implementation that controls the >> use of outbound bandwidth on a shared link. With the help of lockless >> qdisc, and by decoupling rate limiting and bandwidth sharing, LTB is >> designed to scale in the cloud data centers. >> > > ... > > This ltb_class struct has a size of 1579584 bytes :/ > >> +struct ltb_class { >> + struct Qdisc_class_common common; >> + struct psched_ratecfg ratecfg; >> + struct psched_ratecfg ceilcfg; >> + u32 prio; >> + struct ltb_class *parent; >> + struct Qdisc *qdisc; >> + struct Qdisc *root_qdisc; >> + u32 classid; >> + struct list_head pnode; >> + unsigned long state; ____cacheline_aligned_in_smp >> + >> + /* Aggr/drain context only */ >> + s64 next_timestamp; ____cacheline_aligned_in_smp >> + int num_cpus; >> + int last_cpu; >> + s64 bw_used; >> + s64 last_bytes; >> + s64 last_timestamp; >> + s64 stat_bytes; >> + s64 stat_packets; >> + atomic64_t stat_drops; >> + >> + /* Balance delayed work only */ >> + s64 rate; ____cacheline_aligned_in_smp >> + s64 ceil; >> + s64 high_water; >> + int drop_delay; >> + s64 bw_allocated; >> + bool want_more; >> + >> + /* Shared b/w aggr/drain thread and balancer */ >> + unsigned long curr_interval; ____cacheline_aligned_in_smp >> + s64 bw_measured; /* Measured actual bandwidth */ >> + s64 maxbw; /* Calculated bandwidth */ >> + >> + STRUCT_KFIFO(struct sk_buff *, SKB_QLEN) aggr_queues[MAX_CPU_COUNT]; >> + ____cacheline_aligned_in_smp >> + STRUCT_KFIFO(struct sk_buff *, SKB_QLEN * MAX_CPU_COUNT) drain_queue; >> + ____cacheline_aligned_in_smp >> + STRUCT_KFIFO(struct sk_buff *, SKB_QLEN) fanout_queues[MAX_CPU_COUNT]; >> + ____cacheline_aligned_in_smp >> + >> + struct tasklet_struct aggr_tasklet; >> + struct hrtimer aggr_timer; >> +}; >> + >> > >> + >> +static struct ltb_class *ltb_alloc_class(struct Qdisc *sch, >> + struct ltb_class *parent, u32 classid, >> + struct psched_ratecfg *ratecfg, >> + struct psched_ratecfg *ceilcfg, >> + u32 prio) >> +{ >> + struct ltb_sched *ltb = qdisc_priv(sch); >> + struct ltb_class *cl; >> + int i; >> + >> + if (ratecfg->rate_bytes_ps > ceilcfg->rate_bytes_ps || >> + prio < 0 || prio >= TC_LTB_NUMPRIO) >> + return NULL; >> + >> + cl = kzalloc(sizeof(*cl), GFP_KERNEL); > > This is going to fail, 2MB chunks of physically contiguous memory is > unreasonable. > > 2MB per class makes this qdisc very particular, especially with 1000 classes ? > > In comparison, HTB class consumes less than 1 KB >
The main memory consumption comes from the kfifo queues. We use far more less classes than 1000 so we didn't really care that. If supporting 1000 classes is a goal, we should be able to aggressively reduce the queue length. Currently it is set to 512 per-CPU which is a waste. Also we can dynamically allocate the kfifo queues according to CPU numbers. Thanks, - Xiangning