On 7/8/20 2:37 PM, Eric Dumazet wrote:
> 
> 
> On 7/8/20 9:38 AM, YU, Xiangning wrote:
>> Lockless Token Bucket (LTB) is a qdisc implementation that controls the
>> use of outbound bandwidth on a shared link. With the help of lockless
>> qdisc, and by decoupling rate limiting and bandwidth sharing, LTB is
>> designed to scale in the cloud data centers.
>>
> 
> ...
> 
> This ltb_class struct has a size of 1579584 bytes :/
> 
>> +struct ltb_class {
>> +    struct Qdisc_class_common common;
>> +    struct psched_ratecfg ratecfg;
>> +    struct psched_ratecfg ceilcfg;
>> +    u32 prio;
>> +    struct ltb_class *parent;
>> +    struct Qdisc *qdisc;
>> +    struct Qdisc *root_qdisc;
>> +    u32 classid;
>> +    struct list_head pnode;
>> +    unsigned long state; ____cacheline_aligned_in_smp
>> +
>> +    /* Aggr/drain context only */
>> +    s64 next_timestamp; ____cacheline_aligned_in_smp
>> +    int num_cpus;
>> +    int last_cpu;
>> +    s64 bw_used;
>> +    s64 last_bytes;
>> +    s64 last_timestamp;
>> +    s64 stat_bytes;
>> +    s64 stat_packets;
>> +    atomic64_t stat_drops;
>> +
>> +    /* Balance delayed work only */
>> +    s64 rate; ____cacheline_aligned_in_smp
>> +    s64 ceil;
>> +    s64 high_water;
>> +    int drop_delay;
>> +    s64 bw_allocated;
>> +    bool want_more;
>> +
>> +    /* Shared b/w aggr/drain thread and balancer */
>> +    unsigned long curr_interval; ____cacheline_aligned_in_smp
>> +    s64 bw_measured;        /* Measured actual bandwidth */
>> +    s64 maxbw;      /* Calculated bandwidth */
>> +
>> +    STRUCT_KFIFO(struct sk_buff *, SKB_QLEN) aggr_queues[MAX_CPU_COUNT];
>> +    ____cacheline_aligned_in_smp
>> +    STRUCT_KFIFO(struct sk_buff *, SKB_QLEN * MAX_CPU_COUNT) drain_queue;
>> +    ____cacheline_aligned_in_smp
>> +    STRUCT_KFIFO(struct sk_buff *, SKB_QLEN) fanout_queues[MAX_CPU_COUNT];
>> +    ____cacheline_aligned_in_smp
>> +
>> +    struct tasklet_struct aggr_tasklet;
>> +    struct hrtimer aggr_timer;
>> +};
>> +
>>
> 
>> +
>> +static struct ltb_class *ltb_alloc_class(struct Qdisc *sch,
>> +                                     struct ltb_class *parent, u32 classid,
>> +                                     struct psched_ratecfg *ratecfg,
>> +                                     struct psched_ratecfg *ceilcfg,
>> +                                     u32 prio)
>> +{
>> +    struct ltb_sched *ltb  = qdisc_priv(sch);
>> +    struct ltb_class *cl;
>> +    int i;
>> +
>> +    if (ratecfg->rate_bytes_ps > ceilcfg->rate_bytes_ps ||
>> +        prio < 0 || prio >= TC_LTB_NUMPRIO)
>> +            return NULL;
>> +
>> +    cl = kzalloc(sizeof(*cl), GFP_KERNEL);
> 
> This is going to fail, 2MB chunks of physically contiguous memory is 
> unreasonable.
> 
> 2MB per class makes this qdisc very particular, especially with 1000 classes ?
> 
> In comparison, HTB class consumes less than 1 KB
> 

The main memory consumption comes from the kfifo queues. We use far more less 
classes than 1000 so we didn't really care that.

If supporting 1000 classes is a goal, we should be able to aggressively reduce 
the queue length. Currently it is set to 512 per-CPU which is a waste. Also we 
can dynamically allocate the kfifo queues according to CPU numbers.

Thanks,
- Xiangning

Reply via email to