Hello, Vivek.

On Tue, Dec 11, 2012 at 10:02:34AM -0500, Vivek Goyal wrote:
> cfq_group_served() {
>         if (iops_mode(cfqd))
>                 charge = cfqq->slice_dispatch;
>         cfqg->vdisktime += cfq_scale_slice(charge, cfqg);
> }
>
> Isn't it effectively IOPS scheduling. One should get IOPS rate in proportion to
> their weight (as long as they can throw enough traffic at device to keep
> it busy). If not, can you please give more details about your proposal.
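
For reference, the accounting in the quoted snippet can be sketched in userspace roughly as below. This is only an illustration, not the actual cfq code: the struct, the sketch_* function names and the reference weight of 500 are assumptions made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference weight for this sketch; not taken from the thread. */
#define SKETCH_DEFAULT_WEIGHT 500u

struct sketch_group {
	unsigned int weight;	/* relative share; larger means more service */
	uint64_t vdisktime;	/* virtual time charged so far */
};

/* Scale a raw charge inversely to the group's weight. */
static uint64_t sketch_scale_charge(uint64_t charge,
				    const struct sketch_group *g)
{
	assert(g->weight > 0);
	return charge * SKETCH_DEFAULT_WEIGHT / g->weight;
}

/*
 * In IOPS mode the raw charge is the number of requests dispatched
 * during the slice rather than the time the slice took.
 */
static void sketch_group_served(struct sketch_group *g,
				uint64_t nr_dispatched)
{
	g->vdisktime += sketch_scale_charge(nr_dispatched, g);
}
```

With weights 1000 and 500 and 8 requests dispatched by each group, the heavier group is charged 4 and the lighter one 8, so a scheduler which always picks the group with the smallest vdisktime would give them IOs in roughly 2:1, provided both always have IOs ready to dispatch.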
The problem is that we lose a lot of isolation w/o idling between queues or groups. This is because we switch between slices, and while a slice is in progress only IOs belonging to that slice can be issued. i.e. higher priority cfqgs / cfqqs, after dispatching the IOs they have ready, lose their slice immediately. A lower priority slice takes over, and when the higher priority ones get ready again, they have to wait for the lower priority one before submitting new IOs. In many cases, they end up not being able to generate IOs any faster than the ones in lower priority cfqqs / cfqgs.

This is because we switch slices rather than iops. We can make cfq essentially switch iops by implementing very aggressive preemption, but I really don't see much point in that. cfq is way too heavy and ill-suited for high speed non-rot devices, which are becoming more and more consistent in terms of the iops they can handle.

I think we need something better suited for the maturing non-rot devices. They're becoming very different from what cfq was built for, and we really shouldn't be maintaining several rb trees which need full synchronization for each IO. We're doing way too much and it just isn't scalable.

Thanks.

-- 
tejun