Hi,
I finally got a chance to make progress on redesigning the rdma cgroup
controller for most of the use cases that we discussed in this email
chain.
I am posting an RFC, and soon the code, in a new email.
Parav
On Sun, Sep 20, 2015 at 4:05 PM, Haggai Eran wrote:
> On 15/09/2015 06:45, Jason Gunthorpe wrote:
>>
On 15/09/2015 06:45, Jason Gunthorpe wrote:
> No, I'm saying the resource pool is *well defined* and *fixed* by each
> hardware.
>
> The only question is how do we expose the N resource limits, the list
> of which is totally vendor specific.
I don't see why you say the limits are vendor specific.
Hi Jason, Sean, Tejun,
I am in the process of defining a new approach and design for the new
RDMA cgroup, based on the feedback given here by all of you.
I have also collected feedback from Liran yesterday and ORNL folks too.
Soon I will post the new approach, high level APIs and functionality
for review bef
On Tue, Sep 15, 2015 at 08:38:54AM +0530, Parav Pandit wrote:
> As you precisely described, about wild ratio,
> we are asking vendor driver (bottom most layer) to statically define
> what the resource pool is, without telling him which application are
> we going to run to use those pool.
> Therefo
> Because actual hardware resources *ARE* the limit. We cannot abstract
> it away. The hardware/driver has real, fixed, immutable limits. No API
> abstraction can possibly change that.
>
> The limits are such there *IS NO* API boundary that can bundle them
> into something simpler. There will alway
On Tue, Sep 15, 2015 at 12:24:41AM +0530, Parav Pandit wrote:
> On Mon, Sep 14, 2015 at 10:58 PM, Jason Gunthorpe
> wrote:
> > On Mon, Sep 14, 2015 at 04:39:33PM +0530, Parav Pandit wrote:
> >
> >> 1. How does the % of resource, is different than absolute number? With
> >> rest of the cgroups syst
On Mon, Sep 14, 2015 at 10:58 PM, Jason Gunthorpe
wrote:
> On Mon, Sep 14, 2015 at 04:39:33PM +0530, Parav Pandit wrote:
>
>> 1. How does the % of resource, is different than absolute number? With
>> rest of the cgroups systems we define absolute number at most places
>> to my knowledge.
>
> There
On Mon, Sep 14, 2015 at 04:39:33PM +0530, Parav Pandit wrote:
> 1. How does the % of resource, is different than absolute number? With
> rest of the cgroups systems we define absolute number at most places
> to my knowledge.
There isn't really much choice if the abstraction is a bundle of all
res
Hello, Parav.
On Mon, Sep 14, 2015 at 07:34:09PM +0530, Parav Pandit wrote:
> I missed to acknowledge your point that we need both - hard limit and
> soft limit/weight. Current patchset is only based on hard limit.
> I see that weight would be another helpful layer in chain that we can
> implement
Hi Tejun,
I failed to acknowledge your point that we need both a hard limit and
a soft limit/weight. The current patchset is based only on the hard limit.
I think weight would be another helpful layer in the chain. Could we
implement it after this as an incremental change, so that review and
debugging stay manageable?
Parav
On Sat, Sep 12, 2015 at 1:36 AM, Hefty, Sean wrote:
>> > Trying to limit the number of QPs that an app can allocate,
>> > therefore, just limits how much of the address space an app can use.
>> > There's no clear link between QP limits and HW resource limits,
>> > unless you assume a very specific
On Sat, Sep 12, 2015 at 12:55 AM, Tejun Heo wrote:
> Hello, Parav.
>
> On Fri, Sep 11, 2015 at 10:09:48PM +0530, Parav Pandit wrote:
>> > If you're planning on following what the existing memcg did in this
>> > area, it's unlikely to go well. Would you mind sharing what you have
>> > on mind in t
On Sat, Sep 12, 2015 at 12:52 AM, Hefty, Sean wrote:
>> So, the existence of resource limitations is fine. That's what we
>> deal with all the time. The problem usually with this sort of
>> interfaces which expose implementation details to users directly is
>> that it severely limits engineering
> > Trying to limit the number of QPs that an app can allocate,
> > therefore, just limits how much of the address space an app can use.
> > There's no clear link between QP limits and HW resource limits,
> > unless you assume a very specific underlying implementation.
>
> Isn't that the point tho
On Fri, Sep 11, 2015 at 07:22:56PM +, Hefty, Sean wrote:
> Trying to limit the number of QPs that an app can allocate,
> therefore, just limits how much of the address space an app can use.
> There's no clear link between QP limits and HW resource limits,
> unless you assume a very specific u
Hello, Parav.
On Fri, Sep 11, 2015 at 10:09:48PM +0530, Parav Pandit wrote:
> > If you're planning on following what the existing memcg did in this
> > area, it's unlikely to go well. Would you mind sharing what you have
> > on mind in the long term? Where do you see this going?
>
> At least cur
> So, the existence of resource limitations is fine. That's what we
> deal with all the time. The problem usually with this sort of
> interfaces which expose implementation details to users directly is
> that it severely limits engineering maneuvering space. You usually
> want your users to expr
Hello, Parav.
On Fri, Sep 11, 2015 at 10:17:42PM +0530, Parav Pandit wrote:
> IO controller and applications are mature in nature.
> When IO controller throttles the IO, applications are pretty mature
> where if IO takes longer to complete, there is possibly almost no way
> to cancel the system ca
> cpuset is a special case but think of cpu, memory or io controllers.
> Their resource distribution schemes are a lot more developed than
> what's proposed in this patchset and that's a necessity because nobody
> wants to cripple their machines for resource control.
IO controller and applications
On Fri, Sep 11, 2015 at 10:04 PM, Tejun Heo wrote:
> Hello, Parav.
>
> On Fri, Sep 11, 2015 at 09:56:31PM +0530, Parav Pandit wrote:
>> Resource run away by application can lead to (a) kernel and (b) other
>> applications left out with no resources situation.
>
> Yeap, that this controller would b
Hello, Parav.
On Fri, Sep 11, 2015 at 09:56:31PM +0530, Parav Pandit wrote:
> Resource run away by application can lead to (a) kernel and (b) other
> applications left out with no resources situation.
Yeap, that this controller would be able to prevent to a reasonable
extent.
> Both the problems
> If the resource isn't and the main goal is preventing runaway
> hogs, it'll be able to do that but is that the goal here? For this to
> be actually useful for performance contended cases, it'd need higher
> level abstractions.
>
Resource run away by application can lead to (a) kernel and (b) ot
Hello, Parav.
On Fri, Sep 11, 2015 at 10:13:59AM +0530, Parav Pandit wrote:
> > My uneducated suspicion is that the abstraction is just not developed
> > enough. It should be possible to virtualize these resources through,
> > most likely, time-sharing to the level where userland simply says "I
>
Hello, Doug.
On Fri, Sep 11, 2015 at 12:24:33AM -0400, Doug Ledford wrote:
> > My uneducated suspicion is that the abstraction is just not developed
> > enough.
>
> The abstraction is 10+ years old. It has had plenty of time to ferment
> and something better for the specific use case has not eme
On Fri, Sep 11, 2015 at 9:34 AM, Tejun Heo wrote:
> Hello, Parav.
>
> On Fri, Sep 11, 2015 at 09:09:58AM +0530, Parav Pandit wrote:
>> The fact is that user level application uses hardware resources.
>> Verbs layer is software abstraction for it. Drivers are hiding how
>> they implement this QP or
On 09/11/2015 12:04 AM, Tejun Heo wrote:
> Hello, Parav.
>
> On Fri, Sep 11, 2015 at 09:09:58AM +0530, Parav Pandit wrote:
>> The fact is that user level application uses hardware resources.
>> Verbs layer is software abstraction for it. Drivers are hiding how
>> they implement this QP or CQ or wh
Hello, Parav.
On Fri, Sep 11, 2015 at 09:09:58AM +0530, Parav Pandit wrote:
> The fact is that user level application uses hardware resources.
> Verbs layer is software abstraction for it. Drivers are hiding how
> they implement this QP or CQ or whatever hardware resource they
> project via API la
On Fri, Sep 11, 2015 at 1:52 AM, Tejun Heo wrote:
> Hello, Parav.
>
> On Thu, Sep 10, 2015 at 11:16:49PM +0530, Parav Pandit wrote:
>> >> These resources include QP (queue pair) to transfer data, CQ
>> >> (Completion queue) to indicate completion of data transfer operation,
>> >> MR (memory
Hello, Parav.
On Thu, Sep 10, 2015 at 11:16:49PM +0530, Parav Pandit wrote:
> >> These resources include QP (queue pair) to transfer data, CQ
> >> (Completion queue) to indicate completion of data transfer operation,
> >> MR (memory region) to represent user application memory as source or
>
> > In past there has been similar comment to have dedicated cgroup
> > controller for RDMA instead of merging with device cgroup.
> > I am ok with both the approach, however I prefer to utilize device
> > controller instead of spinning of new controller for new devices
> > category.
> > I anticipa
On Thu, Sep 10, 2015 at 10:19 PM, Tejun Heo wrote:
> Hello, Parav.
>
> On Wed, Sep 09, 2015 at 09:27:40AM +0530, Parav Pandit wrote:
>> This is one old white paper, but most of the reasoning still holds true on
>> RDMA.
>> http://h10032.www1.hp.com/ctg/Manual/c00257031.pdf
>
> Just read it. Much
Hello, Parav.
On Wed, Sep 09, 2015 at 09:27:40AM +0530, Parav Pandit wrote:
> This is one old white paper, but most of the reasoning still holds true on
> RDMA.
> http://h10032.www1.hp.com/ctg/Manual/c00257031.pdf
Just read it. Much appreciated.
...
> These resources include QP (queue pa
On Tue, Sep 8, 2015 at 8:53 PM, Tejun Heo wrote:
> Hello, Parav.
>
> On Tue, Sep 08, 2015 at 02:08:16AM +0530, Parav Pandit wrote:
>> Currently user space applications can easily take away all the rdma
>> device specific resources such as AH, CQ, QP, MR etc. Due to which other
>> applications in o
Hello, Parav.
On Tue, Sep 08, 2015 at 02:08:16AM +0530, Parav Pandit wrote:
> Currently user space applications can easily take away all the rdma
> device specific resources such as AH, CQ, QP, MR etc. Due to which other
> applications in other cgroup or kernel space ULPs may not even get chance
>
On 07/09/2015 23:38, Parav Pandit wrote:
> Currently user space applications can easily take away all the rdma
> device specific resources such as AH, CQ, QP, MR etc. Due to which other
> applications in other cgroup or kernel space ULPs may not even get chance
> to allocate any rdma resources.
>
Hi Doug, Tejun,
This is from the cgroups for-4.3 branch.
The linux-rdma trunk will hit a compilation error because it is behind
Tejun's for-4.3 branch.
The patch depends on some of the cgroup subsystem functionality
for fork().
Therefore those changes need to be merged into the linux-rdma trunk first.
Parav
Currently user space applications can easily take away all the rdma
device specific resources such as AH, CQ, QP, MR etc. Due to which other
applications in other cgroup or kernel space ULPs may not even get chance
to allocate any rdma resources.
This patch-set allows limiting rdma resources to se