On Sun, Apr 14, 2013 at 09:10:36PM -0400, Michael R. Hines wrote:
> On 04/14/2013 05:16 PM, Michael S. Tsirkin wrote:
> >On Sun, Apr 14, 2013 at 03:43:28PM -0400, Michael R. Hines wrote:
> >>On 04/14/2013 02:51 PM, Michael S. Tsirkin wrote:
> >>>On Sun, Apr 14, 2013 at 10:31:20AM -0400, Michael R. Hines wrote:
> >>>>On 04/14/2013 04:28 AM, Michael S. Tsirkin wrote:
> >>>>>On Fri, Apr 12, 2013 at 09:47:08AM -0400, Michael R. Hines wrote:
> >>>>>>Second, as I've explained, I strongly, strongly disagree with
> >>>>>>unregistering memory for all of the aforementioned reasons -
> >>>>>>workloads do not operate in such a manner that they can tolerate
> >>>>>>memory to be pulled out from underneath them at such fine-grained
> >>>>>>time scales in the *middle* of a relocation and I will not commit
> >>>>>>to writing a solution for a problem that doesn't exist.
> >>>>>Exactly same thing happens with swap, doesn't it?
> >>>>>You are saying workloads simply can not tolerate swap.
> >>>>>
> >>>>>>If you can prove (through some kind of analysis) that workloads
> >>>>>>would benefit from this kind of fine-grained memory overcommit
> >>>>>>by having cgroups swap out memory to disk underneath them
> >>>>>>without their permission, I would happily reconsider my position.
> >>>>>>
> >>>>>>- Michael
> >>>>>This has nothing to do with cgroups directly, it's just a way to
> >>>>>demonstrate you have a bug.
> >>>>>
> >>>>If your datacenter or your cloud or your product does not want to
> >>>>tolerate page registration, then don't use RDMA!
> >>>>
> >>>>The bottom line is: RDMA is useless without page registration. Without
> >>>>it, the performance of it will be crippled. If you define that as a bug,
> >>>>then so be it.
> >>>>
> >>>>- Michael
> >>>No one cares if you do page registration or not. ulimit -l 10g is the
> >>>problem. You should limit the amount of locked memory.
> >>>Lots of good research went into making RDMA go fast with limited locked
> >>>memory, with some success. Search for "registration cache" for example.
> >>>
> >>Patches using such a cache would be welcome.
> >>
> >>- Michael
> >>
> >And when someone writes them one day, we'll have to carry the old code
> >around for interoperability as well. Not pretty. To avoid that, you
> >need to explicitly say in the documentation that it's experimental and
> >unsupported.
> >
>
> That's what protocols are for.
>
> As I've already said, I've incorporated this into the design of the protocol
> already.
>
> The protocol already has a field called "repeat" which allows a user to
> request multiple chunk registrations at the same time.
> If you insist, I can add a capability / command to the protocol
> called "unregister chunk",
> but I'm not volunteering to implement that command as I don't have any data
> showing it to be of any value.

The value would be being able to run your code in qemu as an unprivileged user.

> That would insulate the protocol against any such future
> "registration cache" design.
>
> - Michael
>

It won't. If it's unimplemented it won't be of any use since
now your code does not implement the protocol fully.

-- 
MST