On Fri, Jul 20, 2012 at 4:38 AM, Eoghan Glynn <egl...@redhat.com> wrote:
> Hi Narayan,
>
> I had the idea previously of applying a "weighting function" to the
> resource usage being allocated from the quota, as opposed to simply
> counting raw instances.
>
> The notion I had in mind was more related to image usage in glance,
> where the image "footprint" can vary very widely. However I think it
> could be useful for some nova resources also.
>
> Now for some resource types, for example say volumes, usage can be
> controlled along multiple axes (i.e. number of volumes and total size),
> so that gives more flexibility.
>
> But if I'm hearing you correctly, you'd want to apply a lower weighting
> to instances that are scheduled onto one of the higher-memory compute
> nodes, and vice versa a higher weighting to instances that happen to
> be run on lower-memory nodes.
>
> Does that sum it up, or have I misunderstood?

I think you've got it. I hadn't really asked with a particular solution in mind; I was mainly looking for ideas. I think weighting would help. Effectively we need to discount memory usage on the bigmem nodes, or something like that.

The harder part is that we need to be able to specify independent/orthogonal quota constraints on different flavors. It would be really useful to be able to say, basically, "you can have 2TB of memory from this flavor, and 4TB of memory from that flavor." That would allow saying something like "you can have up to 3 1TB instances, and independently have up to 3TB of small instances as well."

> BTW what kind of nova-scheduler config are you using?

We're using the filter scheduler. We've defined a bunch of custom flavors, in addition to the stock ones, that allow us to fill up all of our node types. So for each node type, we define flavors for the complete node (minus a GB of memory for the hypervisor), plus 3/4, 1/2, 1/4, 1/8, 1/16, and 1/32 of the node. We've used a machine-type prefix for each one. The compute nodes are IBM iDataPlex, so we have idp.{100,75,50,25,12,6,3}.
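To make the weighting-plus-orthogonal-quotas idea concrete, here's a minimal sketch in Python. The flavor prefixes, weights, and limits are all invented for illustration; nothing like this exists in nova's quota code today, it's just the accounting rule being discussed:

```python
# Sketch of per-flavor weighted quota accounting. Flavor names,
# weights, and limits below are invented for illustration; this is
# not nova code.

# Per-GB weight by machine-type prefix: memory on bigmem ("mem")
# nodes is discounted so large-memory instances don't exhaust quota.
FLAVOR_WEIGHTS = {"idp": 1.0, "mem": 0.5}

# Independent ("orthogonal") memory limits per machine type, in GB.
FLAVOR_LIMITS_GB = {"idp": 3072, "mem": 4096}

def usage_by_prefix(instances):
    """Sum weighted memory per machine-type prefix.

    instances: iterable of (flavor_name, memory_gb) pairs,
    e.g. [("idp.50", 32), ("mem.100", 1024)].
    """
    totals = {}
    for flavor, memory_gb in instances:
        prefix = flavor.split(".")[0]
        weight = FLAVOR_WEIGHTS.get(prefix, 1.0)
        totals[prefix] = totals.get(prefix, 0.0) + weight * memory_gb
    return totals

def over_quota(instances):
    """True if any machine type exceeds its own independent limit."""
    return any(used > FLAVOR_LIMITS_GB.get(prefix, float("inf"))
               for prefix, used in usage_by_prefix(instances).items())
```

With a 0.5 weight, two 1024 GB bigmem instances only count as 1024 GB of quota, and because the totals are kept per prefix, filling the idp limit has no effect on how much mem.* capacity is still available.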
We've done this for each machine type, so we have idp.*, mem.*, gpu.*, etc. Each machine type has a unique hostname prefix (cc for the idp nodes, cm for the bigmem nodes, cg for the gpu nodes, etc.), and the filter scheduler is set up to route requests for these custom flavors only to nodes with the appropriate hostname prefix.

This isn't an ideal solution, but it minimizes the risk of fragmentation. (With the default flavors, we'd see a lot of cases where idle capacity was left on the nodes that wasn't usable, because the ratio was wrong for the default flavors.)

So far, this scheduling scheme has worked pretty well, aside from leaving some instances in a weird state when you try to start a bunch (20-50) at a time. I haven't had time to track that down yet.

-nld

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp
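The prefix-routing scheme described above can be sketched as a simple host-passes predicate. This is a hedged stand-in, not the actual deployed filter: a real nova filter would subclass the filter scheduler's host filter base class, and the flavor-to-hostname mapping here is reconstructed from the prefixes mentioned in the email:

```python
# Illustrative sketch of routing custom flavors to matching node types.
# The mapping and function are assumptions for illustration, not the
# filter actually deployed; a real nova filter would implement
# host_passes() on a scheduler host-filter class.

# Machine-type flavor prefix -> hostname prefix, per the email:
# idp (iDataPlex) -> cc, bigmem -> cm, gpu -> cg.
FLAVOR_TO_HOST_PREFIX = {
    "idp": "cc",
    "mem": "cm",
    "gpu": "cg",
}

def host_passes(hostname, flavor_name):
    """Accept a host only if its hostname starts with the prefix
    registered for the flavor's machine type."""
    flavor_prefix = flavor_name.split(".")[0]
    host_prefix = FLAVOR_TO_HOST_PREFIX.get(flavor_prefix)
    if host_prefix is None:
        # Stock flavors (m1.*, etc.) are not pinned to a machine type.
        return True
    return hostname.startswith(host_prefix)
```

So a request for idp.50 would pass only cc* hosts, a request for mem.100 only cm* hosts, and a stock flavor would pass everywhere.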