Try changing the following in nova.conf and restart the nova-scheduler:

  scheduler_host_subset_size = 10
  scheduler_max_attempts = 10

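To illustrate why a larger host subset can help, here is a rough toy model in Python. It is not nova code: schedule_batch, the host names and the memory figures are all made up for the example. It assumes the scheduler works from one stale snapshot of free memory for a whole burst of requests, picks at random among the top subset_size hosts, and that the compute node then accepts or rejects the claim against its real free memory. It does not model the retries that scheduler_max_attempts controls.

```python
import random


def schedule_batch(free_mb, request_mb, n_requests, subset_size, rng):
    """Place n_requests using one stale snapshot of free memory per host.

    Returns the number of requests whose claim fails on the chosen host.
    """
    snapshot = dict(free_mb)  # scheduler's view, never refreshed in this burst
    actual = dict(free_mb)    # what the compute nodes really have left
    failures = 0
    for _ in range(n_requests):
        # Filter step: hosts that appear to have enough memory.
        candidates = [h for h, free in snapshot.items() if free >= request_mb]
        if not candidates:
            failures += 1
            continue
        # Weigh step: most free memory first (the sort is stable on ties).
        candidates.sort(key=lambda h: snapshot[h], reverse=True)
        # Pick at random among the best subset_size hosts.
        host = rng.choice(candidates[:subset_size])
        # Claim step: the compute node checks against its real free memory.
        if actual[host] >= request_mb:
            actual[host] -= request_mb
        else:
            failures += 1
    return failures


# Hypothetical cloud: 20 identical nodes with 16 GB free, ten 8 GB requests.
hosts = {f"node{i}": 16384 for i in range(20)}
for size in (1, 10):
    failed = schedule_batch(hosts, 8192, 10, size, random.Random(0))
    print(f"subset_size={size}: {failed} failed claims out of 10")
```

With subset_size=1 every request in the burst lands on the same top-ranked host, so only the first two fit. With a larger subset the requests are spread over several hosts and fewer claims collide. In a real deployment a failed claim is rescheduled, which is where scheduler_max_attempts comes in.
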
Cheers,
George

On Wed, Nov 30, 2016 at 9:56 AM, Massimo Sgaravatto <massimo.sgarava...@gmail.com> wrote:

> Hi all
>
> I have a problem with scheduling in our Mitaka Cloud,
> Basically when there are a lot of requests for new instances, some of them
> fail because "Failed to compute_task_build_instances: Exceeded maximum
> number of retries". And the failures are because "Insufficient compute
> resources: Free memory 2879.50 MB < requested 8192 MB" [*]
>
> But there are compute nodes with enough memory that could serve such
> requests.
>
> In the conductor log I also see messages reporting that "Function
> 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted
> interval by xxx sec" [**]
>
> My understanding is that:
>
> - VM a is scheduled to a certain compute node
> - the scheduler chooses the same compute node for VM b before the info for
>   that compute node is updated (so the 'size' of VM a is not taken into
>   account)
>
> Does this make sense or am I totally wrong ?
>
> Any hints about how to cope with such scenarios, besides increasing
> scheduler_max_attempts ?
>
> scheduler_default_filters is set to:
>
> scheduler_default_filters = AggregateInstanceExtraSpecsFilter,
> AggregateMultiTenancyIsolation,RetryFilter,AvailabilityZoneFilter,
> RamFilter,CoreFilter,AggregateRamFilter,AggregateCoreFilter,ComputeFilter,
> ComputeCapabilitiesFilter,ImagePropertiesFilter,
> ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter
>
> Thanks a lot, Massimo
>
> [*]
>
> 2016-11-30 15:10:20.233 25140 WARNING nova.scheduler.utils [req-ec8c0bdc-b413-4cab-b925-eb8f11212049 840c96b6fb1e4972beaa3d30ade10cc7 d27fe2becea94a3e980fb9f66e2f291a - - -] Failed to compute_task_build_instances: Exceeded maximum number of retries. Exceeded max scheduling attempts 5 for instance 314eccd0-fc73-446f-8138-7d8d3c8644f7. Last exception: Insufficient compute resources: Free memory 2879.50 MB < requested 8192 MB.
> 2016-11-30 15:10:20.233 25140 WARNING nova.scheduler.utils [req-ec8c0bdc-b413-4cab-b925-eb8f11212049 840c96b6fb1e4972beaa3d30ade10cc7 d27fe2becea94a3e980fb9f66e2f291a - - -] [instance: 314eccd0-fc73-446f-8138-7d8d3c8644f7] Setting instance to ERROR state.
>
> [**]
>
> 2016-11-30 15:10:48.873 25128 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 9.08 sec
> 2016-11-30 15:10:54.372 25142 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 9.33 sec
> 2016-11-30 15:10:54.375 25140 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 9.32 sec
> 2016-11-30 15:10:54.376 25129 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 9.30 sec
> 2016-11-30 15:10:54.381 25138 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 9.24 sec
> 2016-11-30 15:10:54.381 25139 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 9.28 sec
> 2016-11-30 15:10:54.382 25143 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 9.24 sec
> 2016-11-30 15:10:54.385 25141 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 9.11 sec
> 2016-11-30 15:11:01.964 25128 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 3.09 sec
> 2016-11-30 15:11:05.503 25142 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 1.13 sec
> 2016-11-30 15:11:05.506 25138 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 1.12 sec
> 2016-11-30 15:11:05.509 25139 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 1.13 sec
> 2016-11-30 15:11:05.512 25141 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 1.13 sec
> 2016-11-30 15:11:05.525 25143 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 1.14 sec
> 2016-11-30 15:11:05.526 25140 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 1.15 sec
> 2016-11-30 15:11:05.529 25129 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 1.15 sec
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators