On 22/02/16 19:07, "John Garbutt" <[email protected]> wrote:
>On 22 February 2016 at 17:38, Sean Dague <[email protected]> wrote:
>> On 02/22/2016 12:20 PM, Daniel P. Berrange wrote:
>>> On Mon, Feb 22, 2016 at 12:07:37PM -0500, Sean Dague wrote:
>>>> On 02/22/2016 10:43 AM, Chris Friesen wrote:
>>>>> Hi all,
>>>>>
>>>>> We've recently run into some interesting behaviour that I thought I
>>>>> should bring up to see if we want to do anything about it.
>>>>>
>>>>> Basically the problem seems to be that nova-compute is doing disk I/O
>>>>> from the main thread, and if it blocks then it can block all of
>>>>> nova-compute (since all eventlets will be blocked). Examples that
>>>>> we've found include glance image download, file renaming, instance
>>>>> directory creation, opening the instance xml file, etc. We've seen
>>>>> nova-compute block for upwards of 50 seconds.
>>>>>
>>>>> Now the specific case where we hit this is not a production
>>>>> environment. It's only got one spinning disk shared by all the
>>>>> guests, the guests were hammering on the disk pretty hard, and the
>>>>> IO scheduler for the instance disk was CFQ, which seems to be buggy
>>>>> in our kernel.
>>>>>
>>>>> But the fact remains that nova-compute is doing disk I/O from the
>>>>> main thread, and if the guests push that disk hard enough then
>>>>> nova-compute is going to suffer.
>>>>>
>>>>> Given the above...would it make sense to use eventlet.tpool or
>>>>> similar to perform all disk access in a separate OS thread? There'd
>>>>> likely be a bit of a performance hit, but at least it would isolate
>>>>> the main thread from IO blocking.
>>>>
>>>> Making nova-compute more robust is fine, though the reality is that
>>>> once you IO-starve a system, a lot of stuff is going to fall over in
>>>> weird ways.
>>>>
>>>> So there has to be a tradeoff of the complexity of any new code vs.
>>>> what it gains. I think individual patches should be evaluated as
>>>> such, or a spec if this is going to get really invasive.
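[For anyone skimming the thread: a minimal sketch of the idea Chris raises. This uses the stdlib thread pool purely for illustration; in Nova itself the call would be eventlet.tpool.execute(fn, *args), which runs fn in a native OS thread and only suspends the calling greenthread. The helper names and paths here are hypothetical.]

```python
# Sketch only: offload blocking disk I/O to a worker OS thread so the
# main cooperative thread is never stalled by a slow disk. In Nova the
# equivalent is eventlet.tpool.execute(), which dispatches the blocking
# call to a native thread pool while other greenthreads keep running.
from concurrent.futures import ThreadPoolExecutor

_io_pool = ThreadPoolExecutor(max_workers=4)


def read_instance_xml(path):
    # Blocking read; can stall for seconds on a saturated spindle.
    with open(path) as f:
        return f.read()


def read_instance_xml_async(path):
    # Returns a Future immediately; the main thread stays responsive
    # and can collect the result later with .result().
    return _io_pool.submit(read_instance_xml, path)
```

The point is only that the open()/read() happens off the main thread; the performance cost Chris mentions is the thread hand-off per call.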
>>>
>>> There are OS-level mechanisms (eg the cgroups blkio controller) for
>>> doing I/O prioritization that you could use to give Nova higher
>>> priority over the VMs, to reduce (if not eliminate) the possibility
>>> that a busy VM can inflict a denial of service on the mgmt layer. Of
>>> course figuring out how to use that mechanism correctly is not
>>> entirely trivial.
>>>
>>> I think it is probably worth focusing effort in that area before
>>> jumping into making all the I/O-related code in Nova more
>>> complicated, eg have someone investigate & write up a recommendation
>>> in the Nova docs for how to configure the host OS & Nova such that
>>> VMs cannot inflict an I/O denial of service attack on the mgmt
>>> service.
>>
>> +1, that would be much nicer.
>>
>> We've got some set of bugs in the tracker right now which are
>> basically "after the compute node has been at a loadavg of 11 for an
>> hour, nova-compute starts failing". Having some basic methodology for
>> using Linux prioritization on the worker process would mitigate those
>> quite a bit, and could be used by all users immediately, vs. complex
>> nova-compute changes which would only apply to new / upgraded deploys.
>>
>
>+1
>
>Does that turn into improved deployment docs that cover how you do
>that on various platforms?
>
>Maybe some tools to help with that also go in here?
>http://git.openstack.org/cgit/openstack/osops-tools-generic/

And some easy configuration in the puppet/ansible/chef standard recipes
would also help.

>
>Thanks,
>John
>
>PS
>FWIW, how xenapi runs nova-compute in a VM has a similar outcome,
>albeit in a more heavy-handed way.
>
>__________________________________________________________________________
>OpenStack Development Mailing List (not for usage questions)
>Unsubscribe: [email protected]?subject:unsubscribe
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
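[For reference, the cgroup mechanism Daniel mentions boils down to writing proportional weights into the blkio controller. A minimal Python sketch follows; it assumes cgroup v1 with blkio mounted at /sys/fs/cgroup/blkio and the CFQ I/O scheduler (which is what blkio.weight applies to). The cgroup names, weights, and PIDs are illustrative assumptions, not a tested recommendation.]

```python
# Sketch only: give the nova-compute cgroup a higher proportional I/O
# weight than the guests' cgroup under the cgroup v1 blkio controller.
# blkio.weight is proportional, valid range 100-1000, and only takes
# effect with the CFQ I/O scheduler on the device.
import os


def set_blkio_weight(root, name, weight, pids=()):
    cg = os.path.join(root, name)
    os.makedirs(cg, exist_ok=True)
    # Set the proportional weight for this cgroup.
    with open(os.path.join(cg, "blkio.weight"), "w") as f:
        f.write(str(weight))
    # Move the given processes into the cgroup.
    for pid in pids:
        with open(os.path.join(cg, "tasks"), "a") as f:
            f.write("%d\n" % pid)


# On a real host this might look like (hypothetical values):
# set_blkio_weight("/sys/fs/cgroup/blkio", "nova", 800, [nova_compute_pid])
# set_blkio_weight("/sys/fs/cgroup/blkio", "guests", 200, guest_pids)
```

Deployment tooling (the puppet/ansible/chef recipes mentioned below) would be the natural place to apply something like this, rather than Nova itself.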
