On 15 July 2016 at 09:26, Cheng, Yingxin <yingxin.ch...@intel.com> wrote:
> It is easy to understand that scheduling in the nova-scheduler service
> consists of 2 major phases:
> A. Cache refresh, in code [1].
> B. Filtering and weighing, in code [2].
>
> A couple of previous experiments [3] [4] show that “cache-refresh” is the
> major bottleneck of the nova scheduler. For example, the 15th page of
> presentation [3] says the time cost of “cache-refresh” takes 98.5% of the
> time of the entire `_schedule` function [6] when there are 200-1000 nodes
> and 50+ concurrent requests. The latest experiments [5] in China Mobile’s
> 1000-node environment also support the same conclusion, and it’s even 99.7%
> when there are 40+ concurrent requests.
>
> Here are some existing solutions for the “cache-refresh” bottleneck:
> I. Caching scheduler.
> II. Scheduler filters in DB [7].
> III. Eventually consistent scheduler host state [8].
>
> I can discuss their merits and drawbacks in a separate thread, but here I
> want to show a very simple solution based on my findings during the
> experiments [5]. I wrapped the expensive function [1] and watched the
> behavior of cache-refresh under pressure. It is very interesting to see
> that a single cache-refresh only costs about 0.3 seconds, but when there
> are concurrent cache-refresh operations this cost can suddenly increase to
> 8 seconds. I’ve seen it even reach 60 seconds for one cache-refresh under
> higher pressure. See the section below for details.
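
To make that measurement concrete for anyone who wants to reproduce it, the
kind of wrapper described above could be as small as the sketch below. This is
purely illustrative and is not the instrumentation actually used in [5]; the
`timed` name and the monkey-patching example are made up:

    import functools
    import time

    def timed(fn):
        # Print how long each call takes; illustrative only.
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.time()
            try:
                return fn(*args, **kwargs)
            finally:
                print("%s took %.3fs" % (fn.__name__, time.time() - start))
        return wrapper

    # e.g. wrap the expensive call from [1] before driving load at the
    # scheduler (exact import path may differ between nova releases):
    # from nova.scheduler import host_manager
    # host_manager.HostManager.get_all_host_states = timed(
    #     host_manager.HostManager.get_all_host_states)

Each scheduling request then prints its own cache-refresh latency, which is
enough to see the 0.3 second vs 8 second behaviour under concurrency.
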
I am curious about what DB driver you are using? Using PyMySQL should remove
a lot of those issues. This is the driver we use in the gate now, but it
wasn't always the default. If you use the C-based MySQL driver, you will find
it blocks the whole process while making a DB call; eventlet then schedules
the next DB call, and so on, and only later loops back and allows the Python
code to process the result of the first call. In extreme cases you will find
the code processing the DB query considers some of the hosts to be down,
because it has been so long since the DB call returned. Switching the driver
should dramatically increase the performance of (II).

> It raises a question in the current implementation: Do we really need a
> cache-refresh operation [1] for *every* request? If those concurrent
> operations are replaced by one database query, the scheduler is still happy
> with the latest resource view from the database. The scheduler is even
> happier because those expensive cache-refresh operations are minimized and
> much faster (0.3 seconds). I believe it is the simplest optimization to
> scheduler performance, and it doesn’t make any changes in the filter
> scheduler. Minor improvements inside the host manager are enough.

So it depends on the usage patterns in your cloud. The caching scheduler is
one way to avoid the cache-refresh operation on every request, but it has an
upper limit on throughput because the caching forces you into a single active
nova-scheduler process, whereas (II) allows you to run multiple
nova-scheduler workers to increase the concurrency.

> [1] https://github.com/openstack/nova/blob/master/nova/scheduler/filter_scheduler.py#L104
> [2] https://github.com/openstack/nova/blob/master/nova/scheduler/filter_scheduler.py#L112-L123
> [3] https://www.openstack.org/assets/presentation-media/7129-Dive-into-nova-scheduler-performance-summit.pdf
> [4] http://lists.openstack.org/pipermail/openstack-dev/2016-June/098202.html
> [5] Please refer to Barcelona summit session ID 15334 later: “A tool to test
> and tune your OpenStack Cloud? Sharing our 1000 node China Mobile experience.”
> [6] https://github.com/openstack/nova/blob/master/nova/scheduler/filter_scheduler.py#L53
> [7] https://review.openstack.org/#/c/300178/
> [8] https://review.openstack.org/#/c/306844/
>
>
> ****** Here is the discovery from the latest experiments [5] ******
> https://docs.google.com/document/d/1N_ZENg-jmFabyE0kLMBgIjBGXfL517QftX3DW7RVCzU/edit?usp=sharing
>
> Figure 1 illustrates the concurrent cache-refresh operations in a nova
> scheduler service. There are at most 23 requests waiting for the
> cache-refresh operations at time 43s.
>
> Figure 2 illustrates the time cost of every request in the same experiment.
> It shows that the cost increases with the growth of concurrency, which
> demonstrates the vicious circle: a request waits longer for the database
> when there are more waiting requests.
>
> Figures 3/4 illustrate a worse case where the cache-refresh operations cost
> up to 60 seconds because of excessive concurrent cache-refresh operations.

Sorry, it's not clear to me if this was using I, II, or III? It seems like
it's just using the default system. This looks like the problem I have seen
when you don't use PyMySQL as your DB driver.
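
(For reference, switching drivers is normally just a matter of pointing the
`connection` URL in the `[database]` section of nova.conf at
`mysql+pymysql://...` instead of `mysql://...`, as far as I remember.)

On the "one query for many concurrent requests" idea above: a rough sketch of
what such a coalescing wrapper around the refresh call could look like is
below. This is purely illustrative, not actual nova code; the class name and
the `max_age` knob are made up:

    import threading
    import time

    class CoalescedHostStates(object):
        # Illustrative only: let concurrent scheduling requests share one
        # host-state refresh instead of each one hitting the database.

        def __init__(self, refresh_fn, max_age=1.0):
            self._refresh_fn = refresh_fn   # e.g. get_all_host_states
            self._max_age = max_age         # seconds a cached view stays fresh
            self._lock = threading.Lock()
            self._states = None
            self._refreshed_at = 0.0

        def get_host_states(self, context):
            with self._lock:
                if (self._states is None or
                        time.time() - self._refreshed_at > self._max_age):
                    # Only the first caller runs the expensive query; callers
                    # queued on the lock reuse its result.
                    self._states = list(self._refresh_fn(context))
                    self._refreshed_at = time.time()
                return self._states

With something like this in front of [1], fifty concurrent requests would
trigger at most one ~0.3 second query per freshness window instead of fifty
overlapping ones.
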
Thanks,
John