Re: [Openstack-operators] nova-api response time

Matt Riedemann Sun, 07 May 2017 20:02:27 -0700

On 5/5/2017 9:00 AM, Vladimir Prokofev wrote:

Writing things down always helps.
So I dug a little further, and seems that keystone is the weakest link.
It appears that it takes 500-600ms to generate a token, similar time to
validate it. This means that first time application accesses the API
request will run at least 1000ms, or even more. Second time will be
significantly faster because you no longer need to request a token, and
validation is done via memcached.
But for some reason cached token expires very fast, in a matter of
minutes(I didn't actually measured exact time), so at some point
validation will once again take half a second.
If your app written in a bad way, i.e. for every request it creates a
new auth session, this means that it will run very very slowly, as every
request will take half a second to get a token.


So now it's a question of determining why keystone is so slow? Btw, I
use keystone 10.0.0 with fernet tokens and apache WSGI(5 processes, 1
thread) on three controller nodes behind haproxy.
I found this article[1] about keystone benchmarking, and it seems that
it can run significantly faster.

So once again:
 - what are your average times for token creation/validation?
 - is this procedure computationally extensive and requires faster
CPU(fernet implies encryption/decription)?
 - appreciate a link on a good doc on how to check and increases
memcached caching timers.

By the way, I think this topic now should be named "openstack API
response times in general", rather than strictly tie to nova-api.

[1] 
https://docs.openstack.org/developer/rally/overview/stories/keystone/authenticate.html

2017-05-05 13:38 GMT+03:00 Vladimir Prokofev <[email protected]
<mailto:[email protected]>>:

    Hello Ops.

    So I had a feeling that my Openstack API is running a bit slow, like
    it takes a couple of seconds to get a full list of all instances in
    Horizon for example, which is kind of a lot for a basic MySQL
    select-and-present job. This got me wondering to how can I measure
    execution times and find out how long it takes for any part of the
    API request to execute.

    So I got a first REST client I could find(it happens to be a Google
    Chrome application called ARC(Advanced REST Client)), decided how
    and what I want to measure and got to it.

    I'm doing a basic /servers request to public nova-api endpoint using
    X-Auth-Token I got earlier from keystone. Project that I pull has
    only single instance in it.
    From clients perspective there're 4 major parts of request:
    connection time, send time, wait time, and receive time.
    Connection time, send time, and receive time are all negligible, as
    they take about 5-10ms combined. The major part is wait time, which
    can take from 130ms to 2500ms.
    Wait time is directly correlated to the execution time that I can
    see in nova-api log, with about 5-10ms added lag, which is expected
    and perfectly normal as I understand it.

    First run after some time is always the slowest one. and takes from
    900ms to 2500ms. If I do the same request in the next few minutes it
    will be faster - 130 to 200ms. My guess is that there's some caching
    involved(memcached?).

    So my questions are:
     - what execution times do you usually see in your environment for
    nova-api, or other APIs for that matter? I have a feeling that even
    130ms is very slow for such a simple task;
     - I assumed that first request is so slow because it pulls data
    from MySQL which resides on slow storage(3 node Galera cluster, each
    node is a KVM domain based on 2 SATA 7200 HDD in software RAID1), so
    I enabled slow query logs for MySQL with 500ms threshold, but I see
    nothing, even when wait time > 2000ms. Seems MySQL has nothing to do
    with it;
     - one more idea is that it takes a lot of time for keystone to
    validate a token. With expired token I get 401 response in 500-600ms
    first time, and 30-40ms next time I do a request;
     - another idea about slow storage is that nova-api has to write
    something(a log for example), before sending response, but I don't
    know how to test this exactly, rather that just transfer my
    controller nodes on an SSD;
     - is there a way I can profile nova-api, and see exactly which
    method takes the most time?

    I understand that I didn't exactly described my setup. This is due
    to the fact that I'm more interested in learning how to profile API
    components, rather than just plainly fix my setup(e.g. make it
    faster). In the end as I see it it's either a storage related
    issue(fixed by transferring controllers and database to a fast
    storage like SSD), or computing power issue(fixed by placing
    controllers on a dedicated bare-metal with fast CPU/memory). I have
    a desire not to guess, but to know exactly which is it :)
    I will describe setup in detail if such request is made.




_______________________________________________
OpenStack-operators mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

I can't answer your questions about Keystone and tokens but you asked aquestion about ways (tools) to profile the time spent in a nova-apirequest to list servers, and Nova has supported the osprofiler tool [1]since the Newton release. That might help you out with profiling.


[1] https://docs.openstack.org/developer/osprofiler/

--

Thanks,

Matt

_______________________________________________
OpenStack-operators mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

Re: [Openstack-operators] nova-api response time

Reply via email to