Re: [openstack-dev] In memory joins in Nova

Mike Bayer Wed, 12 Aug 2015 11:38:32 -0700


On 8/12/15 1:49 PM, Sachin Manpathak wrote:

Thanks, this feedback was helpful.
Perhaps my paraphrasing was misleading. I am not running OpenStack at scale in order to see how much the DB can sustain. My observation was that the host running nova services saturates on CPU much earlier than the DB does.

You absolutely *want* a single host to be saturated *way* before the database is; the database here is a single vertical service intended to serve hundreds or thousands of horizontally scaled clients simultaneously. A single request at a time should not even be a blip in the database's view of things.

Joins could be one of the reasons. I also observed that background tasks like instance creation and resource/stats updates contend with get queries. In addition to caching optimizations, prioritizing tasks in nova could help.

Is there a nova API to fetch a list of instances without metadata? Until I find a good way to profile OpenStack code, changing the queries can be a good experiment IMO.

On Wed, Aug 12, 2015 at 8:12 AM, Dan Smith <d...@danplanet.com> wrote:


    > If OTOH we are referring to the width of the columns and the join is
    > such that you're going to get the same A identity over and over again,
    > if you join A and B you get a "wide" row with all of A and B with a very
    > large amount of redundant data sent over the wire again and again (note
    > that the database drivers available to us in Python always send all rows
    > and columns over the wire unconditionally, whether or not we fetch them
    > in application code).

    Yep, it was this. N instances times M rows of metadata each. If you pull
    100 instances and they each have 30 rows of system metadata, that's a
    lot of data, and most of it is the instance being repeated 30 times for
    each metadata row. When we first released code doing this, a prominent
    host immediately raised the red flag because their DB traffic shot
    through the roof.
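
To put rough numbers on the "N instances times M rows" point, here is a back-of-the-envelope sketch. The 100 instances and 30 metadata rows come from the message above; the column counts (50 for an instance, 2 for a metadata row) are invented for illustration and are not taken from the Nova schema.

```python
def joined_values(n_instances, m_meta_rows, instance_cols, meta_cols):
    """Values sent over the wire when instances are JOINed to metadata.

    Every metadata row carries a full copy of its instance's columns.
    """
    return n_instances * m_meta_rows * (instance_cols + meta_cols)


def two_query_values(n_instances, m_meta_rows, instance_cols, meta_cols):
    """Values sent when instances and metadata are fetched separately.

    The metadata query carries one extra column: the instance key
    used to match each row back to its instance in Python.
    """
    instance_part = n_instances * instance_cols
    metadata_part = n_instances * m_meta_rows * (meta_cols + 1)
    return instance_part + metadata_part


# 100 and 30 are from the thread; 50 and 2 are assumed column counts.
joined = joined_values(100, 30, 50, 2)
split = two_query_values(100, 30, 50, 2)
print(joined, split)
```

Under these assumed column counts the single joined query moves roughly eleven times as many values as the two separate queries, and nearly all of the difference is repeated instance columns.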

    > In this case you *do* want to do the join in
    > Python to some extent, though you use the database to deliver the
    > simplest information possible to work with first; you get the full row
    > for all of the A entries, then a second query for all of B plus A's
    > primary key that can be quickly matched to that of A.

    This is what we're doing. Fetch the list of instances that match the
    filters, then for the ones that were returned, get their metadata.

    --Dan
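
The two-step fetch described above can be sketched with the standard-library sqlite3 module. This is only an illustration of the pattern, not Nova's actual code: the table names, column names, sample rows, and the helper instances_with_metadata are all hypothetical.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instances (uuid TEXT PRIMARY KEY, hostname TEXT);
    CREATE TABLE system_metadata (
        instance_uuid TEXT REFERENCES instances(uuid),
        meta_key TEXT,
        meta_value TEXT
    );
""")
conn.executemany(
    "INSERT INTO instances VALUES (?, ?)",
    [("a1", "web-1"), ("b2", "web-2"), ("c3", "db-1")],
)
conn.executemany(
    "INSERT INTO system_metadata VALUES (?, ?, ?)",
    [
        ("a1", "image_os", "linux"),
        ("a1", "flavor", "small"),
        ("b2", "image_os", "linux"),
        ("c3", "image_os", "bsd"),
    ],
)


def instances_with_metadata(conn, hostname_prefix):
    """Fetch matching instances, then their metadata, and join in Python."""
    # Query 1: full rows for the instances that match the filter.
    rows = conn.execute(
        "SELECT uuid, hostname FROM instances "
        "WHERE hostname LIKE ? ORDER BY uuid",
        (hostname_prefix + "%",),
    ).fetchall()
    instances = {
        uuid: {"uuid": uuid, "hostname": hostname, "metadata": {}}
        for uuid, hostname in rows
    }
    if not instances:
        return []

    # Query 2: metadata rows plus the instance key, only for those instances.
    placeholders = ",".join("?" * len(instances))
    meta_rows = conn.execute(
        "SELECT instance_uuid, meta_key, meta_value FROM system_metadata "
        f"WHERE instance_uuid IN ({placeholders})",
        list(instances),
    )

    # In-memory join: a dict lookup on the instance key per metadata row.
    for instance_uuid, key, value in meta_rows:
        instances[instance_uuid]["metadata"][key] = value
    return list(instances.values())


result = instances_with_metadata(conn, "web")
print(result)
```

Each instance row crosses the wire once, and the matching step is a dictionary lookup per metadata row, so the Python-side cost stays linear in the number of rows returned.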

    __________________________________________________________________________
    OpenStack Development Mailing List (not for usage questions)
    Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
    http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




