On 6/26/2015 2:17 PM, Matt Riedemann wrote:
On 6/22/2015 4:55 AM, Daniel P. Berrange wrote:
On Sun, Jun 21, 2015 at 11:14:00AM -0500, Matt Riedemann wrote:
On 6/20/2015 3:35 PM, Daniel P. Berrange wrote:
On Sat, Jun 20, 2015 at 01:50:53PM -0500, Matt Riedemann wrote:
Waking up from a rare nap opportunity on a Saturday, this is what was
bothering me:
The proposal in the etherpad assumes that we are just getting bulk
host/domain/guest VM stats from the hypervisor and sending those in a
notification, but how do we go about filtering those down to only the
instances that were booted through Nova?
In general I would say that is an unsupported deployment scenario to
have other random virt guests running on a nova compute node.
Having said that, when nova uses libguestfs, it will create some temp
guests via libvirt, so we do have to consider that possibility.
Even today, with the general list-domains virt driver call, I believe we
could be getting domains that weren't launched by Nova.
Jason pointed out that the ceilometer code gets all of the non-error-state
instances from nova first [1] and then, for each of those, does the domain
lookup from libvirt, filtering out any that are in SHUTOFF state [2].
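Roughly, that filtering flow could be sketched like this. This is a
pure-Python stand-in: the get_domain_state callable is hypothetical and
would wrap a real libvirt lookup (something like
conn.lookupByUUIDString(uuid).state() in the python binding), not the
actual ceilometer code.

```python
SHUTOFF = 5  # matches libvirt's VIR_DOMAIN_SHUTOFF constant


def pollable_instances(nova_instances, get_domain_state):
    """Yield nova instances that exist in libvirt and are not SHUTOFF."""
    for inst in nova_instances:
        state = get_domain_state(inst["uuid"])
        if state is None:      # domain not found in libvirt at all
            continue
        if state == SHUTOFF:   # skip powered-off guests
            continue
        yield inst


# Example with a canned state map instead of a live libvirt connection:
states = {"uuid-1": 1, "uuid-2": SHUTOFF}
insts = [{"uuid": "uuid-1"}, {"uuid": "uuid-2"}, {"uuid": "uuid-3"}]
print([i["uuid"] for i in pollable_instances(insts, states.get)])
# -> ['uuid-1']  (uuid-2 is SHUTOFF, uuid-3 has no libvirt domain)
```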
When talking about the new virt driver API for bulk stats, danpb said to
use virConnectGetAllDomainStats with libvirt [3], but I'm not aware of
that being able to filter out instances that weren't created by nova. I
don't think we want a notification from nova about the hypervisor stats
to include things that were created outside nova, like directly through
virsh or vCenter.

For at least libvirt, if virConnectGetAllDomainStats returns the domain
metadata then we can filter on that, since there is nova-specific
metadata in the domains created through nova [4], but I'm not sure that's
true for the other virt types in nova (I think the vCenter driver tags
VMs somehow as being created by OpenStack/Nova, but I'm not sure about
xen/hyper-v/ironic).
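For the libvirt case, that metadata check could look roughly like the
following. The namespace URI is the one nova's libvirt driver uses for
its <nova:instance> metadata element (worth double-checking against the
driver source), and the sample XML documents are abbreviated
illustrations, not real dumpxml output.

```python
import xml.etree.ElementTree as ET

# Namespace nova's libvirt driver writes into domains it creates:
NOVA_NS = "http://openstack.org/xmlns/libvirt/nova/1.0"


def is_nova_domain(domain_xml):
    """Return True if the domain XML carries nova's metadata element."""
    root = ET.fromstring(domain_xml)
    meta = root.find("metadata")
    if meta is None:
        return False
    return meta.find("{%s}instance" % NOVA_NS) is not None


nova_dom = """<domain type='kvm'>
  <metadata>
    <nova:instance xmlns:nova='%s'>
      <nova:name>m1.tiny-test</nova:name>
    </nova:instance>
  </metadata>
</domain>""" % NOVA_NS

virsh_dom = "<domain type='kvm'><name>handmade</name></domain>"

print(is_nova_domain(nova_dom), is_nova_domain(virsh_dom))
# -> True False
```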
The nova database has a list of domains that it owns, so if you query the
database for a list of valid UUIDs for the host, you can use that to
filter the domains that libvirt reports by comparing UUIDs.
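A minimal sketch of that UUID-based filtering. It uses plain
(uuid, stats) pairs in place of the (virDomain, dict) tuples the real
python-libvirt getAllDomainStats binding returns, where you would call
dom.UUIDString() on each domain object:

```python
def filter_nova_domains(all_domain_stats, nova_uuids):
    """Drop stats for any domain whose UUID nova's DB doesn't own."""
    owned = set(nova_uuids)
    return [(uuid, stats) for uuid, stats in all_domain_stats
            if uuid in owned]


# libguestfs temp guests or hand-made virsh domains fall out here:
stats = [("nova-uuid", {"cpu.time": 123}),
         ("stray-uuid", {"cpu.time": 456})]
print(filter_nova_domains(stats, ["nova-uuid"]))
# -> [('nova-uuid', {'cpu.time': 123})]
```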
Regards,
Daniel
Dan, is virsh domstats using virConnectGetAllDomainStats? I have libvirt
1.2.8 on RHEL 7.1, created two m1.tiny instances through nova, and got
this from virsh domstats:

http://paste.openstack.org/show/310874/

Is that similar to what we'd see from virConnectGetAllDomainStats? I
haven't yet written any code in the libvirt driver to use
virConnectGetAllDomainStats to see what that looks like.
Yes, that's the kind of data you'd expect.
Regards,
Daniel
Here is another issue I just thought of. There are limits to the size
of a message you can send through RPC, right? So what if you have a lot
of instances running and you're pulling bulk stats on them and sending
them over rpc via a notification? Is there a possibility that we blow
up on message size limits?
For libvirt/xen/hyper-v this is maybe not a big deal, since the compute
node is 1:1 with the hypervisor and I'd think in most cases you don't
have enough instances running on that compute host to blow the size
limit on the message payload, unless you have a big ass compute host.
But what about clustered virt drivers like vcenter and ironic? That one
compute node could be getting bulk stats on an entire cloud (a vcenter
cluster at least).

Maybe we could just chunk the messages/notifications if we know the rpc
message limit?
With respect to message size limit, I found a thread in the rabbitmq
mailing list [1] talking about message size limits which basically says
you're only bounded by resources available, but sending things too large
is obviously a bad idea since you starve the system and can potentially
screw up the heartbeat checking.
The actual 64K size limit I was really thinking of originally was a Qpid
limitation that was fixed in the long long ago by bnemec [2].
So I guess for the purpose of a bulk stats notification, we'd probably
be safe to keep the messages under 64K and just chunk through the list
of instances.
[1]
http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2012-March/018699.html
[2] https://review.openstack.org/#/c/28711/
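The chunking could be as simple as serializing per-instance records and
starting a new notification whenever the next record would push the
payload past the limit. A rough sketch, where the 64K figure and the
json encoding are both just illustrative assumptions:

```python
import json

LIMIT = 64 * 1024  # stay under the old Qpid-era 64K figure


def chunk_stats(records, limit=LIMIT):
    """Group records into lists whose serialized size stays under limit."""
    chunks, current, size = [], [], 2  # 2 bytes for the enclosing []
    for rec in records:
        encoded = len(json.dumps(rec)) + 2  # item plus ", " separator
        if current and size + encoded > limit:
            chunks.append(current)          # close out this notification
            current, size = [], 2
        current.append(rec)
        size += encoded
    if current:
        chunks.append(current)
    return chunks


# Three fat records with a tiny limit -> one record per notification:
recs = [{"uuid": str(i), "pad": "x" * 40} for i in range(3)]
print([len(c) for c in chunk_stats(recs, limit=60)])
# -> [1, 1, 1]
```

An oversized single record still goes out alone rather than being
split, since one instance's stats can't meaningfully span notifications.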
--
Thanks,
Matt Riedemann
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev