Re: [Openstack-operators] [OCTAVIA][QUEENS][KOLLA] - Amphora to Health-manager invalid UDP heartbeat.

Michael Johnson Tue, 23 Oct 2018 10:12:18 -0700

Are the controller and the amphora using the same version of Octavia?

We had a python3 issue where we had to change the HMAC digest used. If
your controller is running an older version of Octavia than your
amphora images, it may not have the compatibility code to support the
new format.  The compatibility code is here:
https://github.com/openstack/octavia/blob/master/octavia/amphorae/backends/health_daemon/status_message.py#L56

There is also a release note about the issue here:
https://docs.openstack.org/releasenotes/octavia/rocky.html#upgrade-notes
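
One quick way to test the digest-format theory is to look at the "msg hmac"
value from the WARNING line quoted further down in this thread. The sketch
below is only an illustration (the helper name is made up for this example
and is not part of Octavia): it checks whether the bytes of a logged value
are themselves ASCII hex characters. If they are, that would be consistent
with the sender appending a hex-encoded digest while the receiver expects
raw digest bytes, though it does not prove it.

```python
import string

# Values copied from the WARNING line quoted later in this thread.
MSG_HMAC_HEX = "6137613337316432636365393832376431343337306537353066626130653261"
CALC_HMAC_HEX = "faf73e41a0f843b826ee581c3995b7f7e56b5e5a294fca0b84eda426766f8415"


def decode_if_ascii_hex(hex_value):
    """Hypothetical helper: return the decoded text when the bytes behind
    hex_value are themselves ASCII hex digits, otherwise return None."""
    raw = bytes.fromhex(hex_value)
    try:
        text = raw.decode("ascii")
    except UnicodeDecodeError:
        return None
    return text if all(c in string.hexdigits for c in text) else None


decoded_msg = decode_if_ascii_hex(MSG_HMAC_HEX)
decoded_calc = decode_if_ascii_hex(CALC_HMAC_HEX)
print("msg hmac decodes to ASCII hex:", decoded_msg)
print("calculated hmac decodes to ASCII hex:", decoded_calc)
```

If the received value decodes to hex text and the calculated one does not,
a version mismatch between controller and amphora image is the first thing
to rule out.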

If that is not the issue, I would double-check that the heartbeat_key in
the health manager configuration files matches the one inside one of the
amphorae.

Note that this key is only used for health heartbeats and stats; it
is not used for the controller-to-amphora communication on port 9443.
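
For anyone who wants to experiment with heartbeat signing outside of
Octavia, here is a minimal sketch. It ASSUMES the envelope is
zlib-compressed JSON followed by an HMAC-SHA256 tag computed with the
heartbeat_key, and that the tag is either 32 raw bytes or 64 ASCII hex
characters. The function names and the sample message are hypothetical;
check status_message.py in your installed version for the real format.

```python
import hashlib
import hmac
import json
import zlib

RAW_LEN = hashlib.sha256().digest_size  # 32 raw digest bytes
HEX_LEN = RAW_LEN * 2                   # 64 ASCII hex characters


def wrap(obj, key, hex_digest=True):
    """Assumed format: compressed JSON payload followed by its HMAC tag."""
    payload = zlib.compress(json.dumps(obj).encode("utf-8"))
    mac = hmac.new(key.encode("utf-8"), payload, hashlib.sha256)
    tag = mac.hexdigest().encode("utf-8") if hex_digest else mac.digest()
    return payload + tag


def unwrap(envelope, key):
    """Try the hex-tag layout first, then the raw-tag layout.

    Raises ValueError when neither layout verifies with this key."""
    for tag_len, as_hex in ((HEX_LEN, True), (RAW_LEN, False)):
        payload, tag = envelope[:-tag_len], envelope[-tag_len:]
        mac = hmac.new(key.encode("utf-8"), payload, hashlib.sha256)
        expected = mac.hexdigest().encode("utf-8") if as_hex else mac.digest()
        if hmac.compare_digest(expected, tag):
            return json.loads(zlib.decompress(payload).decode("utf-8"))
    raise ValueError("HMAC mismatch: wrong key or unknown digest format")


# Hypothetical sample message, not the real heartbeat schema.
sample = {"id": "amphora-1", "seq": 1}
print(unwrap(wrap(sample, "secret"), "secret"))
```

A receiver that only knows the raw-tag layout would reject an envelope
built with hex_digest=True even when both sides share the same key, which
is the kind of failure a controller older than the amphora image could
produce.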

Also, load balancers cannot get "stuck" in PENDING_* states unless
someone has killed the controller process that was actively working on
that load balancer. By killed I mean a non-graceful shutdown of the
process while it was in the middle of working on the load balancer.
Otherwise all code paths lead back to ACTIVE or ERROR status once the
controller finishes the work or gives up retrying the requested action.
Check your controller logs to make sure this load balancer is not still
being worked on by one of the controllers. The default retry timeouts
are very long (some are up to 25 minutes, and the controller keeps
retrying the request for that whole time) to accommodate very slow
(VirtualBox) hosts and the test gates. You will want to tune those down
for a production deployment.
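
To get a feel for how long a load balancer can legitimately sit in a
PENDING_* state, multiply the retry count by the retry interval. The
sketch below assumes the relevant pair is connection_max_retries and
connection_retry_interval in the [haproxy_amphora] section of
octavia.conf, and the values 300 and 5 are assumptions picked because
they match the 25 minute figure above; check your own configuration for
the values actually in effect.

```python
def worst_case_minutes(max_retries, retry_interval_sec):
    """Upper bound on time spent retrying, ignoring per-attempt timeouts."""
    return max_retries * retry_interval_sec / 60


# Assumed values: 300 retries, 5 seconds apart.
long_window = worst_case_minutes(300, 5)
# Hypothetical tuned-down values for a production deployment.
short_window = worst_case_minutes(60, 5)

print(f"assumed defaults: {long_window} minutes")
print(f"tuned example:    {short_window} minutes")
```

Until that window has elapsed, the controller may still be working on the
request, so a PENDING_* status is not yet a sign of a stuck load balancer.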

Michael

On Tue, Oct 23, 2018 at 7:09 AM Gaël THEROND <[email protected]> wrote:
>
> Hi guys,
>
> I'm finishing work on my POC for Octavia, and after solving a few issues with
> my configuration I'm close to getting a properly working setup.
> However, I'm facing a small yet annoying bug: the health-manager receives
> amphora heartbeat UDP packets, considers them incorrect, and
> drops them.
>
> Here are the messages that can be found in logs:
>
> 2018-10-23 13:53:21.844 25 WARNING 
> octavia.amphorae.backends.health_daemon.status_message [-] calculated hmac: 
> faf73e41a0f843b826ee581c3995b7f7e56b5e5a294fca0b84eda426766f8415 not equal to 
> msg hmac: 6137613337316432636365393832376431343337306537353066626130653261 
> dropping packet
>
> Which comes from this part of the HM code:
>
> https://docs.openstack.org/octavia/pike/_modules/octavia/amphorae/backends/health_daemon/status_message.html#get_payload
>
> The annoying thing is that I don't understand why the UDP packet is considered
> stale, or how I can reproduce the payload that is sent to the
> HealthManager.
> I'm willing to write a simple Python program to simulate the heartbeat payload,
> but I don't know exactly what the message is, and I think I'm missing some
> information.
>
> Both the HealthManager and the amphora use the same heartbeat_key, and they can
> reach each other on the network, as the initial health-manager to amphora 9443
> connection is validated.
>
> As a result of this situation, my load balancer is stuck in PENDING_UPDATE
> mode.
>
> Do you have any idea how I can handle this, or whether it's something
> already seen by anyone else?
>
> Kind regards,
> G.
> _______________________________________________
> OpenStack-operators mailing list
> [email protected]
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
