On Fri, Dec 16, 2022 at 06:58:33AM -0700, Shawn Heisey wrote:
> On 12/16/22 01:59, Shawn Heisey wrote:
> > On 12/16/22 00:26, Willy Tarreau wrote:
> > > Both work for me using firefox (green flash after reload).
> >
> > It wasn't working when I tested it. I rebooted for a kernel upgrade and
> > it still wasn't working.
> >
> > And then a while later I was poking around in my zabbix UI and saw the
> > green lightning bolt. No idea what changed. Glad it's working, but
> > problems that fix themselves annoy me because I usually never learn what
> > happened.
>
> I think I know what happened.
>
> I was having problems with my pacemaker cluster where it got very confused
> about the haproxy resource. I had the haproxy service enabled at boot for
> both systems. I have now disabled that in systemd so it's fully under the
> control of pacemaker. I'm pretty sure that pacemaker was confused because
> it saw the service running on a system where it should have been disabled
> and pacemaker didn't start it ... and it decided that was unacceptable and
> basically broke the cluster.
>
> So for a while I had the virtual IP resource on the "lesser" server and the
> haproxy resource on the main server. But because I had haproxy enabled at
> boot time, it was actually running on both. The haproxy config is the same
> between both systems, but the other server was still running a broken
> haproxy version. Most of the backends are actually on the better server
> accessed by br0 IP address rather than localhost, so the broken haproxy was
> still sending them to the right place. This also explains why I was not
> seeing traffic with tcpdump filtering on "udp port 443". I have a ways to
> go before I've got true HA for my websites. Setting up a database cluster
> is going to be challenging, I think.
>
> I got pacemaker back in working order after I was done with my testing, so
> both resources were colocated on the better server and haproxy was not
> running on the other one. I think you tried the URLs after I had fixed
> pacemaker, and when I saw it working on zabbix, that was also definitely
> after I fixed pacemaker.

Thanks for sharing your analysis. Indeed, everything makes sense now.

> On that UDP bind thing ... I now have three binds defined. The virtual IP,
> the IP of the first server, and the IP of the second server.

As long as you don't have too many nodes, that's often the simplest thing
to do. It requires net.ipv4.ip_nonlocal_bind=1, but that setting is extremely
common on machines where haproxy runs.

Willy