Re: [rsyslog] Basic Rsyslog Troubleshooting

David Lang via rsyslog Sun, 24 Apr 2022 05:27:15 -0700

One problem with TCP load balancing of syslog messages is that the load balancers do not understand the syslog protocol, so they can't rebalance at a message boundary.
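
To make the message boundary point concrete, here is a small sketch. It assumes newline-framed syslog over TCP; the sample messages and the cut offset of 30 bytes are made up for illustration. A middlebox that only sees bytes can move the stream to another backend in the middle of a message.

```python
# Illustration only: newline-framed syslog as seen by a middlebox that
# treats the TCP connection as an opaque byte stream.

def frame(messages):
    """Join messages into one byte stream using newline framing."""
    return b"".join(m + b"\n" for m in messages)

def parse(stream):
    """Split a byte stream into complete messages plus a trailing fragment."""
    *complete, tail = stream.split(b"\n")
    return complete, tail

# Made-up sample messages.
messages = [
    b"<13>host app: first",
    b"<13>host app: second",
    b"<13>host app: third",
]
stream = frame(messages)

# Hypothetical balancer decision: switch backends after 30 bytes,
# which falls inside the second message.
cut = 30
old_complete, old_fragment = parse(stream[:cut])
new_complete, new_fragment = parse(stream[cut:])

print("old backend got:", old_complete, "plus fragment", old_fragment)
print("new backend got:", new_complete)
```

The old backend is left holding an unterminated fragment, and the new backend treats the rest of that message as if it were a complete one.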

A second problem is that when a firewall or load balancer drops a connection, the sender doesn't know that it's dropped until the next time it tries to deliver a message. TCP doesn't have any way for the OS TCP stack to tell software "that message that you submitted to an open connection, and I accepted, it can't be delivered" (once the OS accepts the message, the sender has to assume that it will be delivered).

As a result, it's very easy for TCP syslog to be less reliable than UDP syslog. The 'common sense wisdom' is that TCP is reliable because dropped packets inside an ongoing connection will get retried, but dropped packets are actually very uncommon inside a datacenter. They may happen when a firewall/router is overloaded, but it's not very common. Back in 2006 or so I did testing and found that within a local network, UDP was almost perfectly reliable (as long as the receiver could keep up and not overflow the OS buffers).

Rsyslog has the rebindinterval feature, which tells the sender to disconnect and reconnect periodically so that the load balancer has a chance to make a new balancing decision.
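
As a rough model of what periodic rebinding buys you (this is not rsyslog code): the RebindingSender and FakeConn classes, the backend names, and the round-robin balancer below are all hypothetical, and I am assuming the interval is counted in messages, so check the documentation for your version.

```python
from itertools import cycle

class FakeConn:
    """Stand-in for a TCP connection that records which backend it reached."""
    def __init__(self, backend, log):
        self.backend = backend
        self.log = log

    def sendall(self, data):
        self.log.append((self.backend, data))

    def close(self):
        pass

class RebindingSender:
    """Toy sender that reconnects after every `rebind_interval` messages."""
    def __init__(self, connect, rebind_interval):
        self.connect = connect
        self.rebind_interval = rebind_interval
        self.conn = None
        self.sent_on_conn = 0

    def send(self, msg):
        # Reconnect on first use and whenever the interval is used up, so
        # the balancer gets to pick a backend again.
        if self.conn is None or self.sent_on_conn >= self.rebind_interval:
            if self.conn is not None:
                self.conn.close()
            self.conn = self.connect()
            self.sent_on_conn = 0
        self.conn.sendall(msg + b"\n")
        self.sent_on_conn += 1

# Hypothetical round-robin balancer in front of three receivers.
log = []
backends = cycle(["recv-a", "recv-b", "recv-c"])

def connect_via_balancer():
    return FakeConn(next(backends), log)

sender = RebindingSender(connect_via_balancer, rebind_interval=2)
for i in range(6):
    sender.send(b"msg %d" % i)

per_backend = {}
for backend, _ in log:
    per_backend[backend] = per_backend.get(backend, 0) + 1
print(per_backend)
```

Without the rebind, all six messages would stay on whichever receiver the balancer picked for the first connection.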

You also want to make sure that the log stream is not idle for too long ('mark' was the historical method of doing that; I prefer vmstat 60 | logger -t vmstat, as it's not much larger and is an extremely dense set of information that can be very useful when troubleshooting).

The other thing to look at is the RELP protocol. It was developed specifically because TCP was designed to be reliable over an unreliable wire, but assumes that both ends will remain up and the connection will not be cut by a middlebox. RELP does full application-level acks so that the sender knows that the receiving rsyslog actually processed the message.
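
The idea of application-level acks can be sketched like this. It is a simplified model, not the RELP wire protocol: the AckingSender class, its method names, and the use of plain lists as "connections" are all invented for illustration. The point is only that the sender keeps every message until the receiver confirms it, and resends whatever is unconfirmed after a reconnect.

```python
from collections import OrderedDict

class AckingSender:
    """Toy sender that keeps each message until the receiver acks it."""
    def __init__(self):
        self.next_txnr = 1
        self.unacked = OrderedDict()  # transaction number -> message

    def send(self, transport, msg):
        txnr = self.next_txnr
        self.next_txnr += 1
        self.unacked[txnr] = msg       # remember it until it is acked
        transport.append((txnr, msg))  # "put it on the wire"
        return txnr

    def on_ack(self, txnr):
        # The receiver has processed this message; safe to forget it.
        self.unacked.pop(txnr, None)

    def resend_unacked(self, transport):
        # After a reconnect, replay everything the receiver never confirmed.
        for txnr, msg in self.unacked.items():
            transport.append((txnr, msg))

sender = AckingSender()

first_connection = []
for text in (b"one", b"two", b"three"):
    sender.send(first_connection, text)

# The receiver only got as far as processing the first message...
sender.on_ack(1)

# ...before a middlebox cut the connection. On reconnect, replay the rest.
second_connection = []
sender.resend_unacked(second_connection)
print(second_connection)
```

With plain TCP there is nothing equivalent to the unacked table, so the second and third messages would simply be gone.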

With plain TCP, once the sending software submits data to the OS stack and the OS stack says it's accepted the data, the data then sits in a buffer on the sending machine, then gets sent over the wire (with retries), then sits in a buffer on the receiving machine until the receiving software reads it. If anything causes the connection to be terminated (firewall, load balancer, crash on the receiving machine, etc.) the data will be lost and the sending software has no way of learning about it.


David Lang


 On Sun, 24 Apr 2022, Steven D via rsyslog wrote:

Date: Sun, 24 Apr 2022 12:14:35 +0000
From: Steven D via rsyslog <rsyslog@lists.adiscon.com>
To: "rsyslog@lists.adiscon.com" <rsyslog@lists.adiscon.com>
Cc: Steven D <pheerl...@hotmail.com>
Subject: [rsyslog] Basic Rsyslog Troubleshooting

Greetings list

New to rsyslog list, not new to logging. We're experiencing an odd issue where 
TCP syslog messages are being dropped at seemingly random intervals...hoping to 
get some input.

The TLDR on our architecture is we have set up a couple rsyslog receivers 
behind a Netscaler Load balancer. Multiple platforms/devices are configured to 
send syslog to the load balancer, which distributes to the receivers. Receivers 
are running RHEL v8 and rsyslog v8.1911. Receivers write files to disk, which 
we then read with a SIEM agent.

We've got a modestly sized environment with a syslog client base of 200-300 
servers, 30 networking devices (including firewalls) and some applications all 
directing logging to the load balancer.

Our config file is pretty vanilla, no cache, or advanced tweaks. Just using the "imtcp" 
and "imudp" modules and rulesets to write files to disk based on the sending host IP/port.

The first problem we're seeing is that hosts sending via TCP have log messages 
missed (never written to disk), where UDP seems more reliable. When switching 
the firewalls to UDP, throughput nearly doubles and message loss is less 
noticeable (yeah I know it's still UDP).

Possibly related is that we've noticed that each receiver also holds a lot of 
"Established" connections back to the clients, but on different ports. 
(Possible session/connection exhaustion?)

Any guidance on how we can approach and troubleshoot this issue would be 
appreciated. Commands, dummy guides, sarcasm all welcome.

Thanks much

Regards,

Steven.

_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.
