David,

Thanks for all your help today. I committed a few changes to our config, and I'll keep an eye out for changes over the next day or so.
Mind if I drop another pstats here after some bake-in for a re-review? Here's what I updated it to, based on your pointers.

module(load="imudp")
module(load="imtcp"
       MaxListeners="100"
       AddtlFrameDelimiter="000"
       KeepAlive="on"
       KeepAlive.Probes="1"
       KeepAlive.Time="10")

module(load="impstats" interval="30" ruleset="pstats_rule")

ruleset(name="pstats_rule") {
    action(name="pstats_rule" type="omfile"
           File="/var/log/rsyslog_pstats.log"
           FileCreateMode="0744"
           FileOwner="loguser"
           FileGroup="loguser")
}

input(type="imudp" port="10514" ruleset="firewall_rule")
input(type="imtcp" port="10514" ruleset="firewall_rule")

template(name="firewall_logs" type="string"
         string="/data/logs/pan/10514/%fromhost-ip%/syslog.log")

ruleset(name="firewall_rule") {
    action(name="firewall_rule" type="omfile"
           FileCreateMode="0744"
           DirCreateMode="0755"
           FileOwner="loguser"
           FileGroup="loguser"
           DirOwner="loguser"
           DirGroup="loguser"
           DynaFile="firewall_logs"
           DynaFileCacheSize="50")
}

________________________________
From: David Lang <da...@lang.hm>
Sent: Sunday, April 24, 2022 12:32 PM
To: Steven D <pheerl...@hotmail.com>
Cc: David Lang <da...@lang.hm>; Steven D via rsyslog <rsyslog@lists.adiscon.com>
Subject: Re: [rsyslog] Basic Rsyslog Troubleshooting

On Sun, 24 Apr 2022, Steven D wrote:

> * dynafilecachesize is a global setting, I don't need to specify it per
> ruleset/action?

I believe that it's per action (with the action() syntax, there are no global
settings except encryption)

> * Assuming so and I have ~300 unique hosts writing to files, would
> "dynafilecachesize = 500" be too much?

no, that sounds very reasonable. The current pstats output shows hundreds of
thousands of cache evictions, that should drop to near zero (pretty much only
showing up if you have date as part of it and the date changes).
A small number is fine, thousands is bad, hundreds of thousands very bad

David Lang

> Regards,
> Steven
> ________________________________
> From: David Lang <da...@lang.hm>
> Sent: Sunday, April 24, 2022 11:37 AM
> To: Steven D <pheerl...@hotmail.com>
> Cc: David Lang <da...@lang.hm>; Steven D via rsyslog
> <rsyslog@lists.adiscon.com>
> Subject: Re: [rsyslog] Basic Rsyslog Troubleshooting
>
> you definitely need to increase the dynacachesize for the firewall logs
>
> also, if you add name= to the action, the pstats lines will be named by that
> rather than action #
>
> bump up the cache size so that it can keep track of all the files that will be
> getting logs at the same time (plus a bit to be on the safe side, it REALLY
> hurts to have it below the working set size) and see what that does to things.
> I'll bet that cpu utilization increases and you have fewer problems with losing
> logs.
>
> if you continue to have problems, try to get a pstats dump of the period where
> you lose some logs so we can see what it looks like.
>
> having the max main queue size hit almost 4k seems likely to be an indication
> of a problem as well, but that may go away once we get the cache size
> reasonable
>
> David Lang
>
> On Sun, 24 Apr 2022, Steven D wrote:
>
>> Date: Sun, 24 Apr 2022 15:27:47 +0000
>> From: Steven D <pheerl...@hotmail.com>
>> To: David Lang <da...@lang.hm>
>> Cc: Steven D via rsyslog <rsyslog@lists.adiscon.com>
>> Subject: Re: [rsyslog] Basic Rsyslog Troubleshooting
>>
>> Great, I'll eyeball the impstats module options a lil more closely.
>>
>> Attached are a few cycles with the current settings, sanitized some of the
>> rule names.
>> ________________________________
>> From: David Lang <da...@lang.hm>
>> Sent: Sunday, April 24, 2022 11:06 AM
>> To: David Lang <da...@lang.hm>
>> Cc: Steven D <pheerl...@hotmail.com>; Steven D via rsyslog
>> <rsyslog@lists.adiscon.com>
>> Subject: Re: [rsyslog] Basic Rsyslog Troubleshooting
>>
>> and if you can post a couple cycles of the pstats output I can help explain
>> what's what there and see if there's anything obvious.
>>
>> David Lang
>>
>> On Sun, 24 Apr 2022, David Lang wrote:
>>
>>> Date: Sun, 24 Apr 2022 08:05:22 -0700 (PDT)
>>> From: David Lang <da...@lang.hm>
>>> To: Steven D <pheerl...@hotmail.com>
>>> Cc: David Lang <da...@lang.hm>,
>>> Steven D via rsyslog <rsyslog@lists.adiscon.com>
>>> Subject: Re: [rsyslog] Basic Rsyslog Troubleshooting
>>>
>>> On Sun, 24 Apr 2022, Steven D wrote:
>>>
>>>> Re: Load balancer - that makes sense to me as well.
>>>>
>>>> I've added this line to our config, does it seem appropriate for pstats?
>>>> Our Linux team keeps a tight grip on rights, so I'm pretty limited in what
>>>> I can do/access outside of rsyslog and the SIEM agent configs... I'll have
>>>> to write the file out where I can actually access it (rolleyes)
>>>>
>>>> module(load="impstats" interval="30" ruleset="pstats_rule")
>>>
>>> when things are working normally this is good, when they aren't, it's best
>>> to have the module write to a file directly (see the module options)
>>>
>>>> ruleset(name="pstats_rule") {
>>>> action(type="omfile"
>>>> File="/var/log/rsyslog_pstats.log"
>>>> FileCreateMode="0744"
>>>> FileOwner="loguser"
>>>> FileGroup="loguser")
>>>> }
>>>>
>>>> Running Top + H now to get a feel for resource usage, but at first glance
>>>> nothing is really above 1~2%
>>>
>>> what does wait time look like?
>>>
>>> David Lang
>>>
>>>> ________________________________
>>>> From: David Lang <da...@lang.hm>
>>>> Sent: Sunday, April 24, 2022 10:39 AM
>>>> To: Steven D <pheerl...@hotmail.com>
>>>> Cc: David Lang <da...@lang.hm>; Steven D via rsyslog
>>>> <rsyslog@lists.adiscon.com>
>>>> Subject: Re: [rsyslog] Basic Rsyslog Troubleshooting
>>>>
>>>> On Sun, 24 Apr 2022, Steven D wrote:
>>>>
>>>>> Would setting the KeepAlives in the rsyslog config on the server-side help
>>>>> to manage the (zombie?) TCP connections?
>>>>>
>>>>> * The load balancer being in the middle feels like it's the cause of
>>>>> repeated ESTABLISHED connections, but to keep HA/redundancy it's kind of a
>>>>> necessary evil.
>>>>
>>>> by the way, I think the fact that the load balancer cuts the connection and
>>>> the server doesn't know it's cut and has to wait for it to time out (a very
>>>> long time) is the cause of the large number of ESTABLISHED connections
>>>>
>>>> David Lang
>>>>
>>>
>>
>
_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
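
For reference, two of the suggestions in the thread above could look roughly like this in config form. This is a sketch, not a tested configuration. It assumes the impstats module parameters log.file and log.syslog as the way to write counters directly to a file; the thread only says "see the module options" and does not name them, so that choice is an assumption. It also uses the cache size of 500 that was discussed for ~300 hosts, whereas the config posted at the top of this message sets 50. Paths, names, and ownership values are copied from the thread.

```conf
# Sketch only: have impstats write its counters straight to a file
# instead of routing them through a ruleset, so the stats still show up
# when message processing itself is in trouble.
# Assumption: log.syslog="off" stops the stats from also being injected
# into the normal syslog stream.
module(load="impstats"
       interval="30"
       log.syslog="off"
       log.file="/var/log/rsyslog_pstats.log")

# DynaFileCacheSize is set per action. 500 is the value discussed in the
# thread for ~300 hosts writing at the same time; it should stay above
# the number of files that are active concurrently.
ruleset(name="firewall_rule") {
    action(name="firewall_rule" type="omfile"
           DynaFile="firewall_logs"
           DynaFileCacheSize="500"
           FileCreateMode="0744"
           DirCreateMode="0755"
           FileOwner="loguser"
           FileGroup="loguser"
           DirOwner="loguser"
           DirGroup="loguser")
}
```

Writing the stats file directly bypasses the pstats_rule ruleset, so the FileOwner and FileGroup settings on that action would no longer apply to the stats file; read access to it would need to be arranged separately.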