Re: [rsyslog] Higher than High performance Rsyslog advice/suggestions?

Ben Hart via rsyslog Thu, 20 Jul 2023 07:31:45 -0700

Thanks for the clarification and additional info David.



From: David Lang <da...@lang.hm>
Date: Monday, July 17, 2023 at 12:41 PM
To: Ben Hart <ben.h...@jamf.com>
Cc: David Lang <da...@lang.hm>, Ben Hart via rsyslog <rsyslog@lists.adiscon.com>
Subject: Re: [rsyslog] Higher than High performance Rsyslog advice/suggestions?
in terms of rsyslog scale, I've run rsyslog where it saturated Gb ethernet and
had it keep up without a problem on reasonably modest hardware (300k logs/sec)
and in stripped down configs on faster networks, others have hit 1m logs/sec

500kb of logs every hour is pretty trivial, even 500k messages/hour is ~8k
logs/min or ~140 logs/sec.
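
As a quick check of that arithmetic, here is a minimal sketch in Python. The function name `rates` is only illustrative, and the 500,000 figure is the assumed "500k messages/hour" reading from above.

```python
def rates(messages_per_hour):
    """Convert an hourly message count into per-minute and per-second rates."""
    per_min = messages_per_hour / 60
    per_sec = messages_per_hour / 3600
    return per_min, per_sec


# Assumption: "500k" is read as 500,000 messages per hour.
per_min, per_sec = rates(500_000)
print(f"{per_min:.0f} logs/min, {per_sec:.0f} logs/sec")
# prints: 8333 logs/min, 139 logs/sec
```

That is more than three orders of magnitude below the 300k logs/sec figure mentioned above.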

and since a queue on the ruleset with an input bound to that ruleset is almost
the same as running a completely separate instance of rsyslog for that ruleset,
things become even more trivial

so there should be no need to configure a thread count >1 anywhere with this
volume


Mon Jul 17 09:08:11 2023: f_all: origin=core.queue size=0 enqueued=0 full=0 
discarded.full=0 discarded.nf=0 maxqsize=0

enqueued means 'added to the queue'

size means 'number of items in the queue at the time of sample'

maxqsize means 'max number of items in the queue since startup'

full means 'the number of times things could not be added to the queue because 
it was full'

discarded* means 'the number of logs thrown away because the queue was too full'
(this is based on the watermark settings, not something that happens by default)

everything but size is a running total
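
To make the "running total" point concrete, here is a rough Python sketch that parses one impstats line in the format quoted above and turns two samples into per-interval numbers. The helper names `parse_impstats` and `interval_view` are made up for illustration, and the `prev`/`curr` values are invented, not taken from the attached logs.

```python
import re


def parse_impstats(line):
    """Split one impstats line (format as quoted in this thread) into
    (name, origin, counters). Assumes a 'timestamp: name: origin=...' layout."""
    head, _, tail = line.partition(": origin=")
    name = head.split(": ", 1)[1]  # drop the leading timestamp
    fields = dict(re.findall(r"(\S+)=(\S+)", "origin=" + tail))
    origin = fields.pop("origin")
    counters = {key: int(value) for key, value in fields.items()}
    return name, origin, counters


# size is a point-in-time value and maxqsize is a since-startup high-water
# mark, so neither is subtracted; the other counters are running totals.
NOT_SUBTRACTED = {"size", "maxqsize"}


def interval_view(prev, curr):
    """Return what happened between two samples of the same queue."""
    return {
        key: curr[key] if key in NOT_SUBTRACTED else curr[key] - prev[key]
        for key in curr
    }


sample = (
    "Mon Jul 17 09:08:11 2023: f_all: origin=core.queue size=0 enqueued=0 "
    "full=0 discarded.full=0 discarded.nf=0 maxqsize=0"
)
name, origin, counters = parse_impstats(sample)

# Hypothetical samples, invented purely for illustration.
prev = {"size": 0, "enqueued": 1000, "full": 0, "maxqsize": 12}
curr = {"size": 3, "enqueued": 1500, "full": 0, "maxqsize": 12}
delta = interval_view(prev, curr)
print(name, origin, delta)
```

In the invented example, 500 messages were enqueued during the interval and only 3 are still waiting, so a growing `enqueued` on its own says nothing about a backlog.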


Mon Jul 17 09:08:11 2023: imudp(*/20525/IPv4): origin=imudp submitted=35802111 
disallowed=0

submitted means 'the number of log messages that arrived via this port'


Mon Jul 17 09:08:11 2023: dynafile cache d_wlc: origin=omfile requests=0 
level0=0 missed=0 evicted=0 maxused=0 closetimeouts=0

watch these lines for missed/evicted to become large. If they shoot up you need
to set the dynafilecachesize larger (if you have something like dates in your
template, each time the date changes you will see a miss and eventually an
eviction. But if the cache size is smaller than the working set that you will be
actively writing to, these will be huge and performance will plummet)
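
One way to watch for this is to compare two samples of the dynafile cache counters, sketched below in Python. The function name, the 10% threshold and all sample numbers are assumptions chosen for illustration; they are not rsyslog defaults or values from this thread.

```python
def dynafile_cache_pressure(prev, curr, max_miss_ratio=0.10):
    """Compare two samples of the dynafile cache counters.

    The 10% miss-ratio threshold is an arbitrary illustrative cutoff.
    """
    requests = curr["requests"] - prev["requests"]
    missed = curr["missed"] - prev["missed"]
    evicted = curr["evicted"] - prev["evicted"]
    miss_ratio = missed / requests if requests else 0.0
    return {
        "requests": requests,
        "missed": missed,
        "evicted": evicted,
        "miss_ratio": miss_ratio,
        # Misses that also force evictions suggest the cache is smaller
        # than the set of files being actively written.
        "too_small": miss_ratio > max_miss_ratio and evicted > 0,
    }


# Hypothetical samples, invented for illustration only.
before = {"requests": 10_000, "missed": 5, "evicted": 0}
healthy = {"requests": 20_000, "missed": 7, "evicted": 1}
thrashing = {"requests": 20_000, "missed": 9_005, "evicted": 9_000}

ok = dynafile_cache_pressure(before, healthy)
bad = dynafile_cache_pressure(before, thrashing)
print(ok["too_small"], bad["too_small"])
# prints: False True
```

The occasional miss and eviction in the "healthy" case is the date-rollover pattern described above; the "thrashing" case is the one that calls for a larger cache.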


Mon Jul 17 09:08:11 2023: action-12-builtin:omfile: origin=core.action 
processed=143880635 failed=0 suspended=0 suspended.duration=0 resumed=0

processed means 'the number of log messages it has handled'
suspended means 'the number of times it's stopped processing'
failed means 'the number of times the connection has just failed'
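
A small Python sketch of how one might scan action counters for trouble follows. The function name `action_problems` is hypothetical. The first dictionary holds the values from the omfile line above, and the second is invented.

```python
def action_problems(counters):
    """Return the names of the trouble counters that are non-zero."""
    watched = ("failed", "suspended", "suspended.duration")
    return [key for key in watched if counters.get(key, 0) > 0]


# Values copied from the action-12-builtin:omfile line quoted above.
omfile = {
    "processed": 143880635,
    "failed": 0,
    "suspended": 0,
    "suspended.duration": 0,
    "resumed": 0,
}

# Hypothetical action that has suspended a few times (invented numbers).
flaky = {
    "processed": 1000,
    "failed": 0,
    "suspended": 3,
    "suspended.duration": 90,
    "resumed": 3,
}

print(action_problems(omfile))  # prints: []
print(action_problems(flaky))   # prints: ['suspended', 'suspended.duration']
```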



your iostat output didn't include the extended information from -x, one of those
items is the percent utilization of the disk. your numbers look low enough that
I wouldn't expect there to be any significant problem.

cpu utilization looks trivial

David Lang

On Mon, 17 Jul 2023, Ben Hart wrote:

> Much appreciated David!
>
> I had been searching for this `enqueued` term and found almost nothing.. I’m 
> glad to hear that’s just more of a running tally of items queued and not so 
> much indicative of queued-but-unprocessed-items.
> Glad to hear I was on the right track in the beginning by going with the 
> ruleset with individual queues.
>
> So here’s the situation: This UF host receives and forwards log data to 
> Splunk Cloud from networking devices that are unable to communicate with an 
> HTTP SplunkCloud listener.
> Networking reported data missing from SplunkCloud, so I headed off to this 
> host and started poking around. The Rsyslog daemon was running, with no 
> obvious errors that I could see. The Universal Forwarder was the same, 
> although I admit it’s harder to find potential performance issues in the UF, 
> especially when you only have visibility from one side (I have no access to 
> SC directly).
>
> Anyway.. the data coming into SC was kinda sporadic, and I did not know what 
> enqueued meant. To me en-queued would mean ‘in the queue’, you know? That 
> figure kept growing and growing, so I went looking for high performance tips 
> for rsyslog.
> The two largest (and most important) log files grow by roughly 50k 
> (firewall.log) and 500k (meraki.log) every hour. To me that’s pretty high.. 
> to those more experienced with Rsyslog possibly not.
>
> In any case I was just wanting to make sure I had the best possible 
> performing Rsyslog config I could get.   The info you requested is attached, 
> maybe it shows that I’m worried over nothing, or maybe it shows I have 
> resources for improvement.
>
> Thanks!
>
> From: David Lang <da...@lang.hm>
> Date: Friday, July 14, 2023 at 12:26 PM
> To: Ben Hart via rsyslog <rsyslog@lists.adiscon.com>
> Cc: Ben Hart <ben.h...@jamf.com>
> Subject: Re: [rsyslog] Higher than High performance Rsyslog 
> advice/suggestions?
> enqueued is a running total of how many messages have been put in the queue
> since you restarted (unless you configure impstats to reset its counters each
> run, but that can lose some data due to race conditions)
>
> it's sad but true that most attempts to optimize rsyslog actually end up
> hurting performance more than they help, and rsyslog with simple configs is
> frequently fast enough to not need any optimization.
>
> having too many threads and too many queues can actually slow you down.
>
> with omfile for example, the overhead of locking the queue with one thread,
> inserting the message, unlocking the queue and then locking the queue with a
> different thread, marking that you are starting to work on the message,
> unlocking the queue, locking the queue, marking that you processed the
> message and unlocking the queue absolutely dwarfs the cost of just writing
> the log to disk
>
> multiple threads can also cause more locking overhead. you should only
> increase threads if your measurements show that you have a thread maxing out
> a core (top, then hit H to show threads, see if any thread is hitting 100%
> cpu)
>
> multiple threads when you are using omfile is even worse, as omfile then has
> to do locking itself to prevent the multiple threads from writing at the same
> time.
>
> you only want to use threads when you have expensive processing (which can be
> a bad template, but there are ways to improve that)
>
> now, a queue on a ruleset that is being tied to an input is a bit different,
> that queue then replaces the use (and locking) of the main queue and can be a
> win.
>
> the bigger win is usually just increasing the batch size, but increasing the
> size produces diminishing returns, above a few hundred to a few thousand is
> seldom useful
>
>
> What is the volume of logs you are trying to process? what is making you think
> you need to change things to improve performance?
>
> please show a couple rounds of impstats output under load, and ideally a
> snapshot of top (with H to show the threads), and iostat -cdtyz 10 or
> something similar to show the disk activity during this time.
>
> David Lang
> Caution: This email originated from outside of Jamf. DO NOT click on links or 
> open attachments unless you were expecting, recognize, and know the content 
> is safe.
>
_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.
