On Tue, 14 May 2013, Chris Bartram wrote:
We are in the planning stages of setting up an rsyslog server pool to
accommodate syslog streams from a couple thousand *nix servers, including
auditd-type data and potentially some application logs (so it's going to be a
VERY high volume of data), and we're looking to archive this data somewhere. We
have a 10Gb network infrastructure, and I can throw as many RHEL machines at
it as needed (as well as F5 load balancers in front).
Eventually the data may need to be searched, but highest priority is getting
it written somewhere quickly (and reliably - we need to minimize any possible
data loss so our archives can stand up to auditing requirements). In that
regard, any suggestions on file systems that can handle that kind of load?
Ideally we want all the log files written to the same storage somewhere - i.e.
we don't want to have to consolidate files from separate locations to search
all the log files for some specific host. On the other hand, we could perhaps
split up the load by source subnet and route specific machines to specific rsyslog
clusters to ease the load on any one cluster (though our larger subnets still
may have around 1,000 systems reporting), as long as it's easy to identify
where to look for data from a given host.
I welcome any advice on setups that allow multiple concurrent (active) rsyslog
servers writing to a common-ish file system as well as any gotchas or
performance benchmarks we can use to help plan the system.
Do you have any idea what sort of data volume you are talking about here?
You say "VERY high volume of data", but different people define that in
different ways :-)
I've built a system to handle 100K log messages/sec and gave a presentation on
it at LISA in December; the video, paper, and slides are available at
https://www.usenix.org/conference/lisa12/building-100k-logsec-logging-infrastructure
When I built it, I didn't have access to any 10G equipment, so I could only test
things up to ~380K log messages/sec.
At work I am currently part of a team defining how to take what I built for what
was an 800-person company (with an extremely large web presence) when it was
acquired and scale it up to the 8,000-person company that acquired us. As part of
this, one of the other people tried to scare me about the total log volume by
saying that they handled 2B log messages in a month. I laughed and showed him
that my small subset of the business handled 18B log messages that same month,
without any of my systems breathing hard (other than the nightly log reporting
run, which will peg any server you use for it; the question is just how long it
will peg it :-)
Based on my experience, unless you have a lot more logs than I expect from only
a couple thousand servers, I don't think you need to do anything fancy. A
mid-range system with a modest RAID running XFS or ext4 should be able to handle
your log volume without a problem (well, you want it to be an HA pair of systems,
but only one needs to be active at a time).
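
To make this concrete, below is roughly the sort of receiver config I'd start
from (a minimal, untested sketch assuming rsyslog v7+; the paths, port, and
sizes are placeholders you'd adjust to taste). Writing one dynafile per sending
host also keeps it obvious where to look for any given machine's logs:

  # where disk-assisted queues keep their spool files
  global(workDirectory="/var/spool/rsyslog")

  module(load="imtcp")
  input(type="imtcp" port="514")

  # one file per sending host per day
  template(name="PerHostFile" type="string"
           string="/srv/logs/%HOSTNAME%/%$YEAR%-%$MONTH%-%$DAY%.log")

  # async buffered writes for throughput, and a disk-assisted queue so messages
  # spill to the spool (rather than being dropped) if the writer falls behind
  action(type="omfile" dynaFile="PerHostFile"
         asyncWriting="on" ioBufferSize="64k" flushOnTXEnd="off"
         queue.type="LinkedList" queue.filename="archive_q"
         queue.maxDiskSpace="10g" queue.saveOnShutdown="on")

The disk-assisted queue is what covers your "minimize data loss" requirement on
the receiving side; the filesystem underneath it really is the boring part.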
If you are comfortable going into more details in public, we can continue the
discussion here on the list. If not, contact me directly.
David Lang