2013/6/18 Mahesh V <[email protected]>

> Folks,
>
> I urgently need your help in finalizing my logging architecture.
>
> I am almost there in terms of getting rsyslog and elasticsearch up and
> running.
>
> I get a performance of 4000 odd messages per second for a short test,
> but I dont know if this will sustain for an hour long tests.
>
As your data set gets bigger, your indexing speed will slow down because of
merging <http://www.elasticsearch.org/guide/reference/index-modules/merge/>.
You'll have to see what works for your use-case.

Also, as your dataset gets bigger, ES will need more memory for searches.
Make sure you give ES enough memory
<http://www.elasticsearch.org/guide/reference/setup/installation/>. The rule
of thumb is to give it half of your RAM, but it's better to just test how
much it needs by monitoring it with something like SPM
<http://sematext.com/spm/elasticsearch-performance-monitoring/>. Then you can
give it what it needs, plus some buffer so that the garbage collector won't
keep your CPU busy all the time. I'd say +50% of what it needs is fine.
Again, rule of thumb :)

>
> So here is what I request you to help me with.
>
> 1) rsyslog does not seem to write to elasticsearch when running as a
> service.
> If I run using rsyslogd -nd, it seems to work. Need help in rectifying
> this.
>

Answered in the other thread.

>
> 2) If I use rsyslog, there is only one field ("message") which has the
> complete log.
> If I want to split the log before sending it to elasticsearch or split
> it after it reaches elasticsearch,
> how can I do it?
>

I'd say you should split it before sending it to ES. If you can't, you can
index the whole message as "not_analyzed", like I suggested in the other
thread, and you can search with wildcards
<http://www.elasticsearch.org/guide/reference/query-dsl/wildcard-query/>.
But that's very expensive, so I'd do that only as a last resort or as a
temporary solution until you find a better one.

On the "how" front, you have some options:
- if you can change the way your apps log, make them use RFC5424
  <http://tools.ietf.org/html/rfc5424> or CEE (here's a blog post about it:
  <http://blog.sematext.com/2013/05/28/structured-logging-with-rsyslog-and-elasticsearch/>).
  This way, rsyslog can easily parse those messages and you can put the
  fields it finds in your JSON and you're done (see the blog post for how)
- if you can't change the logs, you'll have to parse them. I assume
  mmnormalize <http://www.rsyslog.com/doc/mmnormalize.html> can help you
  with that, but I've never used it, so I can't say more than this link:
  http://www.rsyslog.com/normalizer-first-steps-for-mmnormalize/

Maybe you can give it a shot and ask here if you need help.

>
> e.g. my log can be
> "ip=1.1.1.1 name=abcd loglevel=3 this is a test message"
> I would like to later, query based on ip address or name using curl
> (CLI)
>
> 3) what other parameters can I tune to get even better performance.
> I might have maxed out in disk inserts, but I would like to tune every
> possible parameter
> before I conclude, this is the max I can get.
> (I havent tried bulk_mode yet -- will try shortly)
>

Answered in the other thread. If you need more advice than what's already in
there, please come back with your configuration (ES and rsyslog), and make
sure you monitor your server and see what maxes out (CPU, I/O...) and why
(merges?).

I would say you shouldn't over-optimize, because if you try too hard on
indexing you will hurt searching. Sometimes it's just easier to slap a new
server onto your cluster :) Or two, if you need HA as well.

>
>
> Thanks a lot for being patient
>
> regards
> Mahesh
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> DON'T LIKE THAT.
>
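
To make the "split it before sending it to ES" advice concrete for the
quoted example ("ip=1.1.1.1 name=abcd loglevel=3 this is a test message"),
here is a rough Python sketch of the kind of splitting involved. It is only
an illustration: the function name split_kv_log and the "message" key for
the leftover text are made up for this example, and it assumes the key=value
pairs come first, are separated by spaces, and have no spaces in the values.
In practice the parsing would more likely live in the application or in
rsyslog itself; this just shows the shape of the JSON you would want ES to
receive.

```python
import json
import re

# One leading key=value token: key is word characters, value has no spaces.
# Assumption: all pairs come first and the free-text message follows them.
_PAIR = re.compile(r"(\w+)=(\S+)\s*")


def split_kv_log(line):
    """Split 'k1=v1 k2=v2 free text' into a dict of fields plus 'message'."""
    fields = {}
    pos = 0
    while True:
        m = _PAIR.match(line, pos)
        if m is None:
            break  # first token that is not key=value starts the free text
        fields[m.group(1)] = m.group(2)
        pos = m.end()
    # Whatever is left is kept under a (hypothetical) "message" field.
    fields["message"] = line[pos:]
    return fields


if __name__ == "__main__":
    doc = split_kv_log("ip=1.1.1.1 name=abcd loglevel=3 this is a test message")
    # Print the JSON document that would be indexed instead of one big string.
    print(json.dumps(doc, sort_keys=True))
```

With ip and name indexed as their own fields, queries can target them
directly instead of running wildcards over the whole message.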

