On Thu, Jan 12, 2012 at 11:01 PM, Florian Haas <flor...@hastexo.com> wrote:
> On Thu, Jan 5, 2012 at 10:15 PM, Florian Haas <flor...@hastexo.com> wrote:
>> Florian Haas (2):
>>   extra: add rsyslog configuration snippet
>>   extra: add logrotate configuration snippet
>>
>>  configure.ac                      |    4 +++
>>  extra/Makefile.am                 |    2 +-
>>  extra/logrotate/Makefile.am       |    5 ++++
>>  extra/logrotate/pacemaker.conf.in |    7 ++++++
>>  extra/rsyslog/Makefile.am         |    5 ++++
>>  extra/rsyslog/pacemaker.conf.in   |   39 +++++++++++++++++++++++++++++++++++++
>>  6 files changed, 61 insertions(+), 1 deletions(-)
>>  create mode 100644 extra/logrotate/Makefile.am
>>  create mode 100644 extra/logrotate/pacemaker.conf.in
>>  create mode 100644 extra/rsyslog/Makefile.am
>>  create mode 100644 extra/rsyslog/pacemaker.conf.in
>
> Any takers on these?

Sorry, I was off working on the new fencing logic and then corosync 2.0
support (when cman and all the plugins, including ours, go away).

So a couple of comments...

I fully agree that the state of our logging needs work, and I can
understand people wanting to keep the vast majority of our logs out of
syslog. I'm less thrilled about one-file-per-subsystem: the cluster will
often do a lot within a single second, and splitting everything up really
hurts the ability to correlate messages.

I'd also suggest that /some/ information not coming directly from the RAs
is still appropriate for syslog (such as "I'm going to move A from B to C"
or "I'm about to turn off node D"), so the nuclear option isn't really
thrilling me.

In addition to the above distractions, I've been coming up to speed on
libqb's logging, which is opening up a lot of new doors and should
hopefully help solve the underlying log issues. For starters, it lets
syslog/stderr/logfile all log at different levels of verbosity (and in
different formats). It also supports blackboxes, a dump of which can be
triggered in response to an error condition or manually by the admin.

The plan is something along the lines of: syslog gets NOTICE and above;
anything else (depending on debug level and trace options) goes to
/var/log/(cluster/?)pacemaker or whatever was configured in corosync.

However, before I can enact that, there will need to be an audit of the
messages currently going to INFO (674 entries) and NOTICE (160 entries),
with some getting bumped up and others down (possibly even to debug). I'd
certainly be interested in feedback as to which logs should and should
not make it.

If you want to get analytical about it, there is also an awk script that
I use when looking at what we log. I'd be interested in some numbers from
the field.
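
To get a feel for what the NOTICE-and-above split would look like on an
existing log, something like the following could be used. This is only a
rough sketch, not how libqb or Pacemaker routes messages: it assumes the
same line layout as the awk script in this mail (level token in field 7),
and the function name and its output-file arguments are made up for
illustration.

```shell
#!/bin/bash
# Sketch only: split an existing log into "would go to syslog"
# (notice and above) and "would go to the logfile" (everything else).
# Assumption: the level token is field 7, as in the awk script in this mail.
# split_by_level and its arguments are hypothetical names.
split_by_level() {
    # $1 = input log, $2 = file for notice and above, $3 = file for the rest
    awk -v hi="$2" -v lo="$3" '
    BEGIN {
        # Level tokens treated as NOTICE or above
        keep["CRIT:"] = 1
        keep["ERROR:"] = 1
        keep["WARN:"] = 1
        keep["notice:"] = 1
    }
    {
        if (NF >= 7 && ($7 in keep)) {
            print > hi
        } else {
            print > lo
        }
    }' "$1"
}
```

Running `split_by_level /var/log/messages syslog.sample logfile.sample` and
comparing the sizes of the two files gives a rough idea of how much would
stay in syslog before any messages are re-levelled.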

-- Andrew

#!/bin/bash
# Count log lines per daemon (field 5) and per log level (field 7).
# Usage: pass the log file to analyse as the first argument.
awk 'BEGIN{
    keys[0] = "openais"
    keys[1] = "heartbeat:"
    keys[2] = "ccm:"
    keys[3] = "lrmd:"
    keys[4] = "crmd:"
    keys[5] = "pengine:"
    keys[6] = "cib:"
    keys[7] = "CTS:"
    keys[8] = "stonithd:"

    # Padding so the columns line up in the report
    format[0] = ""
    format[1] = ""
    format[2] = "\t"
    format[3] = "\t"
    format[4] = "\t"
    format[5] = ""
    format[6] = "\t"
    format[7] = "\t"
    format[8] = ""

    l_format[0] = "\t"
    l_format[1] = "\t"
    l_format[2] = "\t"
    l_format[3] = ""
    l_format[4] = "\t"
    l_format[5] = "\t"

    level[0] = "CRIT:"
    level[1] = "ERROR:"
    level[2] = "WARN:"
    level[3] = "notice:"
    level[4] = "info:"
    level[5] = "debug:"

    max = 9; l_max = 6;

    for (i = 0; i < max; i++) { values[i] = 0 }
    for (i = 0; i < l_max; i++) { l_values[i] = 0 }
}
{
    # Lines too short to carry a daemon name are skipped
    if (NF < 5) { next }

    for (i = 0; i < max; i++) {
        if ($5 == keys[i]) {
            values[i]++
            # Only lines from a known daemon are counted by level
            for (j = 0; j < l_max; j++) {
                if ($7 == level[j]) {
                    l_values[j]++
                    break
                }
            }
            break
        }
    }
}
END{
    total = 0
    for (i = 0; i < max; i++) { total = values[i] + total }

    print "total line number is " total
    # Nothing matched: skip the percentages to avoid dividing by zero
    if (total == 0) { exit }

    print "progs", "\t","\t", "#of lines","\t" "percentage"
    for (i = 0; i < max; i++) {
        print keys[i], format[i],"\t", values[i],"\t\t" values[i]/total*100 "%"
    }

    print "\nLog levels:"
    for (i = 0; i < l_max; i++) {
        print level[i], l_format[i],"\t", l_values[i],"\t \t" l_values[i]/total*100 "%"
    }
}' "$1"

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org