[ https://issues.apache.org/jira/browse/HADOOP-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12725351#action_12725351 ]
Todd Lipcon commented on HADOOP-6107:
-------------------------------------

It's not so much a thrift back-end for log4j - the issue is that the log messages themselves are unstructured. If you're logging Strings, you're faced with a parsing challenge. If you're logging some kind of Event objects, you can choose to output strings or structured data.

> Have some log messages designed for machine parsing, either real-time or post-mortem
> ------------------------------------------------------------------------------------
>
>                 Key: HADOOP-6107
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6107
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.21.0
>            Reporter: Steve Loughran
>
> Many programs take the log output of bits of Hadoop, and try and parse it. Some may also put their own back end behind commons-logging, to capture the input without going via Log4J, so as to keep the output more machine-readable.
> These programs need log messages that
> # are easy to parse by a regexp or other simple string parse (consider quoting values, etc)
> # push out the full exception chain rather than stringify() bits of it
> # stay stable across versions
> # log the things the tools need to analyse: events, data volumes, errors
> For these logging tools, ease of parsing, retention of data and stability over time take the edge over readability. In HADOOP-5073, Jiaqi Tan proposed marking some of the existing log events as evolving towards stability. As someone who regularly patches log messages to improve diagnostics, this creates a conflict of interest. For me, good logs are ones that help people debug their problems without anyone else helping, and if that means improving the text, so be it. Tools like Chukwa have a different need.
> What to do?
> Some options
> # Have some messages that are designed purely for other programs to handle
> # Have some logs specifically for machines, to which we log alongside the human-centric messages
> # Fix many of the common messages, then leave them alone.
> # Mark log messages to be left alone (somehow)
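
To make the comment's point about logging Event objects rather than Strings concrete, here is a minimal sketch in Java. Everything in it is hypothetical: the class name LogEvent and the methods with, withException, toHumanString and toMachineString are invented for illustration and are not part of Hadoop, commons-logging or log4j. It assumes a simple key="value" line format with quoted values, which is only one way to meet the "easy to parse by a regexp" requirement in the issue description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical structured log event; not an existing Hadoop or log4j type. */
final class LogEvent {
    private final String name;
    // LinkedHashMap keeps insertion order, so field order is stable in the output.
    private final Map<String, Object> fields = new LinkedHashMap<>();

    LogEvent(String name) {
        this.name = name;
    }

    /** Attaches one named value to the event and returns the event for chaining. */
    LogEvent with(String key, Object value) {
        fields.put(key, value);
        return this;
    }

    /** Records the whole cause chain, outermost first, rather than one message. */
    LogEvent withException(Throwable t) {
        StringBuilder chain = new StringBuilder();
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (chain.length() > 0) {
                chain.append(" <- ");
            }
            chain.append(c.getClass().getName()).append(": ").append(c.getMessage());
        }
        return with("exception", chain.toString());
    }

    /** Free-form rendering for people; its wording is free to change. */
    String toHumanString() {
        return name + " " + fields;
    }

    /** Rendering for tools: space-separated key="value" pairs with escaped values. */
    String toMachineString() {
        StringBuilder sb = new StringBuilder();
        sb.append("event=\"").append(escape(name)).append('"');
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            sb.append(' ').append(e.getKey()).append("=\"")
              .append(escape(String.valueOf(e.getValue()))).append('"');
        }
        return sb.toString();
    }

    /** Escapes backslashes, double quotes and newlines so a value stays on one line. */
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }
}
```

With this sketch, new LogEvent("block_received").with("bytes", 1024).toMachineString() yields event="block_received" bytes="1024", while toHumanString() on the same object can be reworded at will without breaking tools that read the machine form. The caller logs one object, and the choice between text and structured output is made at rendering time.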