"Andi Clemens" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > Hi, > > we had some problems in the last weeks with our mailserver. > Some messages were not delivered and we wanted to know why. > But looking through the logfile is a time consuming process. > So I wanted to write a parser to analyse the logs and parse them as XML. > <snip>
Andi -

Well, pyparsing does have *some* XML connection, but I don't think it
will be as direct as you might like. Below is a pyparsing program that
will probably parse 90% of your log messages and give you some
easy-to-access data fields. You can use these to build your own Python
data structures, such as a dict keyed by queue id, a dict keyed by
message-id, etc., and then navigate through them to generate your XML.

-- Paul


logdata = """\
Sep 18 04:15:22 mailrelay postfix/cleanup[12103]: 755387301: message-id=<[EMAIL PROTECTED]>
Sep 18 04:15:22 mailrelay spamd[1364]: spamd: processing message <[EMAIL PROTECTED]> for nobody:65534
Sep 18 04:15:25 mailrelay spamd[1364]: spamd: result: Y 15 - BAYES_99,DATE_IN_PAST_03_06,DNS_FROM_RFC_ABUSE,DNS_FROM_RFC_DSN,DNS_FROM_RFC_POST,DNS_FROM_RFC_WHOIS,FORGED_MUA_OUTLOOK,SPF_SOFTFAIL scantime=3.1,size=8086,user=nobody,uid=65534,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=55277,mid=<[EMAIL PROTECTED]>,bayes=1,autolearn=no
Sep 18 04:15:25 mailrelay postfix/cleanup[12074]: DA1431965E: message-id=<[EMAIL PROTECTED]>
Sep 18 04:15:26 mailrelay postfix/cleanup[13057]: EF90720AD: message-id=<[EMAIL PROTECTED]>
Sep 18 04:15:26 mailrelay postfix/smtp[10879]: EF90720AD: to=<[EMAIL PROTECTED]>, relay=10.49.0.7[10.49.0.7], delay=1, status=sent (250 2.6.0 <[EMAIL PROTECTED]> Queued mail for delivery)
Sep 18 02:15:11 mailrelay postfix/smtpd[10841]: 755387301: client=unknown[194.25.242.123]
Sep 18 04:15:22 mailrelay postfix/cleanup[12103]: 755387301: message-id=<[EMAIL PROTECTED]>
Sep 18 04:15:22 mailrelay postfix/qmgr[11082]: 755387301: from=<[EMAIL PROTECTED]>, size=8152, nrcpt=7 (queue active)
Sep 18 04:15:25 mailrelay postfix/pipe[11659]: 755387301: to=<[EMAIL PROTECTED]>, relay=procmail, delay=14, status=sent (filter)
Sep 18 04:15:25 mailrelay postfix/pipe[11659]: 755387301: to=<[EMAIL PROTECTED]>, relay=procmail, delay=14, status=sent (filter)
Sep 18 04:15:25 mailrelay postfix/pipe[11659]: 755387301: to=<[EMAIL PROTECTED]>, relay=procmail, delay=14, status=sent (filter)
Sep 18 04:15:25 mailrelay postfix/pipe[11659]: 755387301: to=<[EMAIL PROTECTED]>, relay=procmail, delay=14, status=sent (filter)
Sep 18 04:15:25 mailrelay postfix/qmgr[11082]: 755387301: removed
Sep 18 04:15:25 mailrelay postfix/pickup[13175]: DA1431965E: uid=65534 from=<nobody>
Sep 18 04:15:25 mailrelay postfix/cleanup[12074]: DA1431965E: message-id=<[EMAIL PROTECTED]>
Sep 18 04:15:25 mailrelay postfix/qmgr[11082]: DA1431965E: from=<[EMAIL PROTECTED]>, size=11074, nrcpt=1 (queue active)
Sep 18 04:15:26 mailrelay postfix/smtp[11703]: DA1431965E: to=<[EMAIL PROTECTED]>, relay=localhost[127.0.0.1], delay=1, status=sent (250 Ok: queued as EF90720AD)
Sep 18 04:15:26 mailrelay postfix/qmgr[11082]: DA1431965E: removed
Sep 18 04:15:25 mailrelay postfix/smtpd[11704]: EF90720AD: client=localhost[127.0.0.1]
Sep 18 04:15:26 mailrelay postfix/cleanup[13057]: EF90720AD: message-id=<[EMAIL PROTECTED]>
Sep 18 04:15:26 mailrelay postfix/smtp[11703]: DA1431965E: to=<[EMAIL PROTECTED]>, relay=localhost[127.0.0.1], delay=1, status=sent (250 Ok: queued as EF90720AD)
Sep 18 04:15:26 mailrelay postfix/qmgr[11082]: EF90720AD: from=<[EMAIL PROTECTED]>, size=11263, nrcpt=1 (queue active)
Sep 18 04:15:26 mailrelay postfix/smtp[10879]: EF90720AD: to=<[EMAIL PROTECTED]>, relay=10.49.0.7[10.49.0.7], delay=1, status=sent (250 2.6.0 <[EMAIL PROTECTED]> Queued mail for delivery)
Sep 18 04:15:26 mailrelay postfix/qmgr[11082]: EF90720AD: removed
""".split('\n')

from pyparsing import *

month = oneOf("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec")
dayOfMonth = Word(nums,max=2)
timeOfDay = Combine(Word(nums,exact=2)+":"+
                    Word(nums,exact=2)+":"+Word(nums,exact=2))
timeStamp = month + dayOfMonth + timeOfDay

# may need to expand this if log contains other entries in this field
source = Literal("mailrelay")

emailAddr = QuotedString("<",endQuoteChar=">")
ipAddr = Combine(Word(nums)+"."+Word(nums)+"."+\
                 Word(nums)+"."+Word(nums))
ipRef = ( "localhost" | ipAddr ) + "[" + ipAddr + "]"
command = Combine(Word(alphas) + Optional("/" + Word(alphas)))
pid = "[" + Word(nums) + "]"
queueId = Word(hexnums)
integer = Word(nums)
msgValue = ( integer | emailAddr | ipRef | Word(alphas) ) + \
            Optional( QuotedString("(",endQuoteChar=")") )
nvList = Dict(delimitedList( Group( Word(alphas+"-") + Suppress("=") + msgValue ) ))
msgBody = "removed" | nvList
spamdMsg = "spamd:" + restOfLine
regularMsg = queueId.setResultsName("queueId") + ":" + \
            msgBody.setResultsName("body")
logMessage = timeStamp + source + command.setResultsName("command") +\
            pid.setResultsName("pid") + ":" + (spamdMsg | regularMsg)

# parse each line in log
for log in logdata:
    if log:
        results = logMessage.parseString(log)
        print results.dump()
        for fieldName in "message-id queueId from to".split():
            print fieldName,":",
            try:
                print results[fieldName]
            except KeyError,ke:
                print


Prints out (excerpt):

- body: ['message-id', '[EMAIL PROTECTED]']
- command: postfix/cleanup
- message-id: [EMAIL PROTECTED]
- pid: ['[', '13057', ']']
- queueId: EF90720AD
['Sep', '18', '04:15:26', 'mailrelay', 'postfix/cleanup', '[', '13057', ']', ':', 'EF90720AD', ':', ['message-id', '[EMAIL PROTECTED]']]
message-id : [EMAIL PROTECTED]
queueId : EF90720AD
from :
to :
- body: ['to', '[EMAIL PROTECTED]']
- command: postfix/smtp
- pid: ['[', '10879', ']']
- queueId: EF90720AD
- relay: 10
- to: [EMAIL PROTECTED]
['Sep', '18', '04:15:26', 'mailrelay', 'postfix/smtp', '[', '10879', ']', ':', 'EF90720AD', ':', ['to', '[EMAIL PROTECTED]'], ['relay', '10']]
message-id :
queueId : EF90720AD
from :
to : [EMAIL PROTECTED]
- body: ['client', 'unknown']
- client: unknown
- command: postfix/smtpd
- pid: ['[', '10841', ']']
- queueId: 755387301
['Sep', '18', '02:15:11', 'mailrelay', 'postfix/smtpd', '[', '10841', ']', ':', '755387301', ':', ['client', 'unknown']]
message-id :
queueId : 755387301
from :
to :
- body: ['message-id', '[EMAIL PROTECTED]']
- command: postfix/cleanup
- message-id: [EMAIL PROTECTED]
- pid: ['[', '12103', ']']
- queueId: 755387301
['Sep', '18', '04:15:22', 'mailrelay', 'postfix/cleanup', '[', '12103', ']', ':', '755387301', ':', ['message-id', '[EMAIL PROTECTED]']]
message-id : [EMAIL PROTECTED]
queueId : 755387301
from :
to :
- body: ['from', '[EMAIL PROTECTED]']
- command: postfix/qmgr
- from: [EMAIL PROTECTED]
- nrcpt: ['7', 'queue active']
- pid: ['[', '11082', ']']
- queueId: 755387301
- size: 8152
['Sep', '18', '04:15:22', 'mailrelay', 'postfix/qmgr', '[', '11082', ']', ':', '755387301', ':', ['from', '[EMAIL PROTECTED]'], ['size', '8152'], ['nrcpt', '7', 'queue active']]
message-id :
queueId : 755387301
from : [EMAIL PROTECTED]
to :

-- 
http://mail.python.org/mailman/listinfo/python-list
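
A minimal sketch of the "dict keyed by queue id, then generate your XML"
step mentioned in the reply, under some loud assumptions. It is not part
of the posted program. The records are hand-written dicts standing in
for parsed results (values copied from the sample log), the helper names
group_by_queue_id and to_xml are hypothetical, and the element names
maillog, message and event are made up for illustration. It uses only the
standard library's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical stand-ins for parsed log lines: one dict per line, holding
# the queue id, the command, and the name=value pairs from the body.
records = [
    {"queueId": "755387301", "command": "postfix/qmgr",
     "size": "8152", "nrcpt": "7"},
    {"queueId": "755387301", "command": "postfix/pipe",
     "relay": "procmail", "delay": "14", "status": "sent"},
    {"queueId": "EF90720AD", "command": "postfix/qmgr",
     "size": "11263", "nrcpt": "1"},
]

def group_by_queue_id(records):
    """Collect records into a dict keyed by queue id, keeping log order."""
    by_queue = defaultdict(list)
    for rec in records:
        by_queue[rec["queueId"]].append(rec)
    return by_queue

def to_xml(by_queue):
    """Build one <message> element per queue id, one <event> per record."""
    root = ET.Element("maillog")  # element names are illustrative only
    for queue_id, events in by_queue.items():
        msg = ET.SubElement(root, "message", queueId=queue_id)
        for rec in events:
            # every field except the queue id becomes an attribute
            attrs = {k: v for k, v in rec.items() if k != "queueId"}
            ET.SubElement(msg, "event", attrs)
    return root

by_queue = group_by_queue_id(records)
xml_text = ET.tostring(to_xml(by_queue), encoding="unicode")
print(xml_text)
```

Each queue id ends up as one message element with its log events as
children in log order, so following a single mail through qmgr, pipe or
smtp means reading one element rather than scanning the whole log.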