Hi, I am trying to split up the re pattern for the Apache log file format across several lines, and I am having trouble getting Python to understand the multi-line pattern:

#!/usr/bin/python

import re
#this is a single line
string = '192.168.122.3 - - [29/Sep/2013:03:52:33 -0700] "GET / HTTP/1.0" 302 276 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"'
#trying to break up the pattern match for easy to read code
pattern = re.compile(r'(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+'
                     r'(?P<ident>\-)\s+'
                     r'(?P<username>\-)\s+'
                     r'(?P<TZ>\[(.*?)\])\s+'
                     r'(?P<url>\"(.*?)\")\s+'
                     r'(?P<httpcode>\d{3})\s+'
                     r'(?P<size>\d+)\s+'
                     r'(?P<referrer>\"\")\s+'
                     r'(?P<agent>\((.*?)\))')

match = re.search(pattern, string)
if match:
    print match.group('ip')
else:
    print 'not found'

The Python interpreter skips straight to the 'match = re.search' line, and then to the 'if' statement, right after it looks at the <ip> line, instead of moving on to <ident> and so on.

mybox:~ user$ python -m pdb /Users/user/Documents/Python/apache.py
> /Users/user/Documents/Python/apache.py(3)<module>()
-> import re
(Pdb) n
> /Users/user/Documents/Python/apache.py(5)<module>()
-> string = '192.168.122.3 - - [29/Sep/2013:03:52:33 -0700] "GET / HTTP/1.0" 302 276 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"'
(Pdb) n
> /Users/user/Documents/Python/apache.py(7)<module>()
-> pattern = re.compile(r'(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+'
(Pdb) n
> /Users/user/Documents/Python/apache.py(17)<module>()
-> match = re.search(pattern, string)
(Pdb)

Thank you.
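
For reference, a minimal sketch of a variant that matches the sample line above. Adjacent string literals are joined by Python into a single string, so the whole re.compile(...) call is one statement, and pdb's "n" steps over it in one go. The sample line has "-" as the referrer and a quoted agent string, while the pattern above expects an empty "" referrer and an agent that starts with a parenthesis. The group names "timestamp" and "request", and the relaxed \S+ for ident and username, are assumptions made for illustration, not a definitive Apache log parser.

```python
import re

# The sample log line from above, split only for readability.
line = ('192.168.122.3 - - [29/Sep/2013:03:52:33 -0700] "GET / HTTP/1.0" '
        '302 276 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"')

# Adjacent raw-string literals are concatenated into ONE pattern string,
# so this re.compile(...) call is a single statement.
pattern = re.compile(
    r'(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+'
    r'(?P<ident>\S+)\s+'              # assumption: any non-space token
    r'(?P<username>\S+)\s+'           # assumption: any non-space token
    r'\[(?P<timestamp>[^\]]+)\]\s+'   # hypothetical group name
    r'"(?P<request>[^"]*)"\s+'        # hypothetical group name
    r'(?P<httpcode>\d{3})\s+'
    r'(?P<size>\d+|-)\s+'
    r'"(?P<referrer>[^"]*)"\s+'       # accepts "-" as well as ""
    r'"(?P<agent>[^"]*)"'             # the agent is quoted in the sample
)

match = pattern.search(line)
if match:
    print(match.group('ip'))
    print(match.group('agent'))
else:
    print('not found')
```

The print(...) calls with a single argument behave the same under Python 2 and Python 3.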