Hi, I am trying to get a pyparsing grammar working, but I keep failing.
I have a format which consists of three kinds of elements:

  * \d{4}M?-\d (four digits, an optional M, a dash, one more digit)
  * EMPTY (the <EMPTY> token)
  * [empty line] (the <PAGEBREAK> token; the line may contain whitespace, but nothing else)

While ``watchname`` and ``leaveempty`` were trivial, I cannot get ``pagebreak`` to work properly.

    #!/usr/bin/env python
    # -*- coding: UTF-8 -*-

    from pyparsing import (Word, Literal, Optional, Group, OneOrMore, Regex,
                           Combine, ParserElement, nums, LineStart, LineEnd,
                           White, replaceWith)

    # Newlines are significant here, so '\n' is left out of the skipped whitespace.
    ParserElement.setDefaultWhitespaceChars(' \t\r')

    watchseries = Word(nums, exact=4)
    watchrev = Word(nums, exact=1)

    watchname = Combine(watchseries + Optional('M') + '-' + watchrev)

    leaveempty = Literal('EMPTY')

    def breaks(s, loc, tokens):
        print repr(tokens[0])
        #return ['<PAGEBREAK>' for token in tokens[0]]
        return ['<PAGEBREAK>']

    #pagebreak = Regex('^\s*$').setParseAction(breaks)
    pagebreak = LineStart() + LineEnd().setParseAction(replaceWith('<PAGEBREAK>'))

    parser = OneOrMore(watchname ^ pagebreak ^ leaveempty)

    tests = [
        "2134M-2",
        """3245-3
3456M-5""",
        """3256-4

4563-4""",
        """4562M-6
EMPTY
3246-5"""
    ]

    for test in tests:
        print parser.parseString(test)

The output should be:

    ['2134M-2']
    ['3245-3', '3456M-5']
    ['3256-4', '<PAGEBREAK>', '4563-4']
    ['4562M-6', '<EMPTY>', '3246-5']

Thanks in advance!

regards,
Marek
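
P.S. To make the expected behaviour concrete, here is a minimal sketch of the tokenization I am after. It uses the standard ``re`` module instead of pyparsing, so it only serves as a reference for the desired output and does not solve the pyparsing problem itself. It assumes exactly one token per line, which the format description does not strictly require; ``tokenize`` and ``WATCHNAME`` are names made up for this illustration.

```python
import re

# Assumption: every line holds exactly one token.
# WATCHNAME and tokenize are illustrative names, not part of pyparsing.
WATCHNAME = re.compile(r'\d{4}M?-\d$')


def tokenize(text):
    """Return one token per input line, following the three rules above."""
    tokens = []
    for line in text.splitlines():
        item = line.strip()
        if not item:
            # Empty or whitespace-only line
            tokens.append('<PAGEBREAK>')
        elif item == 'EMPTY':
            tokens.append('<EMPTY>')
        elif WATCHNAME.match(item):
            # Four digits, optional M, dash, one digit
            tokens.append(item)
        else:
            raise ValueError('unexpected line: %r' % line)
    return tokens


for test in ["2134M-2", "3245-3\n3456M-5", "3256-4\n\n4563-4",
             "4562M-6\nEMPTY\n3246-5"]:
    print(tokenize(test))
```

Anything that matches none of the three rules raises ``ValueError``, so malformed input is not silently skipped.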