John Machin wrote:
> On 13/05/2006 7:39 PM, Paddy wrote:
> [snip]
> > Extension; named RE variables, with arguments
> > ===================================
> > In this, all group definitions in the body of the variable definition
> > reference the literal contents of groups appearing after the variable
> > name, (but within the variable reference), when the variable is
> > referenced
> >
> > So an RE variable definition like:
> >   defs = r'(?smx) (?P/GO/ go \s for \s \1 )'
> >
> > Used like:
> >   rgexp = defs + r"""
> >     (?P=GO (it) )
> >     \s+
> >     (?P=\GO (broke) )
> >     """
> > Would match the string:
> >   "go for it go for broke"
> >
> > As would:
> >   defs2 = r'(?smx) (?P/GO/ go \s for \s (?P=subject) )'
> >   rgexp = defs2 + r"""
> >     (?P=GO (?P<subject> it) )
> >     \s+
> >     (?P=\GO (?P<subject> broke) )
> >     """
> >
> > The above would allow me to factor out sections of REs and define
> > named, re-ussable RE snippets.
> >
> >
> > Please comment :-)
> >
> 1. Regex syntax is over-rich already.

First, thanks for the reply, John.
Yep, regex syntax is rich, but one of the reasons I went ahead with my
post was that it might add a new way to organize regexps into more
manageable chunks, rather like functions do.

> 2. You may be better off with a parser for this application instead of
> using regexes.

Unfortunately my experience counts against me going for parser solutions
rather than regexps. Although, being a Python user, I always think again
before using a regexp and check whether there might be a clearer
string-method solution to the task, I am not comfortable with
parsers/parser generators. The reason I dismissed parsers this time is
that I have only ever seen parsers for complete languages. I don't want
to write a complete parser for Verilog; I want to take an easier 'good
enough' route that I have used with success since my AWK days. (Don't
laugh, my exposure to AWK after years of C was just as profound as the
feelings of release from Java that more recent speakers have blogged
about after exposure to new dynamic languages - all hail AWK, not
completely put out to stud :-)

I intend to write a large regexp that picks out the things that I want
from a Verilog file, skipping the bits I am uninterested in. With a
regular expression, if I don't write something to match, say, always
blocks, then although signal definitions (which I am interested in)
written inside a task would be picked up along with module-level signal
definitions, that would be 'good enough' for my app.
All the parser examples I see don't 'skip things'. Hell, despite writing
my own interpreted, recursive-descent language many (many...) years ago
in C, too much early lex & yacc'ing about left me with a grudge!

> 3. "\\" is overloaded to the point of collapse already. Using it as an
> argument marker could make the universe implode.

Did I truly write '=\GO'? Twice! Sorry, the example should have used
'=GO' to refer to RE variables. I made, then copied, the error.
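
To spell out the correction: with '=GO' in both places, the first example
is meant to behave like the hand-expanded pattern below, where each
reference to GO is replaced by the body of GO and \1 by the argument
group. This is only a sketch of how I read the proposal, written in
syntax the re module already accepts; the (?P/GO/ ...) definition form
itself is hypothetical and is not supported by re.

```python
import re

# Hand-expanded equivalent of the corrected example. Each use of the
# hypothetical (?P=GO (...)) reference has been replaced by the body of
# GO ("go \s for \s \1"), with \1 replaced by the argument group.
rgexp = r"""(?smx)
    go \s for \s (it)
    \s+
    go \s for \s (broke)
"""

m = re.match(rgexp, "go for it go for broke")
print(m.groups())  # ('it', 'broke')
```

The repeated "go \s for \s" text is exactly the duplication that a named,
re-usable RE variable would factor out.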

Note: I also tried to cut down on extra syntax by re-using the syntax
for referring to named groups (or I would have, if my proofreading were
better).

> 4. You could always use Python to roll your own macro expansion gadget,
> like this:

Thanks for going to the trouble of writing the expander. I too had
thought of that, but it would lead to 'my little RE syntax' that would
be harder to maintain, and others might reinvent the solution but with
their own mini macro syntax.

>
> Cheers,
> John

- Paddy.
--
http://mail.python.org/mailman/listinfo/python-list
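
P.S. For anyone following point 4 without John's code to hand, a
roll-your-own expander could look roughly like the sketch below. It is
not John's code. The <<NAME args>> call marker, the @1 argument
placeholder, and the names MACROS and expand are all made up for
illustration, which is rather the point about everyone growing their own
mini macro syntax.

```python
import re

# Hypothetical macro table: name -> regex body, where @1, @2, ... stand
# for the arguments supplied at the point of use.
MACROS = {"GO": r"go \s for \s (@1)"}

# Hypothetical call syntax inside a pattern: <<NAME arg1, arg2>>
_CALL = re.compile(r"<<(\w+)\s*(.*?)>>")

def expand(pattern, macros=MACROS):
    """Replace every <<NAME args>> in pattern with the macro's body."""
    def substitute(match):
        name, raw_args = match.group(1), match.group(2)
        args = [a.strip() for a in raw_args.split(",")] if raw_args.strip() else []
        body = macros[name]
        # Substitute the highest-numbered placeholder first so that @1
        # cannot clobber the start of @10 and above.
        for i in range(len(args), 0, -1):
            body = body.replace("@%d" % i, args[i - 1])
        return body
    # A function as the replacement is inserted literally, so the
    # backslashes in the macro body need no extra escaping.
    return _CALL.sub(substitute, pattern)

rgexp = expand(r"""(?smx)
    <<GO it>>
    \s+
    <<GO broke>>
""")
print(re.match(rgexp, "go for it go for broke").groups())  # ('it', 'broke')
```

It works, but the marker syntax lives outside the regex language, so every
reader of the pattern has to learn the expander first.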