O.K. Mark. Since you seem to accept the basic requirement to build an *external* DSL, I can provide some help. I'm the author of EasyExtend (EE), which is a system for building external DSLs for Python.
http://www.fiber-space.de/EasyExtend/doc/EE.html

EE is very much a work in progress, and over the last year I was more engaged with increasing its power than with enhancing its accessibility for beginners. So be warned.

A DSL in EE is called a *langlet*. Download the EE package and import it in a Python shell. A langlet can then be built this way:

    >>> import EasyExtend
    >>> EasyExtend.new_langlet("my_langlet", prompt = "myl> ", source_ext = ".dsl")

This creates a bunch of files in the directory

    <site-packages-path>/EasyExtend/langlets/my_langlet

Among them are run_my_langlet.py and langlet.py. You can cd to the directory and run

    $ python run_my_langlet.py

which opens a console with the prompt 'myl>'. Each langlet is immediately interactive. A user can also run a langlet-specific module with

    $ python run_my_langlet.py mod.dsl

where .dsl is the suffix passed to the langlet builder function. Each module xxx.dsl can be imported from other modules of the my_langlet langlet. EE provides a generic import hook for user-defined suffixes.

In order to do anything meaningful one has to implement langlet transformations in the langlet.py module. The main transformations are defined in a class called LangletTransformer. It defines a set of visitor methods that are marked by a decorator called @trans. Each @trans method is named after a terminal or non-terminal in a grammar file and responds to the corresponding node of the parse tree as it is traversed.

The structure of the parse tree is the same as that of the trees you'd get from Python's builtin parser. It is entirely determined by 4 files:

- Grammar, which is precisely the Python grammar found in the Python source distribution.
- Grammar.ext, which defines new non-terminals and overwrites old ones.
- Token, which defines Python's tokens.
- Token.ext, which is the analog of Grammar.ext for token definitions.

The Grammar.ext file is in the directory my_langlet/parsedef. There is also an analogous lexdef directory for Token.ext.

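To make the visitor idea above a bit more concrete, here is a minimal sketch in plain Python. It is not EE's code and does not use EE's API: the decorator `trans`, the class `SketchTransformer`, the nested-list tree layout and the token names LPAR/RPAR are all assumptions made up for illustration. It only shows the general pattern of decorated methods that are named after grammar nodes and get called for the matching nodes during traversal.

```python
# Illustrative sketch only: none of these names are EasyExtend's real API.

def trans(method):
    """Mark a method as a node transformation (hypothetical stand-in for @trans)."""
    method.is_transformation = True
    return method


class SketchTransformer(object):
    """Walks a nested-list parse tree and dispatches on the node name."""

    def run(self, node):
        # Assumed layout: leaves are (token_name, text) tuples, inner nodes
        # are lists of the form [rule_name, child, child, ...].
        if isinstance(node, tuple):
            return node
        name, children = node[0], node[1:]
        new_node = [name] + [self.run(child) for child in children]
        # Call the method named after the node, if it is marked with @trans.
        handler = getattr(self, name, None)
        if handler is not None and getattr(handler, "is_transformation", False):
            return handler(new_node)
        return new_node

    @trans
    def trailer(self, node):
        # One possible rewrite: turn a bare NAME/NUMBER/STRING trailer into
        # a parenthesised call trailer that plain Python would accept.
        children = node[1:]
        if len(children) == 1 and children[0][0] in ("NAME", "NUMBER", "STRING"):
            return ["trailer", ("LPAR", "("), ["arglist", children[0]], ("RPAR", ")")]
        return node


tree = ["power",
        ["atom", ("NAME", "navigate_to")],
        ["trailer", ("STRING", "'www.openstreetmap.org'")]]
result = SketchTransformer().run(tree)
```

In this sketch the handler runs after the children of a node have been processed, so transformations are applied bottom-up. Whether EE traverses in the same order is not something this example tries to answer.
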
A possible Grammar.ext extension that overwrites two non-terminals of the Python grammar looks like this:

Grammar.ext
-----------
trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME | NAME | NUMBER | STRING
atom: ('(' [yield_expr|testlist_gexp] ')' |
       '[' [listmaker] ']' |
       '{' [dictmaker] '}' |
       '`' testlist1 '`' |
       NAME | NUMBER | STRING)
-----------

Once this has been defined you can start a new my_langlet session and type

    myl> navigate_to 'www.openstreetmap.org' website
    Traceback (most recent call last):
      File "C:\lang\Python25\lib\site-packages\EasyExtend\eeconsole.py", line 270, in compile_cst
        _code = compile(src,"<input>","single", COMPILER_FLAGS)
      File "<input>", line 1
        navigate_to 'www.openstreetmap.org' website
                                          ^
    SyntaxError: invalid syntax
    myl>

It raises a SyntaxError, but notice that this error stems from the *compiler*, not the parser. The parser perfectly accepts the modified non-terminals and produces a parse tree. This parse tree has to be transformed into a valid Python parse tree that can be accepted by Python's bytecode compiler.

I'm not going into detail here but recommend reading the tutorial

http://www.fiber-space.de/EasyExtend/doc/tutorial/EETutorial.html

which walks through a complete example that defines a few terminals and non-terminals and the corresponding transformations. It also shows how to use command line options to display parse tree properties, unparse parse trees back into source code (so you can eliminate DSL code from the code base entirely and replace it with equivalent Python code), do some validation on transformed parse trees, etc.
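The failing step in the traceback above is an ordinary call to Python's builtin compile(), so it can be reproduced without EE. The sketch below feeds the DSL line to compile() and then does the same with a plain-Python spelling. The helper `try_compile` and the call form `navigate_to('www.openstreetmap.org', website)` are assumptions for illustration only. The latter is just one conceivable target for a transformation, not what EE or the tutorial produces, and the COMPILER_FLAGS argument from the traceback is left out.

```python
def try_compile(src):
    """Return None if Python's compiler accepts src, else the SyntaxError."""
    try:
        compile(src, "<input>", "single")
    except SyntaxError as exc:
        return exc
    return None


# The DSL line as typed at the langlet prompt.
dsl_line = "navigate_to 'www.openstreetmap.org' website\n"

# Hypothetical result of a langlet transformation (an assumption for this
# sketch): the same words spelled as an ordinary Python call.
python_line = "navigate_to('www.openstreetmap.org', website)\n"

dsl_error = try_compile(dsl_line)        # plain Python rejects the DSL syntax
python_error = try_compile(python_line)  # the rewritten form compiles

print("DSL line:    %r" % (dsl_error,))
print("Python line: %r" % (python_error,))
```

Only compilation is checked here. The names navigate_to and website are never looked up, so nothing needs to be defined for the second line to compile.
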