Over the last couple of years, I've built a module called rex that sits on top of (and, from the user's point of view, hides) the re module. rex offers the following advantages over re:

* Construction of re's is object oriented, and does not require any knowledge of re syntax.
* rex is _very good_ at combining small re's into larger re's. In fact, this sort of composition is almost effortless. It greatly simplifies the creation of complex re's, since you can build smaller re's, build test code for them, compose them, build test code for the composition, compose again, etc.
* Reading and understanding re's built with rex is much easier than understanding primitive re's.
* No re metacharacters are used by rex. For example, a pattern to match any character except a, b, or c is just 'not CHAR("abc")' rather than [^abc]. A pattern to recognize any of a, b, c, or "^" is CHAR("^abc").
* Many useful predefined patterns are included. For example, PAT.float matches floating point numbers, so a simple pattern to recognize complex numbers is PAT.float + ALT("+", "-") + PAT.float + "i".
* The match result object returned by rex is much more flexible, and has many more functions, than the one returned by re. Commonly performed operations on match results can often be done with much less (and much clearer) code than with re.

I had hoped to polish this for a 'true' 1.0 release, but it's become obvious to me that I won't get to do this in the foreseeable future. Here is the current status of the project.

* rex is already highly functional. I use it all the time, and I have had very few bugs emerge. The testing code is fairly comprehensive, and every time I find a bug, I add another test case. I haven't used pure re's in over a year and a half.
* rex supports almost all re functionality. Backrefs and a couple of the new re features added in (I think) Python 2.4 are not yet supported, but should be easy to put in.
* There are some other very useful functions I've partially implemented, but not finished to the point where they can be used. Finishing them should be quite easy; I just haven't had the need.
* I'm not entirely sure the API is ideal. Some discussion is needed on this.
* Internal documentation is decent, but not great.
* Internal code is, again, decent but not great.
* User's documentation is not bad, but somewhat out of date. One of the problems here is that rex makes use of a lot of constants, which cannot be documented using Python's docstrings. In addition, rex is complex enough that it needs an external manual or a good HTML reference, and none of the multiple attempts at pure Python doc systems do this well for everything that is needed. Now that Doxygen works with Python, I would like to redo all of the documentation in Doxygen.
* Everything is in a single file. This should be split up.

I would like to avoid putting this up on SourceForge, as I think it would do much better at a site aimed specifically at Python development.

Given the above, is this a project people might be interested in working on? And where should I create the project?

Thanks,
Ken
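
P.S. For comparison, here is roughly what the examples above look like when written directly against the standard re module. This is only a sketch for readers weighing the two styles: the FLOAT pattern is an illustrative approximation and is not the actual definition of PAT.float, and the names NOT_ABC, ABC_OR_CARET, FLOAT, and COMPLEX are made up for this example.

```python
import re

# Raw-re counterparts of the CHAR examples.
NOT_ABC = re.compile(r"[^abc]")        # any single character except a, b, or c
ABC_OR_CARET = re.compile(r"[abc^]")   # a, b, c, or a literal "^" (not first, so not a negation)

# Illustrative float pattern (an assumption, NOT rex's PAT.float):
# optional sign, then digits with an optional fraction or a bare fraction,
# then an optional exponent.
FLOAT = r"[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?"

# Mirrors: PAT.float + ALT("+", "-") + PAT.float + "i"
COMPLEX = re.compile(FLOAT + r"[+-]" + FLOAT + "i")

if __name__ == "__main__":
    for text in ("3.5-2e3i", "1+2i", "abc"):
        # bool(...) is True when the whole string is a complex number
        print(text, bool(COMPLEX.fullmatch(text)))
```

Even in this small case, the raw version depends on knowing where "^" is and is not special inside brackets, which is the kind of detail the metacharacter-free style is meant to hide.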