On Apr 16, 2:50 am, Thomas Dybdahl Ahle <[EMAIL PROTECTED]> wrote: > Hi, I'm writing a program with a large data stream to which modules can > connect using regular expressions. > > Now I'd like to not have to test all expressions every time I get a line, > as most of the time, one of them having a match means none of the others > can have so. > > But ofcource there are also cases where a regular expression can > "contain" another expression, like in: > "^strange line (\w+) and (\w+)$" and "^strange line (\w+) (?:.*?)$" in > which case I'd like to first test the seccond and only if it mathces test > the seccond. > > Do anybody know if such a test is possible? > if exp0.contains(exp1): ...
What you want is finding if R2 is a superset of R1 for two given regular languages R1 and R2. I know of some methods for finding intersection of two regular languages; and I think the time/space complexity is big. So the simple answer is it is not feasible to provide such support for two generic r.e.s without a large time/space usage. You may consult any of the math/theory groups for more insights. If you know already R2 >= R1 (that is you precompute and remember), then it's a trivial to skip checking for R1 if R2 turned up negative. You can even arrange all the Rs in a binary tree like fashion and skip checking a whole subtree if the sub-tree's root node gave negative for r.e. match. Karthik -- http://mail.python.org/mailman/listinfo/python-list