Yes, there is literature on the generating side of the regular expression/FSM model. In fact, the matching problem and the generating problems are exactly equivalent. A slight variation of the definition of how a matcher works, turns it into a generator and vice versa. To directly generate (rather than match) from an FSM, one simply does a walk (e.g. depth first search or breath first search) over the machine writing out (rather than matching) the symbols used as labels. Thus, all the theorems about matching apply to generating and also in reverse.
You can see this to some extent form the problem posed. If one generates Sigma* and subtracts out the elements from some regular language (say by matching them), that is exactly equivalent (in strings generated) to generating the complement of the regular language. In fact, it is quite easy (with the correct regular expression tool, i.e. one that handles regular expression differences) to take the problems posed and generate provably minimal (i.e. provably maximally efficient) generation (or matching) programs as FSMs. The provably minimal FSM won't go down any paths that don't have some sequence that generates an "accept" value. It is worth noting the regular languages are closed under the difference operator, so that resulting language from substracting one regular expression from another is still a regular language, which can be used to prove that the machine is minimal. Therefore, while the output can be exponentially larger than the input, one should expect that implementations should be able to generate the output in linear time (relative to the size of the output), since FSMs run in linear time relative to the string they are processing (whether generating or matching). Under a reasonable model of computation, one cannot do better than linear in the size of string processed. I'm sure if you asked under comp.theory, you would get tons of other relevant facts from someone who understands automata theory better than I. Note, if one does ask there, one should correct the notation. The "*" symbol was used as in globbing, not as commonly used in regular expressions. The notation ".*" (as someone corrected in one of their replies) is the normal notation for what the original poster wanted by "*". Hope this helps, -Chris ***************************************************************************** Chris Clark Internet : [EMAIL PROTECTED] Compiler Resources, Inc. Web Site : http://world.std.com/~compres 23 Bailey Rd voice : (508) 435-5016 Berlin, MA 01503 USA fax : (978) 838-0263 (24 hours) ------------------------------------------------------------------------------ -- http://mail.python.org/mailman/listinfo/python-list