> Lars> Sure it can. "^[ \t]*([12][0-9][0-9])[ \t]+\"([^ ]+)\"[ \t]+.*"
>
> Wouldn't something like
> "^[ \t]*([12][0-9][0-9])[ \t]+\"(.*)\"[ \t]*$"
> be better? It avoids junk at the end, does not force a space after the
> final ".

You are misusing regular expressions to solve a semantic problem.
It's much better to have a simpler regular expression that captures
only the syntax, and then use a semantic routine to check that the
values it captured make sense.

To illustrate, an alternative regexp could look like:
"^[ \t]*(0|1|2|3|4|5|6|7|8|9|10|11|...|255) ..."
With this regexp it's possible to capture exactly the
values we accept, but it's obviously a misuse of
regular expressions.
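
A rough sketch of the split I mean, in Python purely for illustration.
The names LINE_RE and parse_line are made up, and the 0..255 range is
an assumption taken from the example above, not something checked
against the LyX sources. The pattern only checks the shape of the
line; the range check is ordinary code.

```python
import re

# Syntax only: optional indent, a run of digits, whitespace, a quoted string.
LINE_RE = re.compile(r'^[ \t]*([0-9]+)[ \t]+"(.*)"[ \t]*$')


def parse_line(line, low=0, high=255):
    """Return (number, text), or None if the line is malformed or out of range.

    The low/high bounds are illustrative assumptions.
    """
    m = LINE_RE.match(line)
    if m is None:
        return None  # syntax error: the line does not have the expected shape
    number = int(m.group(1))
    if not low <= number <= high:
        return None  # semantic error: well-formed, but the value is not accepted
    return number, m.group(2)
```

Changing the accepted range then means touching one comparison rather
than rewriting the pattern.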

Personally, I agree with Jean-Marc that it's better to
use a lexer for this task. The lexer is part of the
parsing effort, and it's a mistake to use regexps as
a substitute for it.

Maybe LyXLex is a lousy lexer, but the right solution
would be to fix LyXLex until it is good, and keep using
a lexer.
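
To make the lexer point concrete, here is a minimal hand-written
tokenizer, again in Python and again only a sketch. It is not LyXLex
and does not mirror its interface; tokenize and the token names
NUMBER and STRING are hypothetical.

```python
DIGITS = "0123456789"


def tokenize(line):
    """Yield (kind, value) tokens: ("NUMBER", int) or ("STRING", str).

    Raises ValueError on an unterminated string or an unexpected character.
    """
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if ch in " \t":
            i += 1  # skip whitespace between tokens
        elif ch in DIGITS:
            j = i
            while j < n and line[j] in DIGITS:
                j += 1
            yield ("NUMBER", int(line[i:j]))
            i = j
        elif ch == '"':
            j = line.find('"', i + 1)  # closing quote; no escape handling
            if j < 0:
                raise ValueError("unterminated string")
            yield ("STRING", line[i + 1:j])
            i = j + 1
        else:
            raise ValueError(f"unexpected character {ch!r}")
```

The parser then looks at the token sequence (a NUMBER followed by a
STRING, nothing after it) and does the range check itself, so trailing
junk and spacing stop being regexp problems.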
Greets,
Asger