Jean-Marc Lasgouttes wrote:
>>>>>> "Angus" == Angus Leeming <[EMAIL PROTECTED]> writes:
>
> Angus> Not convincingly. I have been looking at this code hard over
> Angus> the w/e in an effort to understand it. I'll get back to you
> Angus> when I do. Mean while, throw the patch away.
>
> OK.
>
> Angus> Here, however is my current state of knowledge (just for your
> Angus> delectation ;-)
>
> :) It seems to be an interesting mess...

Ok, JMarc. I dug deep enough to understand how it works and have
developed a grudging admiration for it. I append 'lookAheadToken' (with
some extra print statements). The guts of it is this line:

    if ($$in =~ /^(?:\s*)(?:$Text::TeX::commentpattern)?($Text::TeX::tokenpattern)/o) {

This says: if this regex matches '$$in' then...

It could be rewritten as

    if ($$in =~ /^(?:$RE1)(?:$RE2)?($RE3)/) {

which says:
    match $RE1 (leading whitespace) without storing it;
    match $RE2 (a comment), if present, without storing it;
    then match $RE3 and store the match in $1.

The '(?:...)' groups do not capture, so any groups inside $RE3 are
numbered from $2 onwards. $RE1 is '\s*', so it ALWAYS matches, even if
only the empty string.

$RE3 contains our friend $macro.

    $macro = '\\\\(?:\)|((([^a-zA-Z)])|([a-zA-Z]+))\*?)\s*)';
    #             4     321          1 1         12   3   4

This says: if there is a match, throw away the leading '\' and store the
remainder. In this case, it is stored in $3.

$macro will match \X where X is:

case 1: ')' (but not ')*').

case 2: a single char that is not alphabetical (not a-z or A-Z) and not
        ')', followed by a single '*', if present, followed by an
        arbitrary amount of whitespace. Thus, this matches both '\\'
        and '\\*'.

case 3: a multi-char, alphabetical string, followed by a single '*', if
        present, followed by an arbitrary amount of whitespace. Thus,
        this matches both '\section' and '\section*'.

In conclusion, therefore, my patch _is_ safe and does specialise the
test so that '\)' (but not '\)*') is counted as a macro.

Convinced yet? If not, add the three print statements below (the lines
marked '+') to 'lookAheadToken' and run

    $ reLyX -f yourfile.tex | less

You'll see it parsing 'syntax.default'.

Angus

sub lookAheadToken {
    # If arg2, will eat one token - WHY!? -Ak
    my $txt = shift;
    # Call paragraph with no argument to say we're "just looking"
    my $in = $txt->paragraph;
    return '' unless $in; # To be able to match without warnings
    my $comment = undef;
    if ($$in =~ /^(?:\s*)(?:$Text::TeX::commentpattern)?($Text::TeX::tokenpattern)/o) {
+       print "\$1 is '$1'\n" if (defined $1);
+       print "\t\$2 is '$2'\n" if (defined $2);
+       print "\t\$3 is '$3'\n" if (defined $3);
        if (defined $2) {return $1} #if 1 usualtokenclass char, return it ($1==$2)
        elsif (defined $3) {return "\\$3"} # Multiletter (\[a-zA-Z]+)
        elsif (defined $1) {return $1} # \" or notusualtokenclass
    }
    return '';
}
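
For anyone who would rather poke at $macro on its own than run the whole of reLyX, here is a minimal standalone sketch of the three cases described above. It is an illustration only: 'classify' is a hypothetical helper that does not exist in reLyX or Text::TeX, and the pattern is anchored and wrapped in one extra capturing group here, so the capture numbering differs from that inside 'lookAheadToken'.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The $macro pattern from the message above, copied verbatim.
my $macro = '\\\\(?:\)|((([^a-zA-Z)])|([a-zA-Z]+))\*?)\s*)';

# classify() is a hypothetical helper for illustration only.
# It tries $macro at the start of $input and returns a two-element list:
#   - the whole text consumed by the match,
#   - the stored remainder (name plus optional '*'), or undef for '\)'.
# Returns an empty list if $macro does not match at all.
sub classify {
    my ($input) = @_;
    # The extra outer group is $1, so $macro's first group becomes $2.
    return unless $input =~ /^($macro)/;
    return ($1, $2);
}

# Sample inputs: case 1, case 1 followed by '*', case 2 (twice), case 3 (twice).
for my $input ('\)', '\)*', '\\\\', '\\\\*', '\section', '\section*  x') {
    my ($whole, $name) = classify($input);
    printf "%-14s consumed='%s' stored=%s\n", $input, $whole,
        defined $name ? "'$name'" : 'undef';
}
```

Under these assumptions, '\)*' should consume only '\)' and store nothing, leaving the '*' for the next token, whereas '\\*' and '\section*' should store the '*' along with the name.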