Re: [patch] reLyX and \(...\)*

Angus Leeming Mon, 10 Feb 2003 14:36:10 -0800

Jean-Marc Lasgouttes wrote:

>>>>>> "Angus" == Angus Leeming <[EMAIL PROTECTED]> writes:
> 
> Angus> Not convincingly. I have been looking at this code hard over
> Angus> the w/e in an effort to understand it. I'll get back to you
> Angus> when I do. Meanwhile, throw the patch away.
> 
> OK.
> 
> Angus> Here, however, is my current state of knowledge (just for your
> Angus> delectation ;-)
> 
> :) It seems to be an interesting mess...


Ok, JMarc. I dug deep enough to understand how it works and have developed a 
grudging admiration for it. I append 'lookAheadToken' (with some extra print 
statements). The guts of it is this test:

    if ($$in =~ 
        /^(?:\s*)(?:$Text::TeX::commentpattern)?($Text::TeX::tokenpattern)/o) {

This says: if this regex matches '$$in' then...

It could be rewritten as
        if ($$in =~ /^(?:$RE1)(?:$RE2)?($RE3)/) {

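which says: 
        skip whatever $RE1 matches (leading whitespace; the group is 
        non-capturing),
        skip whatever $RE2 matches, if anything (a comment; also 
        non-capturing),
        then match $RE3 and store the match in $1.
Any $2, $3, ... therefore come from groups nested inside $RE3 (assuming the 
comment pattern has no capturing groups of its own).
$RE1 is just \s*, so it ALWAYS matches, even if it matches nothing.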
$RE3 contains our friend $macro.
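
A minimal sketch of how the groups in a pattern of this shape behave. The 
three sub-patterns are stand-ins I made up for illustration (the real ones 
live in Text::TeX), and 'first_token' is a hypothetical helper, not part of 
reLyX:

```perl
use strict;
use warnings;

# Stand-ins only; the real patterns are defined in Text::TeX.
my $RE1 = '\s*';             # leading whitespace
my $RE2 = '%.*\n';           # hypothetical comment pattern: % to end of line
my $RE3 = '\\\\[a-zA-Z]+';   # hypothetical token pattern: \name

# Hypothetical helper: return the first token of a string, or undef.
sub first_token {
    my ($in) = @_;
    # (?:...) groups match but do not capture, so the only group that
    # this outer pattern numbers is the one around $RE3, i.e. $1.
    return $in =~ /^(?:$RE1)(?:$RE2)?($RE3)/ ? $1 : undef;
}

for my $in ("  % a comment\n\\section{Intro}", "\\item text", "plain") {
    my $tok = first_token($in);
    print defined $tok ? "token: $tok\n" : "no token\n";
}
```

The whitespace and the comment are consumed but never stored; only the 
token comes back.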

$macro = '\\\\(?:\)|((([^a-zA-Z)])|([a-zA-Z]+))\*?)\s*)';
#             4     321          1 1         12   3   4

This says, if there is a match throw away the leading '\' and store the 
remainder. In this case, it is stored in $3.

$macro will match \X where X is:
case 1: ')' (but not ')*').
case 2: a single char that is neither alphabetical (a-z or A-Z) nor ')', 
followed by a single '*', if present, followed by an arbitrary amount of 
whitespace.
Thus, this matches both '\\' and '\\*'.
case 3: a multi-char, alphabetical string, followed by a single '*', if 
present, followed by an arbitrary amount of whitespace.
Thus, this matches both '\section' and '\section*'.
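
The three cases can be checked in isolation with a small script. This is 
only a sketch: 'try_macro' is a hypothetical helper of mine, and because 
$macro is used on its own here (inside one extra wrapper group), the 
name-plus-star capture shows up as $2 rather than as the $3 it is in the 
full token pattern.

```perl
use strict;
use warnings;

# $macro as quoted above. In a single-quoted Perl string '\\\\' is two
# characters, which the regex reads as one literal backslash.
my $macro = '\\\\(?:\)|((([^a-zA-Z)])|([a-zA-Z]+))\*?)\s*)';

# Hypothetical helper: returns the text consumed by $macro and the
# "name plus optional star" capture (undef when the \) branch matched).
sub try_macro {
    my ($input) = @_;
    # The extra outer group is $1 here, so $macro's first group becomes $2.
    return $input =~ /^($macro)/ ? ($1, $2) : ();
}

# Inputs: \)  \)*  \\  \\*  \section  \section* (with trailing blanks)
for my $input ('\)', '\)*', '\\\\', '\\\\*', '\section', '\section*  ') {
    my ($whole, $name) = try_macro($input);
    printf "%-14s consumed '%s', captured %s\n",
        "'$input'", $whole, defined $name ? "'$name'" : '(nothing)';
}
```

For '\)*' only '\)' is consumed and nothing is captured, whereas for '\\*' 
and '\section*' the star is part of both the match and the capture.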

In conclusion, therefore, my patch _is_ safe and does specialise the test so 
that '\)' (but not '\)*') is counted as a macro.

Convinced yet? If not, add the three print statements below to 
'lookAheadToken' and run 
$ reLyX -f yourfile.tex | less
You'll see it parsing 'syntax.default'.

Angus

  sub lookAheadToken {          # If arg2, will eat one token - WHY!? -Ak
    my $txt = shift;
    # Call paragraph with no argument to say we're "just looking"
    my $in = $txt->paragraph;
    return '' unless $in;       # To be able to match without warnings
    my $comment = undef;
    if ($$in =~ 
        /^(?:\s*)(?:$Text::TeX::commentpattern)?($Text::TeX::tokenpattern)/o) {
+       print "\$1 is '$1'\n" if (defined $1);
+       print "\t\$2 is '$2'\n" if (defined $2);
+       print "\t\$3 is '$3'\n" if (defined $3);
      if (defined $2) {return $1} #if 1 usualtokenclass char, return it ($1==$2)
      elsif (defined $3) {return "\\$3"} # Multiletter (\[a-zA-Z]+)
      elsif (defined $1) {return $1} # \" or notusualtokenclass
    }
    return '';
  }
