On Sun, Dec 17, 2000 at 02:11:50PM -0700, Nathan Torkington wrote:
> Nicholas Clark writes:
> > Would it be sane to get the parser to return suitable information on the
> > source to let a syntax analyser (such as a highlighting editor) know that
> > character positions 5123 to 5146 are a qq() string (So it can change the
> > font or the colour or whatever)
> 
> I think the problems with this that were raised in the past are:
>  * parsing partial source

If I understood Simon's suggestion, making the parser take truncated
"source" (i.e. start to arbitrary point in the "middle") was as simple as
changing the tokeniser's reaction to end-of-file.

When I was thinking about this for prompting the user as part of multi-line
entry to an interactive perl shell, I wasn't thinking that the shell should
concatenate the lines of the script and send progressively longer
start-to-middle scripts until the parser is happy that "middle" is legal as
end. Instead, I was actually assuming that the parser would return a
complete opaque parser-state object, plus a hint of what was expected (if
asked, for the interactive user) and that next time round the parser is
re-started with its state object and only the next bit of script.
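
A minimal sketch of that restartable interface, in Python rather than
anything from perl's own sources. Every name here (ParserState, feed) is
hypothetical, and the "grammar" is only bracket and quote matching, which is
just enough to show a state object going in and a new state plus a hint
coming out. Mismatched closers are silently ignored.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical names: nothing here is a real perl API.
CLOSERS = {"(": ")", "[": "]", "{": "}"}


@dataclass(frozen=True)
class ParserState:
    """Everything the toy parser needs to resume; opaque to the caller."""
    open_stack: Tuple[str, ...] = ()  # closers still awaited, innermost last
    in_string: Optional[str] = None   # the quote character, if inside a string
    escaped: bool = False             # previous char was a backslash in a string


def feed(state: ParserState, chunk: str) -> Tuple[ParserState, Optional[str]]:
    """Consume only `chunk`, resuming from `state`.

    Returns the new state and a hint of what is still expected,
    or None when the input so far could legally end here.
    """
    stack = list(state.open_stack)
    quote = state.in_string
    escaped = state.escaped
    for ch in chunk:
        if quote is not None:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch in CLOSERS:
            stack.append(CLOSERS[ch])
        elif stack and ch == stack[-1]:
            stack.pop()
    new_state = ParserState(tuple(stack), quote, escaped)
    if quote is not None:
        hint = "closing " + quote
    elif stack:
        hint = "closing " + stack[-1]
    else:
        hint = None
    return new_state, hint


# An interactive shell would loop like this, one line at a time:
state = ParserState()
state, hint = feed(state, 'print("hello, ')   # hint: still inside a string
state, hint = feed(state, 'world"')           # hint: parenthesis still open
state, hint = feed(state, ');')               # hint is None: statement complete
```

The shell never re-sends earlier lines; it only keeps the state object and
uses the hint to choose its continuation prompt.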

>  * does this mean that the parser has to reparse the whole sourcefile
>    every time you type a character?

I think if the parser can do the above (take an opaque state structure,
but hold no internal state) and perl provides API calls to copy the state
structure (and destroy copies) you don't need to reparse the whole
sourcefile.

Instead, the editor (say) parses the partial source file to some point (say
just above the file position visible in the user's screen).  It holds onto
this partial state (partial-state-to-known-good-point), makes a copy, and
feeds this copy back to the parser along with just the partial source
visible on the user's screen, which the editor then uses for its display.
If the user types a character (or some small localised edit) the editor
just takes another copy of the partial-state-to-known-good-point and
re-runs the parser with that and the (now changed) visible source.
If the user moves outside this region, the editor has to do the hard work
of finding another known good point (downwards is easier - just carry on
and save a later point. upwards, and you have to start from the beginning).
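
The checkpoint-and-copy scheme might look roughly like this. It is a Python
sketch under loud assumptions: State, parse and highlight are made-up names,
the "parser" only recognises double-quoted strings (no escapes), and
copy.copy stands in for whatever state-copying API call perl would provide.

```python
import copy
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class State:
    """Hypothetical opaque parser state."""
    offset: int = 0                     # absolute position of the next character
    string_start: Optional[int] = None  # where the currently open string began


def parse(state: State, text: str) -> List[Tuple[int, int]]:
    """Consume `text`, updating `state` in place.

    Returns (start, end) absolute positions of each double-quoted
    string completed within `text`. Escapes are ignored in this toy.
    """
    spans = []
    for ch in text:
        if ch == '"':
            if state.string_start is None:
                state.string_start = state.offset
            else:
                spans.append((state.string_start, state.offset))
                state.string_start = None
        state.offset += 1
    return spans


def highlight(checkpoint: State, visible: str) -> List[Tuple[int, int]]:
    """Re-parse only the visible text, working on a copy of the checkpoint."""
    return parse(copy.copy(checkpoint), visible)


# Hard work, done once: parse everything above the visible region.
checkpoint = State()
parse(checkpoint, 'my $x = "a";\n')

# Cheap work, done on every keystroke: copy the checkpoint, parse the screen.
before_edit = highlight(checkpoint, 'print "hi";')
after_edit = highlight(checkpoint, 'print "hi there";')

# Scrolling down: carry on from a copy and keep it as a later checkpoint.
later = copy.copy(checkpoint)
parse(later, 'print "hi there";')
```

The original checkpoint is never modified by highlight, so it stays valid
for as long as the user only edits text below it.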

Of course, everything grinds much more slowly, and this doesn't work, if you have
two views of the same file open and are editing the earlier part of the file
in one, whilst expecting the view of the later part to update in real time.

But I think it would work providing

0: The parser is fast. (How fast is perl5's, in absolute rather than relative terms?)
1: The parser can be fed incremental units of source
2: The parser stores no state internally, and can encapsulate all its state into
   something it can hand back to you
3: Copying the state object is fast compared with re-parsing the script

> I think the better solution is to make Perl just a little more regular
> (as suggested in RFCs about making m in matches mandatory) so that it
> becomes easier to parse.  Of course with
> 
>   $a = \"foo";
> 
> your text editor needs to learn a little more about Perl :-)  Or you

It's emacs. (I will now alienate >50% of readers, who will say "but you
brought it on yourself." :-)  Which was why I avoided mentioning which
editor it was.) It's written in Lisp, which here is :-( because I understand
just about enough Lisp to write a .emacs file. If you've got an embedded perl
parser to syntax highlight your perl, you might as well put the rest in and
let it be used as another editor macro language.

> could write your own little language, so that you would write:
> 
>   $a = reference_to("foo");

emacs is actually happy with \("foo").
There's more than one way to do it.

> Or, better yet, simply write in unambiguous XML:
> 
>   <operator name="=">
>     <operand position=0>
>       <variable type="scalar">a</variable>
>     </operand>
>     <operand position=1>
>       <operator name="reference_to">
>         <operand position=0>
>           <constant type="string">foo</constant>
>         </operand>
>       </operator>
>     </operand>
>   </operator>
> 
> :-)

But as you knew when you suggested them, neither of those last 2 suggestions
is perl to a fluent native speaker. :-(

Nicholas Clark
