On Mon, Jun 27, 2011 at 3:57 AM, Reiner Pope <reiner.p...@gmail.com> wrote:

> This makes sense. I haven't really understood the code that implements this,
> though.

Right, in the end I am not sure the current code hits a proper balance
between intricacy and performance.


> I'm interested in making a better LaTeX mode for yi, and for this I need
> better parsing. Some things I'd eventually like are:

Just a few remarks on this ...

> Syntax checking with error reporting. For example, if I'd like to report if
> \frac{}{} is given too few arguments, or if it is used outside maths mode.

That's actually supported by the lazy-incremental parsing, if one
cares to do (a lot of) work on an error-correcting LaTeX parser.

> Reference tracking, like RefTeX in Emacs. To support this, we would need to
> know where all the \label{} macros are in the document, both before and
> after the point.

Agreed, this is badly supported by the lazy-incremental parsing approach.

> it could be significantly faster to cache results somewhere
> in the process

The lazy-incremental approach actually supports caching all results
that can be computed "online" (ie. compositionally). I do believe that
a renderer of equations could be made to behave properly, unless a
global layout algorithm is applied.

> I thought it might be fun to try something along the lines of Edward Kmett's
> talks on monoidal parsing[1], in particular his Bitonic datatype.

I am not sure I could understand enough from that webpage, unfortunately.

> To handle the above, I would design the parser cache along the following
> lines:
> [***]

I can't see a (useful) monoidal structure there. Indeed, you don't
seem to propose to handle it monoidally in the operations you give
below... (Aside: I strongly believe that the parser should not have to
know about window refs)

> If the user has modified a region, then here's how we update the parser cache
> [***]

I have thought about a similar algorithm before, but I could never get
it to work in all cases. Eg. what happens when the user edits the tree
structure itself? (That is, when you have incorrect intermediate
states).

> Update the ContextTree. This can be done quickly: by splitting the tree, we
> can reuse the parts of the tree from both before and after the changes.

> needs to know the window focuses during parsing;

I still don't see why the window has any importance.

> I would like the Highlighter datatype to support a parser like this. The 
> Highlighter datatype is already mostly abstracted from the incremental 
> parsing framework; I'm just looking to go a little bit further.

That makes sense overall. However, as I said before, it is useful to
change that only if one can see concrete uses.

> I'd certainly be interested in your ideas for how I could implement the
> LaTeX parser, and also your general plans for parsers that you mentioned.

Even though the current lazy-incremental parsers are pretty good in
many cases, I agree that they do not blend well with multiple window
editing. Also, it appears to be quite challenging to construct good
error-correcting parsers that preserves the online properties.

Therefore, I think the next approach to explore is that of monoidal
parsers (one could also say "bottom-up" parsers). I already discussed
it a bit wit Edward a couple of years ago, but we did not get to
implement anything together. What's the status of his parsing
framework? As far as I can see, intermediate parsing results should be
stored at each node of a finger tree (maybe the Rope type) and updated
along with the text contained in the tree, which seem to be in
agreement with the stuff you linked to.

In any case, I do believe that the first step would be to integrate
monoidal lexers (eg.
http://blog.sigfpe.com/2009/01/fast-incremental-regular-expression.html).
Ideally this should be done as another backend to Alex, since we have
already the appropriate definitions (.x files).


How to implement the LaTeX parser?  Half Following Edward's ideas,
half making it up myself, and assuming we concentrate on begin/end
aspect of it, I imagine the "building block" (monoid) should be:

* A stack of locally unmatched "\end"s.
* A body consisting of a fully parsed structure
* A stack of locally unmatched "\begin"s


* The unit of the monoid is the empty structure.
* The append should match the ends of the left argument with the
begins of the right one, and build up the body in consequence.
Leftover begins/ends just accumulate.

If the begins don't happen to match the ends, then error-correction
must be performed, and a number of possibilities open. However a
trivial algorithm can be applied if one does not care too much about
minimally failing error-states. (eg. just report unmatching pairs).

To handle references, another triple must be added to the structure:

* Provided labels
* Resolved references
* (locally) unresolved references.

Note that there is no need whatsoever to know about windows, because
the parsing information is constantly kept in sync with buffers.
Rendering at a given window is not problematic because the finger tree
retains (eagerly computed) position information.

> In fact, I
> tested this by moving the calls to 'pureM clearAllSyntaxAndHideSelection'
> and 'pureM focusAllSyntax' next to each other, and nothing appears to have
> been broken.

This change appears to be harmless indeed.

Cheers,
JP.

-- 
Yi development mailing list
yi-devel@googlegroups.com
http://groups.google.com/group/yi-devel

Reply via email to