Hi Andrew,

On 17/11/2017 at 12:26, Prof. Andrew P. Black wrote:

On 17 Nov 2017, at 14:10 , Thierry Goubier <thierry.goub...@gmail.com> wrote:


there is an 'E O F' token generated by SmaCC; I haven't tried to use it in a 
parser yet.

I tried patching the tokenActions table to trap on this, but the token id for E 
O F is outside of the range of the table.   The Python example that you pointed 
me to is a little different.  It overrides scannerError, and explicitly adds a 
newline token if there is an error at the end of the file.  It doesn’t actually 
use the E O F token, but it is probably a pattern that I can steal.

In all honesty, I wasn't thinking of that, but rather of being able to write '<eof>' in the grammar itself to terminate statements.

The Python approach is necessary because you may have to emit additional dedent tokens at the end of a file (this is a typical issue with languages where indentation whitespace is meaningful: an idea used in the very beginning of programming languages, then considered harmful, then coming back again...).
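To make the end-of-file issue concrete, here is a minimal sketch in plain Python (not SmaCC code, and not the Python example from the SmaCC distribution). The function name `tokenize_indentation` and the token kinds are hypothetical, chosen only for illustration. It shows a scanner that must close every open block with a dedent token when the input ends, and that also supplies a final newline token even when the file does not end with one.

```python
def tokenize_indentation(source):
    """Yield (kind, value) tokens for a toy indentation-sensitive language.

    Hypothetical sketch: spaces only, and inconsistent dedents are not
    reported as errors.
    """
    indents = [0]  # stack of currently open indentation widths
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines do not affect block structure
        width = len(line) - len(line.lstrip(" "))
        if width > indents[-1]:
            indents.append(width)
            yield ("INDENT", width)
        while width < indents[-1]:
            indents.pop()
            yield ("DEDENT", width)
        yield ("LINE", stripped)
        # Emitted for every line, including a last line with no "\n".
        yield ("NEWLINE", "")
    # End of input: close every block that is still open, then signal EOF.
    while len(indents) > 1:
        indents.pop()
        yield ("DEDENT", 0)
    yield ("EOF", "")


if __name__ == "__main__":
    for token in tokenize_indentation("if x:\n    a\n    if y:\n        b"):
        print(token)
```

The loop after the last line is the part that cannot be expressed as an ordinary token rule: the dedents correspond to no characters in the input, so the scanner has to synthesize them when it reaches the end of the file.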


In the meantime, I made the final StatementSeparator (<newline> or ";") 
optional in all the productions.  The grammar is a bit ugly, but the parser is cleaner.
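As an illustration of what "optional final separator" means, here is a small sketch in plain Python. It is not SmaCC grammar syntax, and the names `SEPARATORS` and `parse_statements` are hypothetical. It assumes a toy rule of the shape Statements ::= Statement (Separator Statement)* Separator?, where any non-separator token stands in for a whole statement.

```python
SEPARATORS = {"NEWLINE", ";"}  # stand-ins for <newline> and ";"


def parse_statements(tokens):
    """Parse a toy statement list whose final separator is optional.

    Hypothetical sketch: each non-separator token counts as one statement.
    """
    statements = []
    pos = 0
    while pos < len(tokens):
        if tokens[pos] in SEPARATORS:
            raise SyntaxError(f"unexpected separator at position {pos}")
        statements.append(tokens[pos])
        pos += 1
        if pos == len(tokens):
            break  # final separator omitted: still accepted
        if tokens[pos] not in SEPARATORS:
            raise SyntaxError(f"expected separator at position {pos}")
        pos += 1  # consume the separator; it may also be the last token
    return statements


if __name__ == "__main__":
    print(parse_statements(["a", ";", "b"]))             # no trailing separator
    print(parse_statements(["a", ";", "b", "NEWLINE"]))  # trailing separator
```

Both inputs yield the same statement list, so a file that ends without a newline parses the same way as one that ends with it, and no end-of-file handling is needed in the scanner.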

Which is the cleanest way to do it (at least that way you have a documented workaround in the grammar, instead of carrying around a grammar plus hacks in the scanner)(*)

I also gave up trying to eliminate intermediate parseTree nodes.  Instead, I 
eliminated intermediate productions from the grammar.  This makes the grammar 
uglier (it has several repetitions where I inlined the intermediate 
productions), but the tree construction is a lot more straightforward.

Sorry for having been unable to answer your questions on that :( I'm happy to learn you've found a way around it.

Thierry

(*) Which is still way better than a hand-written, recursive descent parser where any line can hide a hack...

        Andrew




