Re: "Eating" comments: not with Flex but with Bison

2005-06-14 Thread Akim Demaille
>>> "Frans" == Frans Englich <[EMAIL PROTECTED]> writes:

 > I would prefer to do this at the Bison/Parser level because it is
 > convenient: I have access to various information passed to the
 > parse function,

You can easily make that information available to the scanner.  And in
fact, you probably should, to have a clean, pure interface between the two.

 > the YYERROR macro, and the error function.

There is nothing wrong with defining a token, but not allowing it in
the grammar.  Just return it when there is a comment (and you don't
want them), then you'll have the expected result: a parser error.

 > The problem I see if I let Flex return a COMMENT token and add a
 > non-terminal in the Bison grammar to implement the checking, is how
 > to make it play well with the other rules -- the token gets in the
 > way.

Of course you must not try to use the token COMMENT when comments are
valid; just discard them.




___
Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison


Re: "Eating" comments: not with Flex but with Bison

2005-06-14 Thread Frans Englich
On Tuesday 14 June 2005 11:36, Akim Demaille wrote:
> >>> "Frans" == Frans Englich <[EMAIL PROTECTED]> writes:
>  >
>  > I would prefer to do this at the Bison/Parser level because it is
>  > convenient: I have access to various information passed to the
>  > parse function,
>
> You can easily make that information available to the scanner.  And in
> fact, you probably should, to have a clean, pure interface between the two.

OK, I'm not fully following what you mean by "make them available to the
scanner". From what I can tell, in either case I must have patterns that
catch the comments (perhaps that's what you mean by "make them
available"); what I then do, whether or not I return tokens, is another
matter AFAICT.

>
>  > the YYERROR macro, and the error function.
>
> There is nothing wrong with defining a token, but not allowing it in
> the grammar.  Just return it when there is a comment (and you don't
> want them), then you'll have the expected result: a parser error.
>
>  > The problem I see if I let Flex return a COMMENT token and add a
>  > non-terminal in the Bison grammar to implement the checking, is how
>  > to make it play well with the other rules -- the token gets in the
>  > way.
>
> Of course you must not try to use the token COMMENT when comments are
> valid; just discard them.

Again I'm a bit confused by the wording; my thinking is that tokens must
first exist before they can be discarded. Perhaps by "discarding them" you
mean that the scanner never returns them at all.

So, to see if I've understood you correctly, you suggest that:

1. The (Bison) parser has no knowledge of COMMENT tokens (e.g. no rules),
except that it declares the token.

2. The (Flex) scanner has patterns that match the comments, combined with
logic that decides whether or not to return a COMMENT token (returning it
leads to a syntax error, since the parser can't handle it).

Hm, then error handling has been placed in the scanner, which confuses me
with respect to "You can easily make them available to the scanner. And in
fact, you probably should, to have a clean, pure, interface between the
two." (assuming my interpretation is correct).


Cheers,

Frans




Re: "Eating" comments: not with Flex but with Bison

2005-06-14 Thread Kelly Leahy
Frans,

If I understand you correctly, you want to know
whether comments were in the source, but you don't
care about the exact contents of the comments, right?

If this is the case, how much do you need to know
about the comments?  Do you need, for instance, the
line number on which they appeared, and the location,
or just that they were there?

You said that it is an error to have comments in some
of your source files.  How do you want the error
message to look?  Should it just say "comments not
allowed in xxx source", or should it output one error
message for each comment, along with location info as
mentioned above?

If all you need to do is know whether comments existed
(anywhere) in the source, just have your lexer set a
flag (for every comment it sees) and never reset this
flag for a given source file.  Then, when you are
finished parsing, raise an error about the comments.
If a full parse is too expensive to run before reporting,
add a check in yyparse that rejects the parse at the first
sign of this flag being set.
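
A minimal sketch of the flag approach, in plain C so it can be compiled on
its own.  Everything here is an assumption made for illustration: the
hand-written next_token() stands in for a Flex scanner, '#' to end of line
stands in for the comment syntax, and parse() stands in for yyparse().  A
global flag is used only for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical token codes; a real project gets these from Bison. */
enum { TOK_EOF = 0, TOK_WORD = 258 };

/* Set by the scanner whenever it skips a comment; not reset within a file. */
static bool saw_comment = false;

/* Stand-in for a Flex scanner: words separated by blanks or newlines,
   '#' starts a comment that runs to the end of the line. */
static int next_token(const char **p)
{
    assert(p != NULL && *p != NULL);
    for (;;) {
        while (**p == ' ' || **p == '\n')
            ++*p;
        if (**p == '#') {               /* comment: record it, then discard */
            saw_comment = true;
            while (**p != '\0' && **p != '\n')
                ++*p;
            continue;
        }
        if (**p == '\0')
            return TOK_EOF;
        while (**p != '\0' && **p != ' ' && **p != '\n' && **p != '#')
            ++*p;
        return TOK_WORD;
    }
}

/* Stand-in for the parse: consume every token, then check the flag.
   Returns the number of words, or -1 if a forbidden comment was seen. */
static int parse(const char *src, bool comments_allowed)
{
    int words = 0;
    saw_comment = false;                /* reset once per source file */
    while (next_token(&src) != TOK_EOF)
        ++words;
    if (saw_comment && !comments_allowed) {
        fprintf(stderr, "comments not allowed in this source\n");
        return -1;
    }
    return words;
}
```

With this sketch, parse("a # note\nb", true) counts two words, while the
same input with comments_allowed set to false is rejected after the scan.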

If you need to know where the comments occurred, you
can do this in one of several ways.  You could, for
instance, keep some sort of list of the comments and
their locations in the lexer (accessible to the
parser), or you could add comment information to your
token type and wrap your yylex function with another
function that is used by the parser instead of yylex. 
This function can call yylex and if it gets your
COMMENT token, push it on a local stack.  Then, call
yylex again.  Another COMMENT, push it again.  Keep
going until you get a non-COMMENT, and then attach the
stack to the non-COMMENT node and pass it back.
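
The wrapper idea could look roughly like the following sketch, again in
plain C.  The names (struct token, raw_lex, lex_with_comments, MAX_PENDING)
are hypothetical, and the token is returned by value for simplicity; with a
real Bison parser the function would return an int and pass the attached
comments through the semantic value instead.  Only the line number of each
comment is kept here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes; a real project gets these from Bison. */
enum { TOK_EOF = 0, TOK_WORD = 258, TOK_COMMENT = 259 };

#define MAX_PENDING 16   /* arbitrary cap for this sketch */

/* Hypothetical token type that carries the comments preceding it. */
struct token {
    int kind;
    int line;
    int ncomments;
    int comment_lines[MAX_PENDING];
};

/* Stand-in for the raw yylex(): replays a fixed token sequence,
   which must end with a TOK_EOF entry. */
struct raw_source {
    const struct token *toks;
    size_t pos;
};

static struct token raw_lex(struct raw_source *src)
{
    assert(src != NULL);
    struct token t = src->toks[src->pos];
    if (t.kind != TOK_EOF)
        ++src->pos;                     /* stay on EOF once it is reached */
    return t;
}

/* What the parser would call instead of the raw scanner: collect every
   COMMENT, then attach the collection to the next non-COMMENT token. */
static struct token lex_with_comments(struct raw_source *src)
{
    int pending[MAX_PENDING];
    int n = 0;
    struct token t = raw_lex(src);
    while (t.kind == TOK_COMMENT) {
        if (n < MAX_PENDING)            /* extra comments are dropped */
            pending[n++] = t.line;
        t = raw_lex(src);
    }
    t.ncomments = n;
    for (int i = 0; i < n; ++i)
        t.comment_lines[i] = pending[i];
    return t;
}
```

The parser then never sees a COMMENT token, so the grammar needs no rules
for it, yet each token still knows which comments came before it.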

Kelly





Re: "Eating" comments: not with Flex but with Bison

2005-06-14 Thread Tim Van Holder
Frans Englich wrote:
> On Tuesday 14 June 2005 11:36, Akim Demaille wrote:
>> >>> "Frans" == Frans Englich <[EMAIL PROTECTED]> writes:
>>
>>  > I would prefer to do this at the Bison/Parser level because it is
>>  > convenient: I have access to various information passed to the
>>  > parse function,
>>
>> You can easily make them available to the scanner.  And in fact, you
>> probably should, to have a clean, pure, interface between the two.
>
> OK, I'm not fully following what you mean by "make them available to the
> scanner".
> [snip]
> Hm, then error handling has been placed in the scanner, which confuses me
> with respect to "You can easily make them available to the scanner. And in
> fact, you probably should, to have a clean, pure, interface between the
> two." (assuming my interpretation is correct).

What Akim meant is that whatever information you pass to the parser
should also be passed down to the lexer.

Example:

=> in parserctx.h:

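#include <stdbool.h>  /* for bool when compiling as C */

struct my_parse_context {
  bool comments_allowed;  /* false: a comment in the input is an error */
};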

=> parser invocation:
...
struct my_parse_context pc;
pc.comments_allowed = false;  /* or true, depending on the source */
fooparse(&pc);
...

=> in foo.y:

%{
...
#include "parserctx.h"
...
%}

%parse-param {struct my_parse_context* context}
%lex-param {struct my_parse_context* context}

...

=> in foo.l:

%{
...
#include "parserctx.h"
#define YY_DECL int yylex(struct my_parse_context* context)
...
%}

...

%%

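{COMMENT} {
  if (!context->comments_allowed) {
    /* REJECT hands the text to the next-best matching rule;
       alternative: yyerror() and exit() if this is a fatal problem */
    REJECT;
  }
  /* otherwise the comment is silently discarded */
}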



This way it becomes easy to pass information from the caller to parser &
lexer, and/or between lexer and parser.  It also avoids using global
variables for such purposes, which helps keep things thread-safe.

