Forcing multiple parse stacks to 'reduce'
All, I have written a parser for C that processes a single statement or declaration at a time. So after each statement/declaration yyparse returns. Originally I used various ad-hoc rules in yylex to figure out which was the last token of a statement/declaration and then returned a zero value as the subsequent token. These ad-hoc rules are complicated (for instance, a ; can occur in a for loop header and a { can occur in a cast) and I had the 'great' idea of allowing the token following the last token of a statement/declaration to 'naturally' cause yyparse to generate the outstanding reductions. So in: w=3; y=5; when yylex returns id (because it has encountered y) the sequence w=3; reduces to statement and yyparse returns (yes, I need to hang onto y in order to correctly parse the next statement). In some cases multiple parse stacks are outstanding, and in most cases the following token causes the expected return to deterministic parsing and subsequent reduction to the accept state. But in some cases it does not. For instance, in typedef x y; by the time the ; is encountered there are three parse stacks. Depending on the token that follows the ; these three parse stacks may or may not be reduced to a single stack (which then reduces to a declaration). In the case: typedef x, y; typedef i, j; the second typedef token is shifted onto all three stacks and subsequent tokens are processed like a declaration (which they do form part of)! So I don't get a parse of a single declaration (in fact yyparse eventually reports an ambiguity). I know that for some grammars the optimizations performed by Bison result in ambiguities being reported (i.e., the BOGUS example in 5.7 of bison.info). In my case no ambiguity is being reported (I am in glr mode after all ;-). But the parse stacks are not being reduced as expected (all three are in the state I would expect them to be in when the second typedef token is encountered). Does anybody have any ideas on how I can force yyparse to consider reductions in the number of parse stacks and thence reductions to the accept state? derek -- Derek M Jones tel: +44 (0) 1252 520 667 Knowledge Software Ltdmailto:[EMAIL PROTECTED] Applications Standards Conformance Testing http://www.knosof.co.uk ___ Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison
Re: Forcing multiple parse stacks to 'reduce'
At 17:03 + 2005/02/28, Derek M Jones wrote: >I have written a parser for C that processes >a single statement or declaration at a time. >So after each statement/declaration yyparse >returns. Bison is clearly not built to handle such applications. The normal thing is to handle the whole language in one go. (How do you handle environments, like "{...}" statement-by-statement?) >Originally I used various ad-hoc rules in yylex >to figure out which was the last token of a >statement/declaration and then returned a zero >value as the subsequent token. > >These ad-hoc rules are complicated (for instance, >a ; can occur in a for loop header and a { can >occur in a cast) and I had the 'great' idea of allowing >the token following the last token of a statement/declaration >to 'naturally' cause yyparse to generate the outstanding >reductions. So in: > >w=3; >y=5; > >when yylex returns id (because it has encountered y) >the sequence w=3; reduces to statement and yyparse >returns (yes, I need to hang onto y in order to correctly >parse the next statement). > >In some cases multiple parse stacks are outstanding, >and in most cases the following token causes the expected >return to deterministic parsing and subsequent reduction to >the accept state. But in some cases it does not. For instance, >in > >typedef x y; > >by the time the ; is encountered there are three parse stacks. >Depending on the token that follows the ; these three parse stacks >may or may not be reduced to a single stack (which then reduces >to a declaration). > >In the case: > >typedef x, y; >typedef i, j; > >the second typedef token is shifted onto all three stacks and >subsequent tokens are processed like a declaration (which they >do form part of)! So I don't get a parse of a single declaration >(in fact yyparse eventually reports an ambiguity). > >I know that for some grammars the optimizations performed by >Bison result in ambiguities being reported (i.e., the BOGUS >example in 5.7 of bison.info). > >In my case no ambiguity is being reported (I am in glr mode >after all ;-). But the parse stacks are not being reduced as >expected (all three are in the state I would expect them to be >in when the second typedef token is encountered). > >Does anybody have any ideas on how I can force yyparse >to consider reductions in the number of parse stacks and >thence reductions to the accept state? In general, there is none, it seems. The idea of the GLR is that reduce/reduce conflicts may split the deterministic parser, until the ambiguity is resolved later in the parsing. The manual mentions the use of %dprec to choose between different actions, but that seems to be it. (I am not using %glr myself, so I am guessing here.) You might try to rewrite your input grammar so that when the ";" of a typedef is encountered, it reduces. Alternatively, you might try an entirely new approach to your whole program setup. You have not told us why you need this feature, to halt after each statement. Is it imperative that the parser halts right after the statement is finished? If not, assuming that the you have written a correct parser, you might attempt to halt it in the actions. The feature is similar to that of so called coroutines (part of Simula, for example). Coroutines can be implemented using threads. So you might use threads to achieve the same effect: Put the parser in a thread. Activate it. When an action where you want the parser to halt is executed, as a part of the action, halt the parser thread. Then do the other stuff you want to do. When parsing needs to be resumed, reactivate the parser thread. Hans Aberg ___ Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison
Re: Forcing multiple parse stacks to 'reduce'
Hans, >>I have written a parser for C that processes >>a single statement or declaration at a time. >>So after each statement/declaration yyparse >>returns. > >Bison is clearly not built to handle such applications. The normal thing is >to handle the whole language in one go. (How do you handle environments, >like "{...}" statement-by-statement?) I don't see why not. Bison is built to handle grammars. My grammar does not support sequences of statements or declarations. The whole language, in my case, consists of one statement or one declaration (which includes the header of a function definition). Characters such as { } are not part of the grammar (they are handled by a higher level routine). My aim is to parse the visible source (i.e., preprocessing directives are ignored). Error recovery is better if statements/declarations are parsed individually. 'Errors' might be caused by conditional preprocessor directives that select how a statement, for instance, should be processed. Single statement/declaration at a time localises syntax errors. >>the accept state. But in some cases it does not. For instance, >>in >> >>typedef x y; >> >>by the time the ; is encountered there are three parse stacks. >>Depending on the token that follows the ; these three parse stacks >>may or may not be reduced to a single stack (which then reduces >>to a declaration). >> >>In the case: >> >>typedef x, y; >>typedef i, j; >> >>the second typedef token is shifted onto all three stacks and >>subsequent tokens are processed like a declaration (which they >>do form part of)! So I don't get a parse of a single declaration >>(in fact yyparse eventually reports an ambiguity). >> >>I know that for some grammars the optimizations performed by >>Bison result in ambiguities being reported (i.e., the BOGUS >>example in 5.7 of bison.info). >> >>In my case no ambiguity is being reported (I am in glr mode >>after all ;-). But the parse stacks are not being reduced as >>expected (all three are in the state I would expect them to be >>in when the second typedef token is encountered). >> >>Does anybody have any ideas on how I can force yyparse >>to consider reductions in the number of parse stacks and >>thence reductions to the accept state? > >In general, there is none, it seems. The idea of the GLR is that >reduce/reduce conflicts may split the deterministic parser, until the >ambiguity is resolved later in the parsing. The manual mentions the use of >%dprec to choose between different actions, but that seems to be it. (I am >not using %glr myself, so I am guessing here.) %dprec is designed to resolve ambiguities between multiple parse stacks. My multiple parse stacks are in agreement with each other. >You might try to rewrite your input grammar so that when the ";" of a >typedef is encountered, it reduces. My question is how do I do that? My grammar is copied straight from annex A of the C Standard (plus six or so %dprecs and two changes from right recursion to left recursion). I am guessing that some form of lazy evaluation is being used internally by Bison. After all, in most cases an end-of-file indicator is used to force reductions. >Alternatively, you might try an entirely new approach to your whole program >setup. You have not told us why you need this feature, See above. derek -- Derek M Jones tel: +44 (0) 1252 520 667 Knowledge Software Ltdmailto:[EMAIL PROTECTED] Applications Standards Conformance Testing http://www.knosof.co.uk ___ Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison
Identifying rule responsible for lookahead
Following is a sample grammar. There is one r/r conflict. % cat test.y %token YYID YYDOT %% identifier : hier_id ; hier_id : simple_id | hier_id opt_select YYDOT simple_id ; opt_select : | opt_select '[' expr ']' ; simple_id : YYID ; expr : hier_id | function_call ; function_call : expr YYDOT YYID ; % bison -r all -v test.y -g test.y: conflicts: 1 reduce/reduce >From test.output: state 10 3 hier_id: hier_id . opt_select YYDOT simple_id 4 opt_select: . [YYDOT, '['] 5 | . opt_select '[' expr ']' 7 expr: hier_id . [YYDOT, ']'] YYDOT reduce using rule 4 (opt_select) YYDOT [reduce using rule 7 (expr)] YYDOT is in the lookahead set of rule 7 (expr: hier_id) because of rule 9 (function_call: expr YYDOT YYID). If nonterminal expr is used is many rules, it gets difficult to find out which usage is causing problem. Basically I want to annotate the rules responsible for a lookahead token as follows. state 10 3 hier_id: hier_id . opt_select YYDOT simple_id 4 opt_select: . [YYDOT (3), '[' (5)] 5 | . opt_select '[' expr ']' 7 expr: hier_id . [YYDOT (9), ']' (5)] YYDOT reduce using rule 4 (opt_select) YYDOT [reduce using rule 7 (expr)] I was wondering if there is a way to get this information out of bison. Please help me. -Soumitra. __ Do you Yahoo!? Yahoo! Mail - Helps protect you from nasty viruses. http://promotions.yahoo.com/new_mail ___ Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison
Re: Identifying rule responsible for lookahead
Your grammar is ambigious. It can be seen if you make the following transformation, factoring out the null transition of opt_select, and you will see the shift/reduce conflict: %token YYID YYDOT %% identifier : hier_id; hier_id : simple_id | hier_id YYDOT simple_id | hier_id opt_select YYDOT simple_id ; opt_select : '[' expr ']' | opt_select '[' expr ']'; simple_id : YYID ; expr : hier_id | function_call; function_call : expr YYDOT YYID ; now you see the function_call is the same as the added hier_id YYDOT simple_id hope this helps Henrik On Monday 28 February 2005 21.46, Soumitra Kumar wrote: > %token YYID YYDOT > %% > identifier : hier_id > ; > hier_id : simple_id > | hier_id opt_select YYDOT simple_id > ; > opt_select : > | opt_select '[' expr ']' > ; > > simple_id : YYID ; > expr : hier_id > | function_call > ; > function_call : expr YYDOT YYID > ; ___ Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison
Re: Forcing multiple parse stacks to 'reduce'
Derek M Jones wrote: > >>I have written a parser for C that processes > >>a single statement or declaration at a time. > >>So after each statement/declaration yyparse > >>returns. > > > >Bison is clearly not built to handle such applications. The normal thing is > >to handle the whole language in one go. (How do you handle environments, > >like "{...}" statement-by-statement?) > > I don't see why not. Bison is built to handle grammars. Because bison expects the lexer to return a EOF (or perhaps more precisely, end-of-grammar) token to signal the end of input (of one "grammar", i.e. one statement in your case). So you'd have to supply this token, which is exactly the problem you're having ... Frank -- Frank Heckenbach, [EMAIL PROTECTED] http://fjf.gnu.de/ GnuPG and PGP keys: http://fjf.gnu.de/plan (7977168E) ___ Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison