Re: "nested" languages

2005-06-28 Thread Laurence Finston
On Tue, 28 Jun 2005, Neil Conway wrote:

> I'm trying to use Bison to construct a parser for a language A that
> needs to allow syntax from a second language B to be embedded within it

[...]

> Ideally I'd like to have A's parser
> "call into" B's parser, have it accept as much input as it can, and then
> return control to A's parser.

[...]

> Is this possible?
> One idea would be to use hand-written parsers for both
> languages, although it would be nice to stick with Bison if possible.

I think this would be possible using Bison.  "Accepting as much input as
it can" is tricky.  I believe it would be best for it to be a complete
statement (in the generic sense) of language B, i.e., the code passed to
your B parser will have to reduce to the start symbol of the B language.

I see no difficulty in principle in creating a second parser function
using Bison, say 'zzparse()', and calling it in one or more of the actions
in your A parser.

Laurence Finston


___
Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison


Re: "nested" languages

2005-06-28 Thread Hans Aberg

At 11:40 +1000 2005/06/28, Neil Conway wrote:
I'm trying to use Bison to construct a parser for a language A that 
needs to allow syntax from a second language B to be embedded within 
it (at well-defined positions within A). For example, suppose a 
production from the grammar of A is:


if_stmt: IF expr THEN if_body END IF ;

where "expr" is a production in *another* Bison grammar. Language B 
is defined by a fairly complex grammar that changes with some 
regularity (in this case, the embedded language B is SQL, defined by 
an ~8500 line bison grammar). B's grammar is maintained separately 
-- merging B's grammar into the grammar for A and maintaining two 
copies is a headache I'd like to avoid, if possible. Ideally I'd 
like to have A's parser "call into" B's parser, have it accept as 
much input as it can, and then return control to A's parser.


Is this possible? One idea would be to use hand-written parsers for 
both languages, although it would be nice to stick with Bison if 
possible.


Bison only one single .y file as input, and must be presented such a 
file. One way to do it, might to use a preprocessor (Perl, M4 
perhaps) so merge the files.


The problem might be, though, that the combined grammar isn't LALR(1) anymore.

(At present this is implemented by having the lexer look for a 
delimiter in the input that we know marks the end of the tokens of 
the embedded language. We then concatenate these tokens into a 
string, and eventually pass that string to the yyparse() of the 
embedded language's parser. So in the above example we would 
basically consume tokens until we see a "THEN", and assume that 
anything between the "IF" and the "THEN" is a SQL statement. This is 
ugly, to say the least.)


There are a few methods in use to combine grammars. One is to merge 
grammars together, as you suggested. Another is what you already is 
using. Sometimes, one is using a hybrid:


If one has a large number dynamic operator precedences, as in Prolog 
for example, then it is not really feasible to write that into a 
large Bison's grammar directly. So one would a small grammar, putting 
the operators on a stack, and then let a sort out the precedences in 
the actions. If the number of precedence is small, as Haskell, which 
only has ten levels, one could write it in directly in a static 
grammar, though. There, some do it, others use the first method.


In general, no method is better than the other; in the specific case, 
one chooses the one most convenient, simply.


You might also get more inputs on this question in the newsgroup 
comp.compilers.

--
  Hans Aberg


___
Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison