Flex/Bison: which version to use?
Hello,

From a vague memory of fellow developers' experiences, I have the idea that the version of Bison/Flex one uses is significant. But perhaps that isn't relevant anymore. For my project, written in C++, I simply downloaded the latest versions (at least at that time); I use Bison 2.0 and Flex 2.5.31.

Does it matter what version one uses? What is recommended? In what scenarios does the version matter (portability, programming language, functionality, etc.), if at all?

Another question: I noticed that I had "flex++" installed, and that "bison++" exists, after a quick googling. Are those "++" versions established software which people use? What are their advantages? When should I think about whether to use the "++" programs?

Thanks in advance,

		Frans

___
Help-bison@gnu.org
http://lists.gnu.org/mailman/listinfo/help-bison
Re: Flex/Bison: which version to use?
At 11:28 + 2005/06/13, Frans Englich wrote:

> For my project, written in C++, I simply downloaded the latest versions
> (at least at that time); I use Bison 2.0 and Flex 2.5.31.

Those are the latest official versions. For Bison, there is a test version 2.0a you may want to try out, to make sure the next release works for you:
  ftp://alpha.gnu.org/gnu/bison/bison-2.0a.tar.gz

> Does it matter what version one uses? What is recommended? In what
> scenarios does the version matter (portability, programming language,
> functionality, etc.), if at all?

Typically, only the latest official version is supported. If you have problems with an earlier version, the people helping will not remember those versions, and don't expect a bug fix; more likely, the problem has already been reported and fixed in a later version. If one needs bug fixes, sometimes one may need to pull down the latest alpha or the CVS sources. I did that with Flex on Mac OS 10.3.9.

> Another question: I noticed that I had "flex++" installed, and that
> "bison++" exists, after a quick googling. Are those "++" versions
> established software which people use? What are their advantages? When
> should I think about whether to use the "++" programs?

Those are independent programs, supposedly old. Bison and Flex do not have anything to do with them, as far as I know.

--
  Hans Aberg
Modularized parser files covering similar grammars
Hello,

I have a design dilemma that will become real some time in the future, and considering how large it is, I thought it could be a good idea to take a quick look forward.

I am building a Bison parser for a language, or to be precise, multiple languages which are all very similar. I have a "main" language, followed by four other languages which are all subsets of the main language. To be precise, I'm building a parser for the XPath language, and the different flavours I need to be able to distinguish are:

* XPath 2.0. This is as broad as it gets.
* XPath 1.0. A subset of XPath 2.0; XPath 2.0 is an extension of XPath 1.0.
* XSL-T 2.0 Patterns. A small subset of XPath 2.0.
* XSL-T 1.0 Patterns. A small subset of XPath 1.0.
* W3C XML Schema Selectors. An even smaller subset of XPath 1.0.

My wondering is how I practically should modularize the code in order to efficiently support these different languages.

First of all, my thought is that the scanner (Flex) is the same in every case (e.g., it supports all tokens in XPath 2.0), and that distinguishing the various "languages" is done on a higher level (the parser).

Distinguishing XPath 1.0/2.0 is, from what I can tell, the easiest. Since XPath 2.0 is an extension of 1.0, one can pass the parser an argument which signifies whether it's 1.0 that is being parsed, and in the actions for 2.0-only expressions error out if 1.0 is being parsed. In other words, conditional checks on a per-action basis.

This approach, however, easily becomes complex when taking the other grammars into account, because one needs to be "context" aware. For example, XSL-T Patterns is a subset, but the constructs that are disallowed are only disallowed in certain scenarios. Hence, if one continued with conditional tests ("What language am I parsing?") inside actions, it would require implementing "non-terminal awareness".

Another approach, which seems attractive to me if it's possible, is to modularize the grammar on the API/file level. For example, the tokens are declared in one file, the non-terminals are grouped in files, and a separate parser is constructed for each language. It would be preferred if it was also modularized on the object level, but I guess the disadvantage wouldn't be that big if it wasn't. In other words, if one could "select the start token depending on the language", it would solve my problems, it seems. I don't know how this "Bison modularization" would be done practically, though.

What are people's experiences with these kinds of problems? What are the approaches for solving them?

Cheers,

		Frans

PS. For those interested, here are the EBNF productions for what I'm talking about:

XPath 2.0 (1.0 is merely a subset):
http://www.w3.org/TR/xpath20/#nt-bnf
XSL-T Patterns:
http://www.w3.org/TR/xslt20/#pattern-syntax
W3C XML Schema Selectors:
http://www.w3.org/TR/xmlschema-1/#coss-identity-constraint

By the way, there's also an interesting document wrt. parser/scanner construction and XPath, "Building a Tokenizer for XPath or XQuery":
http://www.w3.org/TR/2005/WD-xquery-xpath-parsing-20050404/
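The "pass the parser a version argument" approach can be sketched directly in Bison. The fragment below is only an illustration: the rule, token, and flag names are invented, and the real XPath grammar is of course far larger. It uses %parse-param (available since Bison 1.875) to hand yyparse() a version flag that the actions of 2.0-only constructs inspect. (Note that %parse-param also changes the expected signature of yyerror.)

```yacc
/* Hypothetical sketch: a version flag passed into yyparse(),
   checked in the actions of XPath 2.0-only constructs.  */
%parse-param { int xpath_version }   /* e.g. 10 for XPath 1.0, 20 for 2.0 */

%token IF THEN ELSE LPAREN RPAREN NUMBER

%%

Expr:
  OrExpr
| IfExpr
;

/* Stand-in for the rest of the grammar.  */
OrExpr:
  NUMBER
;

IfExpr:
  IF LPAREN Expr RPAREN THEN Expr ELSE Expr
    {
      /* 'if' expressions exist only in XPath 2.0.  */
      if (xpath_version < 20)
        {
          yyerror ("'if' expressions require XPath 2.0");
          YYERROR;
        }
    }
;
```

As the message above notes, this scales poorly once the checks become context-dependent (the XSL-T Pattern case), since each action only sees its own rule.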
Re: Modularized parser files covering similar grammars
>>> "Frans" == Frans Englich <[EMAIL PROTECTED]> writes:

> My wondering is how I practically should modularize the code in order to
> efficiently support these different languages.

In the future, I would like to have something like %import in Bison, but currently you'll have to put everything into a single file (or run your own preprocess beforehand).

> What are people's experiences with these kinds of problems? What are the
> approaches for solving them?

I don't know how similar/different your languages are, but if they share some large parts, say there are common sublanguages covered by equal nonterminals, then the following technique might be useful.

I have two similar languages (in fact it's almost a single grammar with two entry points). First, in the parser I have:

   %start program
   %%
   program:
     /* Parsing a source program.  */
     "seed-source" exp                { tp.exp_ = $2; }
     /* Parsing an imported file.  */
   | "seed-import" "let" decs "end"   { tp.decs_ = $3; }
   ;

In fact I'm looking either for `exp' or `"let" decs "end"', but I have fake tokens seed-*. Then in the (Flex) scanner, I have:

   ...
   %%
   %{
     /* Be ready to insert the seed.  */
     if (seed)
       {
         int res = seed;
         seed = 0;
         return res;
       }
   %}

where seed is initialized in some way to the first token you want to send (that depends on whether your parser is pure or not, etc.). There is no limitation on the number of initial tokens, i.e., on the actual number of start symbols.
"Eating" comments: not with Flex but with Bison
Hello,

In some languages there are constructs which are insignificant to the parse tree, in the same way as white space (sometimes) is. Comments are one such example. The Flex manual has an example of how to handle them at the scanner level: patterns which match the comments but don't return tokens:
http://www.gnu.org/software/flex/manual/html_chapter/flex_11.html#SEC11

I think I have a special scenario wrt. comment handling: in one version of my language comments are allowed, while in another they are not. Hence, depending on the version, I want to flag the existence of comments as syntax errors, regardless of whether they are well-formed. I would prefer to do this at the Bison/parser level because it is convenient: I have access to various information passed to the parse function, the YYERROR macro, and the error function.

The problem I see, if I let Flex return a COMMENT token and add a non-terminal in the Bison grammar to implement the checking, is how to make it play well with the other rules: the token gets in the way. What would solve my problem (AFAICT) is if I could write a non-terminal ("Comment") that matched the COMMENT token and then simply ate it, such that the parser could continue to deduce the "real" tokens as if the COMMENT had never existed (while the action code nevertheless checked whether the comment was allowed). AFAICT, something like that must be done, since I can't add the COMMENT token everywhere (it can appear between any two tokens).

Any ideas how to do that? (Something with yyclearin?) Or am I perhaps trying to solve the problem in the wrong way? (Perhaps I should put the handling in the scanner, for example.)

Also, I've asked a lot of questions; tell me if I'm asking too much, or point me to the docs if I haven't RTFM.

Cheers,

		Frans
Re: "Eating" comments: not with Flex but with Bison
At 20:46 + 2005/06/13, Frans Englich wrote:

> I think I have a special scenario wrt. comment handling: in one version of
> my language comments are allowed, while in another they are not. Hence,
> depending on the version, I want to flag the existence of comments as
> syntax errors, regardless of whether they are well-formed.

You could try to set a context switch in the lexer (see the Flex manual, "start conditions") on the comment lexing rule(s). Then this context switch can be turned on/off in various ways. For example, the lexer could be initiated (right after %% in the .l file) with a check of a global variable that does this. This variable can be turned on/off from the parser.

--
  Hans Aberg
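A minimal sketch of this start-condition idea in the .l file follows. Everything here is invented for illustration: the flag, the token name, and the comment pattern (the "(: ... :)" regex below is a simplification of XPath 2.0 comments and does not handle nesting):

```lex
/* Hypothetical sketch of the start-condition approach.  */
%s FORBID_COMMENTS

%{
/* Set from the parser/driver before parsing begins.  */
extern int comments_forbidden;
%}

%%

%{
  /* Runs at the start of each yylex() call: pick the start
     condition from the global flag.  */
  if (comments_forbidden)
    BEGIN (FORBID_COMMENTS);
  else
    BEGIN (INITIAL);
%}

<FORBID_COMMENTS>"(:"[^:]*":)"  { return TOK_COMMENT;  /* parser reports it */ }
"(:"[^:]*":)"                   { /* comment allowed: eat it, return nothing */ }
```

Since FORBID_COMMENTS is declared inclusive (%s), both comment rules are active in that condition; they match the same length, so Flex picks the earlier one and the comment reaches the parser as a TOK_COMMENT it can turn into a syntax error. In INITIAL, only the unmarked rule is active and the comment is silently consumed, so the grammar never needs to mention comments at all.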