Hi,

I wanted to obtain gcc's parse tree and found the -fdump-tree option. Specifically, the -fdump-tree-original-raw output contains all of the information I need. When I tried to write a parser for the tree dump, however, I noticed that the output can contain ambiguities that cannot be resolved, which makes it impossible to parse the output correctly. For example, if your code contains a string constant with a colon, you might get into trouble:

    warning("wrong type: foobar");

This code will make -fdump-tree-original-raw produce something like this:

    @54 string_cst type: @61 strg: wrong type: foobar lngt: 19

Now try to make your parser extract the correct type: the "type:" inside the string constant looks exactly like a second type field.

Similar problems occur when a string constant contains "\n" or even "\0". The former produces a line break in the output, making it hard, if not impossible, to parse. The latter produces output in which the string constant is truncated after the \0 character, even though the "lngt" property specifies the length of the whole, non-truncated string.

To resolve these problems, I took the gcc 4.0.1 source code and patched tree.h and tree-dump.c. The patched version introduces two new options for -fdump-tree: the "parseable" option, which produces output that is similar to "raw" but unambiguous and easier to parse, and the "maskstringcst" option, which masks the string constants in the output. Masking makes parsing even easier, and I'm not interested in the string constants anyway.

My question is: does anybody think these new features would be useful to anyone besides me, and does anybody want to have a look at my patches and maybe merge them into the official gcc sources?

Hans-Christian
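
P.S. To make the ambiguity concrete, here is a minimal sketch in Python of the kind of naive parser I mean. It assumes each field looks like "key: value" with whitespace between fields; parse_node and FIELD are names made up for this illustration and are not part of gcc or of my patch.

```python
import re

# Hypothetical naive field pattern: a key, a colon, then one whitespace-free token.
FIELD = re.compile(r"(\w+)\s*:\s+(\S+)")

def parse_node(line):
    """Naively split one raw dump line into (node id, node kind, fields)."""
    node_id, kind, rest = line.split(None, 2)
    fields = {}
    for key, value in FIELD.findall(rest):
        # A later duplicate key silently overwrites an earlier one.
        fields[key] = value
    return node_id, kind, fields

# The sample dump line from the example above.
line = "@54 string_cst type: @61 strg: wrong type: foobar lngt: 19"
node_id, kind, fields = parse_node(line)
print(fields)
```

With the sample line, the "type:" inside the string constant overwrites the real type field, so fields["type"] ends up as "foobar" instead of "@61", and strg is cut off at the first space. A smarter parser could use lngt to recover the string, but that stops working as soon as the string contains "\0".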