Re: [Haskell-cafe] More Language.C work for Google's Summer of Code

Aaron Tomb Tue, 30 Mar 2010 13:58:01 -0700

Yes, that would definitely be one productive way forward. One concern is that Language.C is BSD-licensed (and it would be nice to keep it that way), and cpphs is LGPL. However, if cpphs remained a separate program, producing C + extra stuff as output, and the Language.C parser understood the extra stuff, this could accomplish what I'm interested in. It would be interesting, even, to just extend the Language.C parser to support comments, and to tell cpphs to leave them in.

There's also another pre-processor, mcpp [1], that is quite featureful and robust, and which supports an output mode with special syntax describing the origin of the code resulting from macro expansion.


Aaron

[1] http://mcpp.sourceforge.net/

On Mar 30, 2010, at 12:14 PM, austin seipp wrote:

(sorry for the dupe aaron! forgot to add haskell-cafe to senders list!)
Perhaps the best course of action would be to try and extend cpphs to
do things like this? From the looks of the interface, it can already
do some of these things e.g. do not strip comments from a file:

http://hackage.haskell.org/packages/archive/cpphs/1.11/doc/html/Language-Preprocessor-Cpphs.html#t%3ABoolOptions

Malcolm would have to attest to how complete it is w.r.t. say, gcc's
preprocessor, but if this were to be a SOC project, extending cpphs to
include needed functionality would probably be much more realistic
than writing a new one.

On Tue, Mar 30, 2010 at 12:30 PM, Aaron Tomb <at...@galois.com> wrote:
Hello,
I'm wondering whether there's anyone on the list with an interest in doing
additional work on the Language.C library for the Summer of Code. There are
a few enhancements that I'd be very interested in seeing, and I'd love to be
a mentor for such a project if there's a student interested in working on
them.
The first is to integrate preprocessing into the library. Currently, the
library calls out to GCC to preprocess source files before parsing them.
This has some unfortunate consequences, however, because comments and macro
information are lost. A number of program analyses could benefit from
metadata encoded in comments, because C doesn't have any sort of formal
annotation mechanism, but in the current state we have to resort to ugly
hacks (at best) to get at the contents of comments. Also, effective
diagnostic messages need to be closely tied to original source code. In the
presence of pre-processed macros, column number information is unreliable,
so it can be difficult to describe to a user exactly what portion of a
program a particular analysis refers to. An integrated preprocessor could
retain comments and remember information about macros, eliminating both of
these problems.
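
To make the point about retained comments concrete, here is a toy sketch of a scanner that keeps block comments, with their source positions, instead of discarding them. It is not Language.C's or cpphs's API; every name in it (Pos, Chunk, scan) is hypothetical, and it deliberately ignores string literals, // comments and macros.

```haskell
-- Toy sketch only: all names are hypothetical, not part of Language.C.
data Pos = Pos Int Int          -- line, column (both 1-based)
  deriving (Eq, Show)

data Chunk
  = Code Pos String             -- ordinary source text
  | Comment Pos String          -- body of a /* ... */ comment, delimiters removed
  deriving (Eq, Show)

-- Advance a position over one character.
advance :: Pos -> Char -> Pos
advance (Pos l _) '\n' = Pos (l + 1) 1
advance (Pos l c) _    = Pos l (c + 1)

-- Split source text into code and comment chunks, each tagged with the
-- position where it starts.
scan :: String -> [Chunk]
scan = go (Pos 1 1)
  where
    go _ [] = []
    go p ('/':'*':rest) =
      let (body, after, p') = comment (advance (advance p '/') '*') rest
      in Comment p body : go p' after
    go p s =
      let (txt, after, p') = code p s
      in Code p txt : go p' after

    -- Consume up to the closing "*/" (or end of input if unterminated).
    comment p ('*':'/':rest) = ([], rest, advance (advance p '*') '/')
    comment p (c:rest) =
      let (b, a, p') = comment (advance p c) rest in (c : b, a, p')
    comment p [] = ([], [], p)

    -- Consume up to the next "/*" (or end of input).
    code p s@('/':'*':_) = ([], s, p)
    code p (c:rest) =
      let (t, a, p') = code (advance p c) rest in (c : t, a, p')
    code p [] = ([], [], p)
```

An analysis could then look for annotations in the Comment chunks and report them against the recorded positions, which is the kind of information that is lost when GCC preprocesses the file first.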
The second possible project is to create a nicer interface for traversals
over Language.C ASTs. Currently, the symbol table is built to include only
information about global declarations and those other declarations currently
in scope. Therefore, when performing multiple traversals over an AST, each
traversal must re-analyze all global declarations and the entire AST of the
function of interest. A better solution might be to build a traversal that
creates a single symbol table describing all declarations in a translation
unit (including function- and block-scoped variables), for easy reference
during further traversals. It may also be valuable to have this traversal
produce a slightly-simplified AST in the process. I'm not thinking of
anything as radical as the simplifications performed by something like CIL,
however. It might simply be enough to transform variable references into a
form suitable for easy lookup in a complete symbol table like I've just
described. Other simple transformations such as making all implicit casts
explicit, or normalizing compound initializers, could also be good.
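
A minimal sketch of the single-symbol-table idea, assuming a toy AST that is far simpler than Language.C's. All types and functions here (Stmt, FunDef, buildSymTab, lookupVar) are hypothetical. One pass records every declaration in the unit under a scope path, and later traversals resolve a name by searching from the innermost scope outward. The sketch ignores declaration order within a block.

```haskell
import Data.List (inits)

-- Toy AST; hypothetical, not Language.C's representation.
data Stmt
  = Decl String String      -- type name, variable name
  | Block [Stmt]            -- nested block scope
  | Use String              -- a reference to a variable
  deriving (Eq, Show)

data FunDef = FunDef String [Stmt]

-- A scope path: function name followed by labels of nested blocks.
-- The empty path is the global scope.
type Scope = [String]

-- One entry per declaration in the whole unit: (scope, variable, type).
type SymTab = [(Scope, String, String)]

-- Build the complete table in one traversal. Globals are given as
-- (type, name) pairs.
buildSymTab :: [(String, String)] -> [FunDef] -> SymTab
buildSymTab globals funs =
  [ ([], v, t) | (t, v) <- globals ] ++ concatMap fun funs
  where
    fun (FunDef name body) = stmts [name] body
    stmts scope ss = concat (zipWith (stmt scope) [0 :: Int ..] ss)
    stmt scope _ (Decl t v) = [(scope, v, t)]
    stmt scope i (Block ss) = stmts (scope ++ ["block" ++ show i]) ss
    stmt _     _ (Use _)    = []

-- Resolve a variable seen in the given scope: innermost scope first,
-- then each enclosing scope, ending with the global scope.
lookupVar :: SymTab -> Scope -> String -> Maybe String
lookupVar tab scope v =
  case [ t | s <- reverse (inits scope), (s', v', t) <- tab, s' == s, v' == v ] of
    (t : _) -> Just t
    []      -> Nothing
```

Rewriting each variable reference into a (scope, name) pair during the same pass would give the "form suitable for easy lookup" mentioned above, since later traversals would no longer need to rebuild any scoping information.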

A third possibility, which would probably depend on the integrated
preprocessor, would be to create an exact pretty-printer. That is, a
pretty-printing function such that pretty . parse is the identity.
Currently, parse . pretty should be the identity, but it's not true the
other way around. An exact pretty-printer would be very useful in creating
rich presentations of C source code --- think LXR on steroids.
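
The difference between the two round-trip properties can be shown on a toy whitespace-separated token language. This is only an illustration under that assumption; none of these functions exist in Language.C. The lossy pair satisfies parse . pretty = id on token lists, but pretty . parse changes the layout. The exact pair keeps the whitespace in the parse result, so pretty . parse reproduces the input.

```haskell
import Data.Char (isSpace)

-- Lossy pair: parsing throws whitespace away.
parseLossy :: String -> [String]
parseLossy = words

prettyLossy :: [String] -> String
prettyLossy = unwords

-- Exact pair: each token carries the whitespace that preceded it, and the
-- result also records any trailing whitespace.
data Exact = Exact [(String, String)] String   -- [(leading space, token)], trailing space
  deriving (Eq, Show)

parseExact :: String -> Exact
parseExact s =
  case span isSpace s of
    (ws, [])   -> Exact [] ws
    (ws, rest) ->
      let (tok, rest')      = break isSpace rest
          Exact ts trailing = parseExact rest'
      in Exact ((ws, tok) : ts) trailing

prettyExact :: Exact -> String
prettyExact (Exact ts trailing) =
  concat [ ws ++ tok | (ws, tok) <- ts ] ++ trailing
```

For the input "int  x ;\n", prettyLossy (parseLossy src) collapses the spacing and drops the newline, while prettyExact (parseExact src) returns src unchanged. An exact pretty-printer for C would need the same kind of extra information (comments, layout, macro origins) attached to the AST.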
If you're interested in any combination of these, or anything similar, let
me know. The deadline is approaching quickly, but I'd be happy to work
together with a student to flesh any of these out into a full proposal.
Thanks,
Aaron

--
Aaron Tomb
Galois, Inc. (http://www.galois.com)
at...@galois.com
Phone: (503) 808-7206
Fax: (503) 350-0833

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
--
- Austin