On 05/02/2017 01:08 PM, David Malcolm wrote:
Currently the C/C++ frontends discard comments when parsing.
It's possible to set up libcpp to capture comments as tokens,
by setting CPP_OPTION (pfile, discard_comments) to false,
and this can be enabled using the -C command line option (see
also -CC), but c-family/c-lex.c then discards any CPP_COMMENT
tokens it sees, so they're not seen by the frontend parser.
The following patch adds an (optional) callback to libcpp
for handling comments, giving the comment content, and the
location it was seen at. This approach allows arbitrary
logic to be wired up to comments, and avoids having to
copy the comment content to a new buffer (which the CPP_COMMENT
approach does).
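
A minimal sketch of the callback idea described above, using a toy
scanner rather than libcpp itself. Every name here (toy_lexer,
comment_cb, toy_lex, count_bytes) is hypothetical and is not the
interface the patch adds to cpplib.h. The callback receives a pointer
into the original buffer plus a length, so nothing is copied; the
"location" is simply a byte offset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a source location: a byte offset.  */
typedef unsigned int loc_t;

/* Hypothetical callback type: TEXT points into the lexer's own buffer
   and is LEN bytes long (not NUL-terminated at LEN).  */
typedef void (*comment_cb) (const char *text, size_t len, loc_t loc,
                            void *user_data);

struct toy_lexer
{
  comment_cb comment;   /* Optional; may be NULL.  */
  void *user_data;
};

/* Scan BUF for block comments and report each one through the
   callback, if any.  Returns the number of comments found.
   Assumption: this toy ignores string literals and line comments.  */
static size_t
toy_lex (const struct toy_lexer *lexer, const char *buf)
{
  size_t count = 0;
  const char *p = buf;

  assert (lexer != NULL && buf != NULL);
  while ((p = strstr (p, "/*")) != NULL)
    {
      const char *end = strstr (p + 2, "*/");
      if (end == NULL)
        break;                  /* Unterminated comment: stop.  */
      end += 2;
      if (lexer->comment)
        lexer->comment (p, (size_t) (end - p), (loc_t) (p - buf),
                        lexer->user_data);
      count++;
      p = end;
    }
  return count;
}

/* Example callback: add up the bytes of comment text seen.  */
static void
count_bytes (const char *text, size_t len, loc_t loc, void *user_data)
{
  (void) text;
  (void) loc;
  *(size_t *) user_data += len;
}
```

Leaving the callback NULL costs only a pointer test per comment, which
is the point of making it optional.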
Plugins could chain up on the callback, e.g. to parse
specially-formatted comments for documentation generation,
or for GObject introspection annotations [1].
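
To illustrate "chaining up", here is a hedged sketch of the usual
pattern: save whatever callback is already installed, install your own,
and forward every comment to the saved one. The single global slot
comment_hook and the functions below are hypothetical; they stand in
for the callback field the patch adds and are not real GCC or libcpp
symbols. The doc-comment test (a comment opening with exactly two
stars) is just one example of a "specially-formatted" comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*comment_cb) (const char *text, size_t len);

/* Hypothetical single callback slot owned by the lexer.  */
static comment_cb comment_hook;

/* State private to the "plugin".  */
static comment_cb prev_hook;
static unsigned doc_comments_seen;

/* Count comments that open with exactly two stars, then chain up to
   whatever hook was installed before this one.  */
static void
doc_comment_hook (const char *text, size_t len)
{
  if (len >= 5 && strncmp (text, "/**", 3) == 0 && text[3] != '*')
    doc_comments_seen++;
  if (prev_hook)
    prev_hook (text, len);
}

/* Install doc_comment_hook in front of the existing hook.  */
static void
install_doc_comment_hook (void)
{
  /* Installing twice would make the hook call itself forever.  */
  assert (comment_hook != doc_comment_hook);
  prev_hook = comment_hook;
  comment_hook = doc_comment_hook;
}

/* An earlier consumer, standing in for e.g. a built-in handler.  */
static unsigned all_comments_seen;

static void
count_all_hook (const char *text, size_t len)
{
  (void) text;
  (void) len;
  all_comments_seen++;
}
```

With this pattern several independent consumers can share one callback
slot, provided each forwards to its predecessor.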
As a proof of concept, the patch uses this to add a spellchecker
for comments. It uses the Enchant meta-library:
https://abiword.github.io/enchant/
(essentially a wrapper around 8 different spellchecking libraries).
I didn't bother with the autotool detection for enchant, and
just hacked it in for now.
Example output:
test.c:3:46: warning: spellcheck_word: "evaulate"
 When NONCONST_PRED is false the code will evaulate to constant and
                                           ^~~~~~~~
test.c:3:46: note: suggestion: "evaluate"
 When NONCONST_PRED is false the code will evaulate to constant and
                                           ^~~~~~~~
                                           evaluate
test.c:3:46: note: suggestion: "ululate"
 When NONCONST_PRED is false the code will evaulate to constant and
                                           ^~~~~~~~
                                           ululate
test.c:3:46: note: suggestion: "elevate"
 When NONCONST_PRED is false the code will evaulate to constant and
                                           ^~~~~~~~
                                           elevate
License-wise, Enchant is LGPL 2.1 "or (at your option) any
later version." with a special exception to allow non-LGPL
spellchecking providers (e.g. to allow linking against an
OS-provided spellchecker).
Various FIXMEs are present (e.g. hardcoded "en_US" for the
language to spellcheck against).
Also, the spellchecker has a lot of false positives, e.g.
it doesn't grok URLs (and thus complains when it sees them);
similarly for DejaGnu directives etc.
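
One possible way to cut down the URL false positives, sketched under
assumptions: split the comment on whitespace and skip any token that
contains "://" before handing words to the spellchecker. This is not
what the patch does; the helper names are hypothetical, and the
heuristic would not help with DejaGnu directives.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Heuristic (an assumption, not from the patch): a token is treated
   as URL-like if it contains "://".  */
static int
token_is_url_like (const char *tok, size_t len)
{
  for (size_t i = 0; i + 3 <= len; i++)
    if (memcmp (tok + i, "://", 3) == 0)
      return 1;
  return 0;
}

typedef void (*word_cb) (const char *word, size_t len, void *user_data);

/* Split NUL-terminated TEXT on whitespace and call CB for every token
   that is not URL-like.  Returns the number of tokens skipped.  */
static size_t
for_each_checkable_token (const char *text, word_cb cb, void *user_data)
{
  size_t skipped = 0;
  const char *p = text;

  assert (text != NULL);
  while (*p)
    {
      while (*p && isspace ((unsigned char) *p))
        p++;
      const char *start = p;
      while (*p && !isspace ((unsigned char) *p))
        p++;
      size_t len = (size_t) (p - start);
      if (len == 0)
        break;                  /* Only trailing whitespace was left.  */
      if (token_is_url_like (start, len))
        skipped++;
      else if (cb)
        cb (start, len, user_data);
    }
  return skipped;
}

/* Example consumer: count the tokens that would be spellchecked.  */
static void
count_token (const char *word, size_t len, void *user_data)
{
  (void) word;
  (void) len;
  ++*(size_t *) user_data;
}
```

A real checker would also need to strip punctuation and comment
delimiters from each token before looking it up.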
Does enchant seem like a reasonable dependency for the compiler?
(it pulls in libpthread.so.0, libglib-2.0.so.0, libgmodule-2.0.so.0).
Or would this be better pursued as a plugin? (if so, I'd
prefer the plugin to live in the source tree as an example,
rather than out-of-tree).
Unrelated to spellchecking, I also added two new options:
-Wfixme and -Wtodo, for warning when comments containing
"FIXME" or "TODO" are encountered.
I use such comments a lot during development. I thought some
people might want a warning about them (I tend to just use grep
though). [TODO: document these in invoke.texi, add test cases]
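
A rough sketch of the kind of check that -Wfixme and -Wtodo imply:
search the comment text for a marker string and report where it
starts, so that a warning could point at it. The function name is
hypothetical, and plain substring matching is an assumption; the
message above does not say whether the patch matches whole words
only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the offset of the first occurrence of MARKER within the LEN
   bytes at TEXT, or -1 if it does not occur.  TEXT need not be
   NUL-terminated at LEN, matching a callback that passes a pointer
   into the lexer's buffer.  */
static long
find_marker (const char *text, size_t len, const char *marker)
{
  size_t mlen = strlen (marker);

  assert (text != NULL && mlen > 0);
  for (size_t i = 0; i + mlen <= len; i++)
    if (memcmp (text + i, marker, mlen) == 0)
      return (long) i;
  return -1;
}
```

A comment callback could call find_marker once with "FIXME" and once
with "TODO", and emit the matching warning when the corresponding
option is enabled and the result is not -1.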
Thoughts? Does any of this sound useful?
[not yet bootstrapped; as noted above, I haven't yet done
the autoconf stuff for handling Enchant]
[1] https://wiki.gnome.org/Projects/GObjectIntrospection/Annotations
gcc/ChangeLog:
* Makefile.in (LIBS): Hack in -lenchant for now.
(OBJS): Add spellcheck-enchant.o.
* common.opt (Wfixme): New option.
(Wtodo): New option.
* spellcheck-enchant.c: New file.
* spellcheck-enchant.h: New file.
gcc/c-family/ChangeLog:
* c-lex.c: Include spellcheck-enchant.h.
(init_c_lex): Wire up spellcheck_enchant_check_comment to the
comment callback.
* c-opts.c: Include spellcheck-enchant.h.
(c_common_post_options): Call spellcheck_enchant_init.
(c_common_finish): Call spellcheck_enchant_finish.
libcpp/ChangeLog:
* include/cpplib.h (struct cpp_callbacks): Add "comment"
callback.
* lex.c (_cpp_lex_direct): Call the comment callback if non-NULL.
enchant seems a bit out of the sweet spot, particularly just to catch
mis-spellings in comments. But it might make an interesting plugin.
IIRC from our meeting earlier this week, you had another use case that
might have been more compelling, but I can't remember what it was.
I do like the ability to at least capture the comments better, and while
we don't have a strong need for that capability now, we might in the future.
Jeff