On Jul 23, 2009, at 12:36 PM, tiefeng wu wrote:

2009/7/24 Philip Semanchuk <phi...@semanchuk.com>:

I know this will sound like a sarcastic comment, but it is sincere: my
suggestion is that if you want to parse C/C++ (or Python, or Perl, or
Fortran, etc.), use a real parser, not regexes, unless you're willing to sacrifice some accuracy. Sooner or later you'll come across some code that
your regexes won't handle, like this --

#ifdef FOO_BAR
#include <this.h>
/* #else */
#include <that.h>
#endif


Parsing code is difficult...
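
To make the example above concrete, here is a minimal Python sketch. It is not from the original thread: the patterns and the function names naive_directives and directives_ignoring_comments are hypothetical, chosen only for illustration. It shows one way a simple regex can mistake the commented-out #else for a real directive, and how stripping comments first avoids that particular case.

```python
import re

# The snippet from the message above.
SOURCE = """\
#ifdef FOO_BAR
#include <this.h>
/* #else */
#include <that.h>
#endif
"""

# Naive assumption: every '#word' in the text is a preprocessor directive.
NAIVE = re.compile(r"#\s*(\w+)")

# Slightly more careful: remove /* ... */ and // comments first, then
# accept a directive only at the start of a line.
COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)
DIRECTIVE = re.compile(r"^[ \t]*#[ \t]*(\w+)", re.MULTILINE)


def naive_directives(text):
    """Return directive names found by the naive pattern."""
    return NAIVE.findall(text)


def directives_ignoring_comments(text):
    """Return directive names after blanking out comments."""
    stripped = COMMENT.sub(" ", text)
    return DIRECTIVE.findall(stripped)


if __name__ == "__main__":
    print(naive_directives(SOURCE))
    # ['ifdef', 'include', 'else', 'include', 'endif']
    print(directives_ignoring_comments(SOURCE))
    # ['ifdef', 'include', 'include', 'endif']
```

Even the second version is only an approximation. For instance, it would treat a "/*" inside a string literal as the start of a comment, which is the kind of unusual case a real parser handles and a regex tends to miss.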

I understand your point; thanks for your suggestion, Philip. I've
already run into problems like the one in your example.
The reason I chose regexes is that I barely know anything about "real
parsers"; for me they are still a "dark area" :)
But I'll find something to learn.

Yes! Learning is always good. And as I said, if you don't mind missing some unusual cases, regexes are fine. I don't know how accurate you want your results to be.

As for real parsers, there are lots of them out there, although they may be overkill for what you want to do. Here's one written entirely in Python:
http://www.dabeaz.com/ply/

Whatever you choose, good luck with it.

Cheers
Philip

--
http://mail.python.org/mailman/listinfo/python-list
