Tom Deco wrote:
> Hi,
>
> I'm trying to use a regular expression to match a string containing a #
> (basically I'm looking for #include ...)
>
> I don't seem to manage to write a regular expression that matches this.
>
> My (probably too naive) approach is: p = re.compile(r'\b#include\b')
> I also tried p = re.compile(r'\b\#include\b') in a futile attempt to use
> a backslash as escape character before the #
> None of the above return a match for a string like "#include <stdio>".
>
> I know a # is used for comments, hence my attempt to escape it...
>
> Any suggestion on how to get a regular expression to find a #?
>
> Thanks

You definitely shouldn't have the first \b -- match() works only at the
beginning of the target string, and "#" is not a word character, so it is
impossible for there to be a word boundary just before the "#". You
probably shouldn't have the second \b either. You probably should read
section A12 of K&R2.

You probably should be using a parser, but if you persist in using
regular expressions:
(a) read the manual.
(b) try something like this:

    >>> import re
    >>> pat1 = re.compile(r'\s*#\s*include\s*<\s*([^>\s]+)\s*>\s*$')
    >>> pat1.match(" # include < fubar.h > ").group(1)
    'fubar.h'

N.B. this is based on the assumption that sane programmers don't have
whitespace embedded in the names of source files ;-)

HTH,
John
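
For anyone who wants to poke at this, here is a minimal sketch of the points above, assuming Python 3's standard re module. The names INCLUDE_RE, VERBOSE_RE and include_target are hypothetical, made up for illustration and not part of the thread. It also shows that "#" needs no escaping by default: as far as the re module is concerned, "#" only starts a comment when a pattern is compiled with re.VERBOSE.

```python
import re

line = "#include <stdio>"

# \b needs a word character on exactly one side of the position.
# Before the leading "#" there is nothing on the left and a non-word
# character on the right, so \b cannot match there.
no_match = re.match(r'\b#include\b', line)

# Without the leading \b the plain "#" matches; no backslash is needed.
plain = re.match(r'#include\b', line)

# In re.VERBOSE mode an unescaped "#" starts a comment, so escape it there.
VERBOSE_RE = re.compile(r'''
    \s* \# \s* include \b   # the directive, with optional whitespace
    ''', re.VERBOSE)

# The pattern from the reply, wrapped in a small helper (hypothetical name).
INCLUDE_RE = re.compile(r'\s*#\s*include\s*<\s*([^>\s]+)\s*>\s*$')

def include_target(text):
    """Return the name inside <...> of an #include line, or None."""
    m = INCLUDE_RE.match(text)
    return m.group(1) if m else None

if __name__ == "__main__":
    print(no_match)
    print(plain is not None)
    print(include_target(line))
```

Note that include_target only handles the angle-bracket form; the quoted form (#include "foo.h") would need a second alternative in the pattern.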