The regexp '<widget class=".*" id=".*">' works well with 'grep' for matching lines such as <widget class="GtkWindow" id="window1"> in an XML .glade file.

However, that is not true for the re module in Python, which treats the regexp as if it had been specified this way:

'^<widget class=".*" id=".*">'

For some reason, Python's regexps seem to match only from the start of the line, whether or not the caret symbol '^' is used. I had a hard time figuring out why this regexp wasn't working:

regexp = re.compile(r'<widget class=".*" id="(.*)">')

The solution was to account for the surrounding spaces:

regexp = re.compile(r'\s*<widget class=".*" id="(.*)">\s*')

To reproduce the behaviour, just take a .glade file and this Python script:

<code>
import re

glade_file_name = 'some.glade'

# Same pattern, without and with allowance for surrounding whitespace.
bad_regexp = re.compile(r'<widget class=".*" id="(.*)">')
good_regexp = re.compile(r'\s*<widget class=".*" id="(.*)">\s*')

with open(glade_file_name) as glade_file:
    for line in glade_file:
        # Report which of the two patterns accepts each line.
        if bad_regexp.match(line):
            print('bad:', line.strip())
        if good_regexp.match(line):
            print('good:', line.strip())
</code>

The thing is, I would have expected to have to put the caret in explicitly to tell the regexp to match at the start of the line, something like:

r'^<widget class=".*" id="(.*)">'

However, Python's regexp is taking care of that for me. This is not the behaviour I would expect from what I know about regexps, but maybe I'm missing something.

--
http://mail.python.org/mailman/listinfo/python-list
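
A note on the behaviour described above: in the re module, the match() method only tries the pattern at the very beginning of the string, whereas search() scans the string for the first position where the pattern fits, which is closer to what grep does for each line. The sketch below compares the two calls using the unanchored pattern from the post. The sample line is made up for illustration (it is not taken from a real .glade file) and assumes the widget tag is indented.

```python
import re

# Hypothetical sample line, indented the way nested XML elements usually are.
line = '  <widget class="GtkWindow" id="window1">\n'

# The unanchored pattern from the post, with no allowance for leading spaces.
pattern = re.compile(r'<widget class=".*" id="(.*)">')

# match() tries the pattern only at position 0, where this line has spaces.
match_result = pattern.match(line)

# search() scans forward until the pattern fits somewhere in the line.
search_result = pattern.search(line)

print('match: ', match_result)
print('search:', search_result.group(1) if search_result else None)
```

If that reading is right, replacing .match(line) with .search(line) in the script should let the pattern without '\s*' report the same lines as grep.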