Hello re gurus,

I wrote this pattern to extract the "name" and the "content" of a VHDL package. I know the file is valid VHDL, so there is really no need to validate anything after the 'end' token is found, but since that part works fine I don't want to touch it.

This is the pattern:

    pattern = re.compile(
        r'^\s*package\s+(?P<name>\w+)\s+is\s+(?P<content>.*?)\s+end(\s+package)?(\s+(?P=name))?\s*;',
        re.DOTALL | re.MULTILINE | re.IGNORECASE)

The problem is that

    package TEST is xyz end;

works, but

    package TEST123 is xyz end;

fails. \w is supposed to match [a-zA-Z0-9_], so I don't understand why digits and underscores make the pattern fail. (I have a slight suspicion that it may be a bug in re.)

I also tried this pattern, with the same results:

    pattern = re.compile(
        r'^\s*package\s+(?P<name>.+?)\s+is\s+(?P<content>.*?)\s+end(\s+package)?(\s+(?P=name))?\s*;',
        re.DOTALL | re.MULTILINE | re.IGNORECASE)

Something must be wrong with (?P<name>\w+) inside the main pattern.

Thanks in advance,

--
Daniel
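
P.S. In case it helps anyone try this on their own interpreter, here is a minimal sketch that wraps the first pattern in a small script. The helper name find_package and the SAMPLES list are only for illustration, and the line breaks and the third sample (with an underscore and a trailing "end package <name>") are assumptions about what the input might look like. It just prints what each sample returns, so results can be compared across Python versions.

```python
import re

# First pattern from the post, split over two raw-string literals only to
# keep the lines short; adjacent literals are concatenated unchanged.
PATTERN = re.compile(
    r'^\s*package\s+(?P<name>\w+)\s+is\s+(?P<content>.*?)'
    r'\s+end(\s+package)?(\s+(?P=name))?\s*;',
    re.DOTALL | re.MULTILINE | re.IGNORECASE)


def find_package(text):
    """Return (name, content) of the first package found, or None.

    Hypothetical helper, not part of the original code.
    """
    match = PATTERN.search(text)
    if match is None:
        return None
    return match.group('name'), match.group('content')


# Illustrative inputs; the exact whitespace is an assumption.
SAMPLES = [
    "package TEST is\n    xyz\nend;",
    "package TEST123 is\n    xyz\nend;",
    "package TEST_1 is\n    xyz\nend package TEST_1;",
]

if __name__ == "__main__":
    for sample in SAMPLES:
        # None here means the pattern did not match that sample.
        print(repr(sample), '->', find_package(sample))
```

If one of the samples prints None, posting that line together with the output of sys.version should make it easier to tell whether the pattern or the re module is at fault.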