Licheng Fang wrote: > Basically, the problem is this: > > >>> p = re.compile("do|dolittle") > >>> p.match("dolittle").group() > 'do'
>From what I understand, this isn't python specific, it is the expected behavior of that pattern in any implementation. You are using alternation, which means "either, or", and you have the shorter subexpression first, so the condition is satisfied by just 'do' and the matching terminates. > There's another example: > > >>> p = re.compile("one(self)?(selfsufficient)?") > >>> p.match("oneselfsufficient").group() > 'oneself' Again, I don't think this has anything to do with python. You pattern basically means "match 'one' whether it is followed by 'self' or not, and whether it is followed by 'selfsufficient' or not". For this particular example, you'd want something like "one(self)?(sufficient)?". I think you could construct a pattern that would do what you want in python without any problem. If you post a (short) example of your data, I'm sure someone could help you with it. Regards, Jordan -- http://mail.python.org/mailman/listinfo/python-list