In article <mailman.10370.1401191774.18130.python-l...@python.org>, Wolfgang Maier <wolfgang.ma...@biologie.uni-freiburg.de> wrote:

> On 27.05.2014 13:39, Aman Kashyap wrote:
> >> On 27.05.2014 14:09, Vlastimil Brom wrote:
> >>
> >>> you can just escape the pipe with a backslash like any other
> >>> metacharacter:
> >>>
> >>> r"start=\|ID=ter54rt543d"
> >>>
> >>> be sure to use the raw string notation r"...", or you can double all
> >>> backslashes in the string.
> >>
> > Thanks for the response.
> >
> > I got the answer finally.
> >
> > This is the regular expression to be used:
> > \\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\\|
> >
>
> or, and more readable:
>
> r'\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\|'
>
> This is what Vlastimil was talking about. It saves you from having to
> escape the backslashes.

Sometimes, instead of using backslashes, I put the problem character
into a character class by itself.  It's a matter of personal opinion
which way is easier to read, but it certainly eliminates all the
questions about "how many backslashes do I need?"

    r'[|]ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*[|]'

Another thing that can help make regexes easier to read is the VERBOSE
flag.  Basically, it ignores whitespace inside the regex (except inside
a character class) and lets you add # comments; see
https://docs.python.org/2/library/re.html#module-contents for details.
So, you can write something like:

    pattern = re.compile(r'''[|]
                             ID=
                             [a-z]*
                             [0-9]*
                             [a-z]*
                             [0-9]*
                             [a-z]*
                             [|]''',
                         re.VERBOSE)

Or, alternatively, take advantage of the fact that Python concatenates
adjacent string literals, and write it like this:

    pattern = re.compile(r'[|]'
                         r'ID='
                         r'[a-z]*'
                         r'[0-9]*'
                         r'[a-z]*'
                         r'[0-9]*'
                         r'[a-z]*'
                         r'[|]'
                         )
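
If you want to convince yourself that all of these spellings are the
same regex, here is a quick sketch.  The sample string and the variable
names are my own invention (modelled on the ID quoted earlier in the
thread, with a trailing pipe added), so treat it as an illustration
rather than the original poster's actual data:

```python
import re

# Hypothetical input, modelled on the ID quoted earlier in the thread.
sample = "start=|ID=ter54rt543d|end"

# 1. Backslash-escaped pipe in a raw string.
escaped = re.compile(r'\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\|')

# 2. Pipe wrapped in a one-character class, so no backslashes at all.
char_class = re.compile(r'[|]ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*[|]')

# 3. VERBOSE: whitespace outside character classes is ignored, and
#    "#" starts a comment that runs to the end of the line.
verbose = re.compile(r'''[|]      # opening pipe
                         ID=
                         [a-z]*   # letters
                         [0-9]*   # digits
                         [a-z]*
                         [0-9]*
                         [a-z]*
                         [|]      # closing pipe
                         ''', re.VERBOSE)

# 4. Adjacent string literals, which Python joins at compile time.
concatenated = re.compile(r'[|]'
                          r'ID='
                          r'[a-z]*'
                          r'[0-9]*'
                          r'[a-z]*'
                          r'[0-9]*'
                          r'[a-z]*'
                          r'[|]')

patterns = (escaped, char_class, verbose, concatenated)
matches = [p.search(sample).group(0) for p in patterns]
print(matches)
```

All four should report the same matched text.  One thing to watch with
VERBOSE: if you ever need to match a literal space or "#", you have to
escape it or put it in a character class, since both are otherwise
swallowed.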