On 10/22/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'm trying to learn regular expressions, but I am having trouble with
> this.  I want to search a document that has mixed data; however, the
> last line of every entry has something like C5H4N4O3 or CH5N3.ClH.
> All of the letters are upper case and there will always be numbers and
> possibly one .
>
> However below only gave me none.
>
> import os, codecs, re
>
> text = 'C:\\text_samples\\sample.txt'
> text = codecs.open(text,'r','utf-8')
>
> test = re.compile('\u+\d+\.')
>
> for line in text:
>     print test.search(line)
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>

I need a little more info.  How can you know whether you're matching
the text you're going for, and not other data which looks similar?  Do
you have a specific field length?  Is it guaranteed to contain a
digit?  Is it required to start with a letter?  Does it always start
with 'C'?

You need to have those kinds of rules in mind to write your regex.

Shawn
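
As a rough illustration of how such rules turn into a pattern, here is a
minimal sketch. Every rule in it is an assumption, not something the
original post confirms: the formula fills the whole line, it is made of
element-style symbols (an upper-case letter, optionally one lower-case
letter as in "Cl", then an optional count), it contains at most one ".",
and it contains at least one digit. The names FORMULA and is_formula are
made up for the example. Note also that \u is not a character class for
upper-case letters in Python's re module; [A-Z] is used instead.

```python
import re

# Hypothetical rules (assumptions, not confirmed by the original poster):
#   - the formula occupies the whole line
#   - it is built from symbols: one upper-case letter, optionally one
#     lower-case letter (e.g. "Cl"), then an optional count
#   - at most one "." separates two such groups
#   - at least one digit appears somewhere in the line
FORMULA = re.compile(
    r"""
    ^(?=.*\d)                    # require at least one digit
    (?:[A-Z][a-z]?\d*)+          # first group of symbols and counts
    (?:\.(?:[A-Z][a-z]?\d*)+)?   # optional second group after one "."
    $
    """,
    re.VERBOSE,
)


def is_formula(line):
    """Return True if the line looks like a formula under the rules above."""
    return FORMULA.match(line.strip()) is not None


if __name__ == "__main__":
    for sample in ["C5H4N4O3", "CH5N3.ClH", "Some other text", "ABC"]:
        print(sample, is_formula(sample))
```

If the real data breaks any of these assumptions (for example, formulas
with no digits, or more than one "."), the pattern has to change
accordingly, which is why the rules need to be settled first.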