I have a question: how can I substitute an object as a pattern when
making a pattern?

x = 30
pattern = re.compile(x)
Kumar,
You can use string interpolation to insert x into a string, which can then be compiled into a pattern:
x = 30
pat = re.compile('%s'%x)

I really doubt regular expressions will speed up your current searching algorithm. You probably need to reconsider the data structures you are using to represent your data.
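As a small sketch of the interpolation above: re.compile() wants a string, so the number has to be converted first. The re.escape() call and the word-boundary variant are my additions rather than anything from your code; they only matter if the value could contain regex metacharacters or if you need to avoid matching inside longer numbers.

```python
import re

x = 30

# re.compile() needs a string pattern, so convert the number first.
# re.escape() changes nothing for plain digits, but protects against values
# that contain characters with a special meaning in a regex.
pat = re.compile(re.escape(str(x)))

# A bare '30' also matches inside a longer number such as '1300'.
loose_hit = pat.search('1300') is not None

# Stricter variant (an assumption about what you want): \b word boundaries,
# so that only a standalone 30 matches.
strict = re.compile(r'\b%s\b' % re.escape(str(x)))
strict_hit_1300 = strict.search('1300') is not None
strict_hit_30 = strict.search('x = 30;') is not None

print(loose_hit, strict_hit_1300, strict_hit_30)
```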
I have a list of numbers that I have to match in another list and write them to a new file:
List 1: range_cors
>>> range_cors[1:5]
['161:378', '334:3', '334:4', '65:436']

List 2: seq
>>> seq[0:2]
['>probe:HG-U133A_2:1007_s_at:416:177; Interrogation_Position=3330; Antisense;', 'CACCCAGCTGGTCCTGTGGATGGGA']
Can you re-process your second list? One option might be to store that list instead as a dict, where the keys are what you want to search by (maybe a string like '12:34' or a tuple like (12,34)).
Maybe something like the following:
>>> range_cors = ['12:34','34:56']
>>> seq = {'12:34': ['some 12:34 data'],
... '34:56': ['some 34:56 data','more 34:56 data']}
>>> for item in range_cors:
... print seq[item]
...
['some 12:34 data']
['some 34:56 data', 'more 34:56 data']

Why is this better?
If you have m lines of data and n patterns to search for, then using either of your methods you perform n searches per line, totalling approx. m*n operations. You have to complete approx. m*n operations whether you use the string searching version, or re searching version.
If you pre-process the data so that it can be stored in and retrieved from a dict, pre-processing to get your data into that dict costs you roughly m operations, but your n pattern lookups into that dict cost you only n operations, so you only have to complete approx. m+n operations.
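To make the m+n idea concrete, here is a rough sketch against the one sample record you posted. It assumes that seq alternates header line and sequence line, and that the 'x:y' coordinates are the fourth and fifth colon-separated fields of the header (before the first ';'). Both are guesses from a single record. The function names and the '416:177' entry in range_cors are made up for the illustration.

```python
def coordinate_key(header):
    """Return the 'x:y' part of a probe header (layout is an assumption)."""
    # Assumed layout: '>probe:ARRAY:PROBESET:X:Y; Interrogation_Position=...;'
    fields = header.split(';')[0].split(':')
    return '%s:%s' % (fields[3], fields[4])


def build_index(seq):
    """One pass over seq (about m operations): map 'x:y' -> (header, sequence)."""
    index = {}
    # Assumes headers sit at even positions and sequences at odd positions.
    for header, sequence in zip(seq[0::2], seq[1::2]):
        index[coordinate_key(header)] = (header, sequence)
    return index


def select(range_cors, index):
    """One dict lookup per coordinate (about n operations)."""
    sequences = []
    for key in range_cors:
        if key in index:
            sequences.extend(index[key])
    return sequences


seq = [
    '>probe:HG-U133A_2:1007_s_at:416:177; Interrogation_Position=3330; Antisense;',
    'CACCCAGCTGGTCCTGTGGATGGGA',
]
# '416:177' is invented so that the sample has a hit; '161:378' is from your list.
range_cors = ['416:177', '161:378']

index = build_index(seq)
sequences = select(range_cors, index)
for line in sequences:
    print(line)
```

Note that this matches the coordinates exactly, whereas "elem1 in elem2" matches them anywhere in the header line. If several probes can share the same coordinates, the dict values would need to be lists instead of single pairs.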
A slow method:

sequences = []
for elem1 in range_cors:
    for index,elem2 in enumerate(seq):
        if elem1 in elem2:
            sequences.append(elem2)
            sequences.append(seq[index+1])
A faster method (probably):

for i in range(len(range_cors)):
    for index,m in enumerate(seq):
        pat = re.compile(i)
        if re.search(pat,seq[m]):
            p.append(seq[m])
            p.append(seq[index+1])
I am getting errors, because I am trying to create an
element as a pattern in re.compile().
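pat = re.compile('%s'%i) would probably get rid of the error message, but that's probably still not what you want: i is an integer index from range(), so the pattern would be the index itself rather than the entry range_cors[i]. Likewise, m is already an element of seq, so seq[m] will fail.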
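If you do want to see the regular expression version run, the following is a hedged sketch of how the loop might be repaired. The function name regex_select is mine, and re.escape() is added on the assumption that the coordinates should be matched literally. It still does roughly m*n searches, so I would not expect it to be faster than the plain "in" test.

```python
import re


def regex_select(range_cors, seq):
    """Collect each matching line together with the line that follows it."""
    p = []
    for cor in range_cors:                # iterate over the strings, not indices
        pat = re.compile(re.escape(cor))  # compile once per pattern
        # Stop one short of the end so that seq[index + 1] always exists.
        for index, line in enumerate(seq[:-1]):
            if pat.search(line):
                p.append(line)
                p.append(seq[index + 1])
    return p


seq = [
    '>probe:HG-U133A_2:1007_s_at:416:177; Interrogation_Position=3330; Antisense;',
    'CACCCAGCTGGTCCTGTGGATGGGA',
]
# '416:177' is a made-up entry so that the sample record produces a hit.
p = regex_select(['416:177'], seq)
for line in p:
    print(line)
```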
Questions:
1. Is it possible to do this? If so, how can I do this?
You can try, but I doubt regular expressions will help; that approach will probably be even slower.
Can anyone help correct my piece of code and
suggest where I went wrong?
I would scrap what you have and try using a better data structure. I don't know enough about your data to make more specific processing recommendations; but you can probably avoid those nested loops with some careful data pre-processing.
You'll likely get better suggestions if you post a more representative sample of your data, and explain exactly what you want as output.
Good luck.
Rich
