I, for one, am not great at regular expressions, but I agree that a regex may reduce the number of lines.
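For illustration only, here is a rough sketch of what a regex version might look like. It assumes every record in file B begins with a header line starting with ">" and runs until the next such line, which is how the sample in the thread looks. The function name extract_records is made up for this example, and the code uses Python 3 syntax rather than the Python 2 print statements quoted below.

```python
import re


def extract_records(headers, file_b_text):
    """Return {header: record_text} for each header found in file_b_text.

    Assumption: a record is a header line starting with ">" plus every
    following line up to the next ">" line or the end of the text.
    """
    records = {}
    for header in headers:
        header = header.strip()
        if not header:
            continue
        # Match the header as a complete line (so a header that is only a
        # prefix of a longer one is not matched), then lazily consume text
        # until the next line starting with ">" or the end of the string.
        pattern = re.compile(
            r"^" + re.escape(header) + r"[ \t\r]*$(.*?)(?=^>|\Z)",
            re.MULTILINE | re.DOTALL,
        )
        match = pattern.search(file_b_text)
        if match:
            records[header] = match.group(0).rstrip("\r\n")
    return records


# Possible usage with the file names from the thread:
#
#     with open('FileA.txt') as fa, open('FileB.txt') as fb:
#         for record in extract_records(fa, fb.read()).values():
#             print(record)
```

Compared with the str.find approach quoted below, the whole-line match avoids picking up a header that merely starts with the same characters, at the cost of one regex search over file B per header in file A.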
On Sun, Sep 18, 2011 at 12:55 PM, mahendra N <mahendra0...@gmail.com> wrote:
> Fgt the link
>
> http://code.google.com/edu/languages/google-python-class/regular-expressions.html
>
> 2011/9/19 mahendra N <mahendra0...@gmail.com>
>
> > Have you thought of using regular expressions?. It might make ur job
> > easier.
> >
> > Checkout this link good explaination of reg exps.
> >
> > Thanks and Regards,
> > Mahendra Naik
> >
> >
> > 2011/9/18 Gopalakrishnan Subramani <gopalakrishnan.subram...@gmail.com>
> >
> >> Senthil and Gora Mohanty pointed out whats wrong on the code.
> >>
> >> This is alternative option, not its best since I feel always good to
> >> parse the file B based on file spec instead of the following approach.
> >>
> >>
> >> file_a_lines = open('FileA.txt').readlines()
> >> file_b_content = open('FileB.txt').read()
> >>
> >> for line in file_a_lines:
> >>     start_pos = file_b_content.find(line)
> >>
> >>     if start_pos >= 0:
> >>         end_pos = file_b_content.find(">", start_pos + 1)
> >>
> >>         if end_pos > 0:
> >>             print file_b_content[start_pos:end_pos]
> >>         else: # to deal with end of the line
> >>             print file_b_content[start_pos:]
> >>
> >>
> >>
> >> On Sat, Sep 17, 2011 at 7:49 PM, Senthil Kumaran <sent...@uthcode.com> wrote:
> >>
> >> > On Fri, Sep 16, 2011 at 11:26:34PM -0500, Ananya Sharma wrote:
> >> > >
> >> > > *File A-*
> >> > > >PSUB.GBD61H402FPT34:0-372
> >> > >
> >> > > *File B-*
> >> > > >PSUB.GBD61H402FPT34:0-372
> >> > > XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> >> > > XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> >> > > XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> >> > > XXXXXXXXNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
> >> > > NNNNCATTTCCTTGAGTATTAGGCCATTCATGCTGTCAATTTTCTTAACT
> >> > > ATTTGGAAATCCTAGTTGTACAAGATGGCCTTTTTCCCACCTGTATTTGC
> >> > > TTGGTCTGTGTACTGTAGTCTGCCTCTGCAAATGTTGTGGGAGGACTAAA
> >> > > TGTGGCGGGGGTGGGCTGACAG
> >> >
> >> > Here is the simplest scenario of your case. In this what do you want
> >> > to do Ignore XXX...CAG in File-B and print only the >PSUB.?
> >> >
> >> > If that is the case, you could iterate over file-b and look for lines
> >> > starting with > and then put them to a list and then do your
> >> > operations.
> >> >
> >> > In your code:
> >> >
> >> > > f1=open('fileA','r')
> >> > > f2=open('fileB','r')
> >> > > a=""
> >> > > b=""
> >> >
> >> > > for n in f1:
> >> > >     while not b.startswith(n):
> >> > >         b=f2.readline()
> >> >
> >> > This loop will break when f2 has line starting with >PSUB.
> >> >
> >> > >     if len(a)>0:
> >> > >         print a
> >> >
> >> > Won't have any effect.
> >> >
> >> > >     b=""
> >> >
> >> > You are resetting b.
> >> >
> >> > >     while not b.startswith(">"):
> >> > >         a=a+f2.readline()+"__"
> >> > >
> >> > Won't have any effect.
> >> >
> >> > >
> >> > > Any help would be highly appreciated. Thanks.
> >> >
> >> > Do you see why your program is not working when reduced to the
> >> > simplest case?
> >> >
> >> > If you are trying to find entities in B which are in A.
> >> > Just recreate B so that you remove all the non > starting lines and
> >> > then compare.
> >> >
> >> > --
> >> > Senthil
> >> >
> >> > _______________________________________________
> >> > BangPypers mailing list
> >> > BangPypers@python.org
> >> > http://mail.python.org/mailman/listinfo/bangpypers