Re: [BangPypers] Parsing data

Gopalakrishnan Subramani Sun, 18 Sep 2011 16:30:18 -0700

At least I am not great at regular expressions. I agree that regex may
reduce the number of lines.
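
For what it's worth, here is a rough sketch of what a regex version might look like. It assumes file B is what it appears to be in the sample: a header line starting with ">" followed by sequence lines, up to the next ">" header. The names RECORD_RE and find_records are made up for this example, and the file names are the ones from my earlier mail.

```python
import re

# Assumed record layout: a ">" header line, then zero or more lines that
# do not start with ">". RECORD_RE is a hypothetical name for this sketch.
RECORD_RE = re.compile(r"^(>.*)\n?((?:[^>\n].*\n?)*)", re.MULTILINE)


def find_records(wanted_headers, file_b_text):
    """Return {header: sequence} for headers of A that occur in B."""
    wanted = {h.strip() for h in wanted_headers if h.strip()}
    found = {}
    for match in RECORD_RE.finditer(file_b_text):
        header = match.group(1).strip()
        if header in wanted:
            # Join the sequence lines into one string, dropping whitespace.
            found[header] = "".join(match.group(2).split())
    return found


if __name__ == "__main__":
    sample_b = ">id1:0-8\nXXNN\nCATT\n>id2:0-2\nGG\n"
    for header, sequence in find_records([">id1:0-8\n"], sample_b).items():
        print(header)
        print(sequence)
```

With real files, the two arguments would be something like open('FileA.txt').readlines() and open('FileB.txt').read(). Whether this is actually shorter or clearer than a plain loop is debatable.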



On Sun, Sep 18, 2011 at 12:55 PM, mahendra N <mahendra0...@gmail.com> wrote:

> Forgot the link:
>
> http://code.google.com/edu/languages/google-python-class/regular-expressions.html
>
> 2011/9/19 mahendra N <mahendra0...@gmail.com>
>
> > Have you thought of using regular expressions? It might make your job
> > easier.
> >
> > Check out this link for a good explanation of regular expressions.
> >
> > Thanks and Regards,
> > Mahendra Naik
> >
> >
> > 2011/9/18 Gopalakrishnan Subramani <gopalakrishnan.subram...@gmail.com>
> >
> >> Senthil and Gora Mohanty pointed out what's wrong with the code.
> >>
> >> Here is an alternative. It is not the best option, since I feel it is
> >> always better to parse file B based on its file spec than to use the
> >> following approach.
> >>
> >>
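> >> # Read the headers from file A and all of file B into memory.
> >> file_a_lines = open('FileA.txt').readlines()
> >> file_b_content = open('FileB.txt').read()
> >>
> >> for line in file_a_lines:
> >>     if not line.strip():  # a blank line would match at any newline
> >>         continue
> >>     start_pos = file_b_content.find(line)
> >>
> >>     if start_pos >= 0:
> >>         # The record runs up to the next ">" header, if there is one.
> >>         end_pos = file_b_content.find(">", start_pos + 1)
> >>
> >>         if end_pos > 0:
> >>             print(file_b_content[start_pos:end_pos])
> >>         else:  # last record: no following ">", so print to the end
> >>             print(file_b_content[start_pos:])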
> >>
> >>
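
Inline note: a minimal sketch of the spec-based parsing mentioned above, for comparison with the find() approach. It assumes the file B "spec" is simply what the sample shows (a ">" header line followed by its sequence lines); parse_records and lookup are hypothetical names, not part of any library.

```python
def parse_records(lines):
    """Map each '>' header line to the sequence text that follows it."""
    records = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line
            records[header] = []
        elif header is not None:
            records[header].append(line)
    # Join the per-record line lists into single strings.
    return {h: "".join(parts) for h, parts in records.items()}


def lookup(file_a_lines, file_b_lines):
    """Yield (header, sequence) for each header of A that is found in B."""
    records = parse_records(file_b_lines)
    for raw in file_a_lines:
        header = raw.strip()
        if header in records:
            yield header, records[header]


if __name__ == "__main__":
    sample_a = [">id2:0-2\n"]
    sample_b = [">id1:0-8\n", "XXNN\n", "CATT\n", ">id2:0-2\n", "GG\n"]
    for header, sequence in lookup(sample_a, sample_b):
        print(header)
        print(sequence)
```

Because headers are compared as whole lines, a header in A cannot accidentally match in the middle of some other line of B, which a substring search can do.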
> >>
> >> On Sat, Sep 17, 2011 at 7:49 PM, Senthil Kumaran <sent...@uthcode.com>
> >> wrote:
> >>
> >> > On Fri, Sep 16, 2011 at 11:26:34PM -0500, Ananya Sharma wrote:
> >> > >
> >> > > *File A-*
> >> > > >PSUB.GBD61H402FPT34:0-372
> >> > >
> >> > > *File B-*
> >> > > >PSUB.GBD61H402FPT34:0-372
> >> > > XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> >> > > XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> >> > > XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> >> > > XXXXXXXXNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
> >> > > NNNNCATTTCCTTGAGTATTAGGCCATTCATGCTGTCAATTTTCTTAACT
> >> > > ATTTGGAAATCCTAGTTGTACAAGATGGCCTTTTTCCCACCTGTATTTGC
> >> > > TTGGTCTGTGTACTGTAGTCTGCCTCTGCAAATGTTGTGGGAGGACTAAA
> >> > > TGTGGCGGGGGTGGGCTGACAG
> >> >
> >> > Here is the simplest scenario of your case. What do you want to do
> >> > in it: ignore XXX...CAG in File-B and print only the >PSUB. line?
> >> >
> >> > If that is the case, you could iterate over file-b, look for lines
> >> > starting with >, collect them in a list, and then do your
> >> > operations on that list.
> >> >
> >> > In your code:
> >> >
> >> > > f1=open('fileA','r')
> >> > > f2=open('fileB','r')
> >> > > a=""
> >> > > b=""
> >> >
> >> > > for n in f1:
> >> > >     while not b.startswith(n):
> >> > >         b=f2.readline()
> >> >
> >> > This loop will exit when f2 reaches a line starting with >PSUB.
> >> >
> >> > >     if len(a)>0:
> >> > >              print a
> >> >
> >> > Won't have any effect.
> >> >
> >> > >     b=""
> >> >
> >> > You are resetting b.
> >> >
> >> > >     while not b.startswith(">"):
> >> > >        a=a+f2.readline()+"__"
> >> > >
> >> > b is never updated inside this loop, so its condition never becomes
> >> > false.
> >> >
> >> > >
> >> > > Any help would be highly appreciated. Thanks.
> >> >
> >> > Do you see why your program is not working when reduced to the
> >> > simplest case?
> >> >
> >> > If you are trying to find entities in B which are also in A, just
> >> > recreate B with all the lines that do not start with > removed, and
> >> > then compare.
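
Inline note: the suggestion above could be sketched roughly as follows, assuming both files are read as lists of lines. header_lines and common_headers are made-up names for this example.

```python
def header_lines(lines):
    """Keep only the lines that start with '>', stripped of whitespace."""
    return [line.strip() for line in lines if line.startswith(">")]


def common_headers(file_a_lines, file_b_lines):
    """Return the headers of A that also occur in B, in A's order."""
    in_b = set(header_lines(file_b_lines))
    return [h for h in header_lines(file_a_lines) if h in in_b]


if __name__ == "__main__":
    sample_a = [">id1:0-8\n", ">id9:0-4\n"]
    sample_b = [">id1:0-8\n", "XXNN\n", "CATT\n", ">id2:0-2\n", "GG\n"]
    print(common_headers(sample_a, sample_b))
```

This only reports which entries are present in both files. Printing the sequence data as well would still need the lines between headers.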
> >> >
> >> > --
> >> > Senthil
> >> >
> >> >
> >> >
> >> > _______________________________________________
> >> > BangPypers mailing list
> >> > BangPypers@python.org
> >> > http://mail.python.org/mailman/listinfo/bangpypers
> >> >
