Hi All,

Here is the problem. I have a FASTA file (used for DNA analyses) that looks like this:
...
>gnl|SRA|SRR019045.10.1 SL-XAY_956090708:2:1:0:1028.1 length=152
NCTTTTTTTATTTTTTGTATAAATGAAGTTTCACTATATCGGACGAGCGGTTCAGCAGTCATTCCGAGAC
CGATATAGTGAAACTTCATTTCTACAAAAANTACCAAACGTCGCTCGGCAGAGCGTCGTGTTGGGCAAGA
GAGTAGCACTCG
>gnl|SRA|SRR019045.11.1 SL-XAY_956090708:2:1:0:1151.1 length=152
NGGTNTGGNNNNCNCCNTNCTNCNNCNTCANCCTCCNGTCNCANNCCNCNTNNNNNCNNNNNCNNTNCTT
CTNCNNTCTCCATTCCTTCTTNATAGCCTGCTCCANCGCACGTTGAACCTTCTGCACCACGAACGCACTC
ACACCACTCATC
>gnl|SRA|SRR019045.12.1 SL-XAY_956090708:2:1:0:1197.1 length=152
NGTCGGGTCTTCGCTATCACTGGACTGCTCCCATCAGCTATAGGTCCTCCCCGCCACACCCCATGCCCAC
CGCCTATCCACGTCTGTCACAACCTCATACATCAGACAGTCACACTTACCAACATATCCAAGCACCTCAA
GCAACACATCAT
...

This snippet represents 3 individual DNA sequences. Each sequence is identified by the line starting with >, and its nucleotides are wrapped over the lines that follow. The complete file has about 10 million individual sequences.

The task sounds simple enough: I want to read in this data, cut out the last 76 letters (nucleotides) from each individual sequence, and send them to a new txt file with a similar format.

Any help on how to do this would be appreciated.

Thanks!
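
For reference, here is a minimal sketch of one way the task could be approached in plain Python. It rests on a few assumptions that are not stated above: "cut out" is taken to mean "keep only the last 76 nucleotides and write those", the header line is copied unchanged (so its length=152 field would no longer match the trimmed sequence), and the output is re-wrapped at 70 characters per line like the input. The function names read_fasta and write_tails and the file names in the usage comment are hypothetical. The file is read one record at a time, so the 10 million sequences never need to sit in memory together.

```python
def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file, one record at a time."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            # Sequence data may be wrapped over several lines; collect the pieces.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def write_tails(src, dst, n=76, width=70):
    """Write the last n letters of every sequence in src to dst in FASTA-like form."""
    for header, seq in read_fasta(src):
        tail = seq[-n:]  # the whole sequence if it is shorter than n
        dst.write(header + "\n")  # assumption: header is copied as-is
        for i in range(0, len(tail), width):
            dst.write(tail[i:i + width] + "\n")


# Possible usage (hypothetical file names):
#
#     with open("reads.fasta") as src, open("tails.txt", "w") as dst:
#         write_tails(src, dst)
```

If the opposite reading is intended (drop the last 76 letters and keep the rest), the slice would become seq[:-n] instead of seq[-n:].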