I will make my question a little clearer. I have close to 60,000
lines of data similar to the sample I posted. Each sequence is
followed by a number (basically the number of times the sequence
was found in a particular sample). So, I would need
to ignore the ones contain
Hi folks,
I am a newbie to Python, and I would be grateful if someone could
point out the mistake in my program. Basically, I have a huge text
file in a format similar to the one below:
AGACTCGAGTGCGCGGA 0
AGATAAGCTAATTAAGCTACTGG 0
AGATAAGCTAATTAAGCTACTGGGTT 1
AGCTCACAA