Hi,
I need to remove the first 52 bp from every sequence read in a fastq
file. The sequence is on line 2 of each record.
The fastq format, from Wikipedia: A FASTQ file normally uses four lines per
sequence. Line 1 begins with a '@' character and is followed by a
sequence identifier and an /optional/ description. Line 2 is the raw
sequence letters. Line 3 begins with a '+' character and is /optionally/
followed by the same sequence identifier (and any description) again.
Line 4 encodes the quality values for the sequence in Line 2, and must
contain the same number of symbols as letters in the sequence.
A minimal FASTQ file might look like this:
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
I have written this script to remove the first 52 bp from each sequence
and write the trimmed line to newfile.txt. It seems to do the job,
but what I need is to change my original fastq file so that it has the
trimmed sequence lines and keeps the other lines the same. I am not sure
where to start to modify the original fastq.
This is my script to trim my sequences:
#!/software/bin/perl
use warnings;
use strict;
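open(my $in, '<', "/file.fastq") or die "can't open in: $!";
open(my $out, '>>', "newfile.txt") or die "can't open out: $!";
while (<$in>) {
    # skip every line that does not start with an uppercase letter
    # (intended to keep only the sequence lines)
    next unless /^[A-Z]/;
    # drop the first 52 characters of the line
    my $new_line = substr($_, 52);
    print $out $new_line;
}
close($in);
close($out) or die "can't close out: $!";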
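One possible way to keep every record intact is to read the file four lines at a time instead of filtering on the first character. The sketch below assumes strict four-line records (no sequences wrapped over several lines). `trim_fastq` is a hypothetical helper name, not part of the script above. It also trims the quality line by the same amount, because line 4 must contain the same number of symbols as the sequence. The demonstration runs on the sample record quoted earlier, using in-memory filehandles in place of real files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): copies FASTQ records
# from $in to $out, dropping the first $n characters of the sequence line
# and of the matching quality line. Assumes exactly four lines per record.
sub trim_fastq {
    my ($in, $out, $n) = @_;
    while (defined(my $header = <$in>)) {
        my $seq  = <$in>;
        my $plus = <$in>;
        my $qual = <$in>;
        die "truncated FASTQ record\n" unless defined $qual;
        chomp($seq, $qual);
        # Reads with $n or fewer characters become empty lines.
        $seq  = length($seq)  > $n ? substr($seq,  $n) : '';
        $qual = length($qual) > $n ? substr($qual, $n) : '';
        # Header and '+' lines are written back unchanged.
        print {$out} $header, "$seq\n", $plus, "$qual\n";
    }
}

# Demonstration on the sample record, read from and written to strings.
my $fastq = join "\n",
    '@SEQ_ID',
    'GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT',
    '+',
    q{!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65},
    '';
my $trimmed = '';
open(my $in,  '<', \$fastq)   or die "can't open in: $!";
open(my $out, '>', \$trimmed) or die "can't open out: $!";
trim_fastq($in, $out, 52);
close($in);
close($out);
print $trimmed;
```

With real files, the same subroutine could be given a handle on the original fastq and a handle on a new output file. Renaming the new file over the original only after checking it is one cautious way to end up with a "modified" original.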
Thanks for any suggestions.
Nat