Hi,

I need to remove the first 52 bp from every read in a FASTQ file; the sequence is on line 2 of each record. The FASTQ format, from Wikipedia:

A FASTQ file normally uses four lines per sequence. Line 1 begins with a '@' character and is followed by a sequence identifier and an /optional/ description. Line 2 is the raw sequence letters. Line 3 begins with a '+' character and is /optionally/ followed by the same sequence identifier (and any description) again. Line 4 encodes the quality values for the sequence in Line 2, and must contain the same number of symbols as letters in the sequence.

A minimal FASTQ file might look like this:

@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65


I have written this script to remove the first 52 bp of each sequence and write the trimmed line to newfile.txt. It seems to do the job, but what I need is to update my original FASTQ file with the trimmed sequence lines and keep the other lines the same. I am not sure where to start to modify the original FASTQ file.
This is my script to trim the sequences:

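#!/software/bin/perl
use warnings;
use strict;

# Three-argument open with lexical filehandles.
open(my $in, '<', "/file.fastq") or die "can't open in: $!";
open(my $out, '>>', "newfile.txt") or die "can't open out: $!";

while (my $line = <$in>) {
    # Keep only lines that start with an uppercase letter (meant to be the
    # sequence lines; a quality line can also start with A-Z and match here).
    next unless $line =~ /^[A-Z]/;
    # Drop the first 52 characters and print the rest, including the newline.
    my $new_line = substr($line, 52);
    print $out $new_line;
}

close($in);
close($out) or die "can't close out: $!";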
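
One possible way to keep the other lines, as a minimal sketch rather than a definitive answer: read the file one four-line record at a time, trim only lines 2 and 4, and write all four lines to a new file. The helper name trim_fastq and the command-line usage are hypothetical. The sketch assumes every record is exactly four lines (no wrapped sequences), and it trims the quality line as well, because the format requires it to have the same number of symbols as the sequence.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy $in_path to $out_path, removing the first $n
# characters of the sequence and quality lines of every four-line record.
sub trim_fastq {
    my ($in_path, $out_path, $n) = @_;

    open(my $in,  '<', $in_path)  or die "can't open $in_path: $!";
    open(my $out, '>', $out_path) or die "can't open $out_path: $!";

    while (defined(my $header = <$in>)) {
        next if $header =~ /^\s*$/;    # tolerate blank lines between records

        my $seq  = <$in>;              # line 2: raw sequence
        my $plus = <$in>;              # line 3: '+' separator
        my $qual = <$in>;              # line 4: quality string
        die "truncated record near line $. of $in_path\n"
            unless defined $seq && defined $plus && defined $qual;

        chomp($seq, $qual);

        # Reads with $n characters or fewer become empty lines.
        $seq  = length($seq)  > $n ? substr($seq,  $n) : '';
        $qual = length($qual) > $n ? substr($qual, $n) : '';

        # Header and '+' lines are written back unchanged.
        print {$out} $header, $seq, "\n", $plus, $qual, "\n";
    }

    close($in);
    close($out) or die "can't close $out_path: $!";
    return;
}

# Assumed usage: perl trim.pl input.fastq output.fastq
trim_fastq($ARGV[0], $ARGV[1], 52) if @ARGV == 2;
```

Writing to a separate file first leaves the original untouched if something goes wrong. Once the output has been checked, the new file could replace the original, for example with Perl's built-in rename.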


Thanks for any suggestions,
Nat

