fastq file modification help

Nathalie Conte Mon, 06 Jun 2011 03:30:32 -0700

Hi,

I need to remove the first 52 bp from each sequence read in a FASTQ file; the sequence is on line 2 of each record. The FASTQ format, as described on Wikipedia:

A FASTQ file normally uses four lines per sequence. Line 1 begins with a '@' character and is followed by a sequence identifier and an /optional/ description. Line 2 is the raw sequence letters. Line 3 begins with a '+' character and is /optionally/ followed by the same sequence identifier (and any description) again. Line 4 encodes the quality values for the sequence in Line 2, and must contain the same number of symbols as letters in the sequence.


A minimal FASTQ file might look like this:

@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
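
The four-line layout described above can be read one record at a time. The following is only a minimal sketch: read_fastq_record is a hypothetical helper name, and it assumes strict four-line records with no blank lines and no sequences wrapped over several lines.

```perl
use strict;
use warnings;

# Hypothetical helper: read one four-line FASTQ record from a filehandle.
# Returns (header, sequence, plus line, quality), or an empty list at end
# of file. An incomplete trailing record is silently dropped.
sub read_fastq_record {
    my ($fh) = @_;
    my @record;
    for (1 .. 4) {
        my $line = <$fh>;
        return () unless defined $line;
        chomp $line;
        push @record, $line;
    }
    return @record;
}

# Demonstration on the example record, read from an in-memory string.
my $example = join "\n",
    '@SEQ_ID',
    'GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT',
    '+',
    q{!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65},
    '';
open my $example_fh, '<', \$example or die "can't open in-memory file: $!";

while (my ($id, $seq, $plus, $qual) = read_fastq_record($example_fh)) {
    # Line 4 must have as many symbols as line 2 has letters.
    my $status = length($seq) == length($qual) ? 'ok' : 'length mismatch';
    print "$id: ", length($seq), " bases, $status\n";
}
close $example_fh;
```

Reading whole records keeps the sequence and its quality string together, which matters as soon as one of them is shortened.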

I have written this script to remove the first 52 bp of each sequence and write the new line to newfile.txt. It seems to do the job, but what I need is to change my original FASTQ file so that it has the trimmed sequence lines and keeps the other lines the same. I am not sure where to start to modify the original FASTQ file.

This is my script to trim the sequences:

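#!/software/bin/perl
use warnings;
use strict;

# Read the FASTQ file and append the trimmed lines to newfile.txt.
open(my $in, '<', "/file.fastq") or die "can't open in: $!";
open(my $out, '>>', "newfile.txt") or die "can't open out: $!";

while (<$in>) {
    # Keep only lines that start with a capital letter (intended to select
    # the sequence lines); all other lines never reach the output.
    next unless (/^[A-Z]/);
    # Drop the first 52 characters and print the rest of the line.
    my $new_line = substr($_, 52);
    print $out $new_line;
}

close($in);
close($out) or die "can't close out: $!";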
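
One possible way to keep every line and trim only where needed is to count lines instead of matching on /^[A-Z]/, since a quality line can also start with a capital letter. The sketch below is not a definitive solution: trim_fastq is a hypothetical helper name, the input and output paths are taken from the command line, and it assumes strict four-line records. It also trims the quality line (line 4), because that line must stay the same length as the sequence.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy FASTQ lines from $in to $out, removing the
# first $n characters of the sequence line (2) and quality line (4) of
# every four-line record. Header and '+' lines are copied unchanged.
sub trim_fastq {
    my ($in, $out, $n) = @_;
    my $line_no = 0;
    while (my $line = <$in>) {
        chomp $line;
        $line_no++;
        my $pos = $line_no % 4;    # 1 = header, 2 = sequence, 3 = '+', 0 = quality
        if ($pos == 2 || $pos == 0) {
            # Reads of $n characters or fewer become empty lines.
            $line = length($line) > $n ? substr($line, $n) : '';
        }
        print {$out} "$line\n";
    }
    return;
}

# Only run when an input and an output path are given on the command line.
if (@ARGV == 2) {
    my ($in_path, $out_path) = @ARGV;
    open my $in,  '<', $in_path  or die "can't open $in_path: $!";
    open my $out, '>', $out_path or die "can't open $out_path: $!";
    trim_fastq($in, $out, 52);
    close $in;
    close $out or die "can't close $out_path: $!";
}
```

Saved under an assumed name such as trim_fastq.pl, it could be run as `perl trim_fastq.pl file.fastq trimmed.fastq`. Writing to a second file and renaming it over the original only after checking the result is safer than editing the original in place.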


Thanks for any suggestions,
Nat


