Re: regex parsing-Beginner

Gunnar Hjalmarsson Tue, 04 Dec 2007 08:51:16 -0800

minky arora wrote:

A part of my file looks like this:


gene            410..1750
                     /gene="dnaA"
                     /db_xref="EMBL:2632267"
     CDS             410..1750
                     /gene="dnaA"
                     /function="initiation of chromosome replication (DNA
                     synthesis)"
                     /note="alternate gene name: dnaH, dnaJ, dnaK"
                     /codon_start=1
                     /transl_table=11
                     /protein_id="CAB11777.1"
                     /db_xref="GI:2632268"

I need to extract the range for gene as well as CDS and compare them.

So far this is where I've reached:
#!/usr/bin/perl
use warnings;
use strict;
my $line;
my @new;
my $line1;
open FILE,"/users/meenaksharora/bio.txt"or die"cannot open $!\n";
foreach $line(<FILE>){
if($line=~m/gene/)
   [EMAIL PROTECTED];
    }
}

At this point, @new contains one element with the _last_ line containing the substring 'gene'.


Maybe you want to anchor the regex:

    if ( $line =~ m/^gene/ )
--------------------^
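
To see why the anchor matters, here is a minimal sketch using two lines from the sample above. The helper name is_gene_feature is made up for illustration. Note that if the real file indents the gene line the way the CDS line is indented, the pattern would need to allow leading whitespace, e.g. m/^\s*gene/.

```perl
use strict;
use warnings;

# Hypothetical helper: true only when the line starts with the feature key "gene".
sub is_gene_feature {
    my ($line) = @_;
    return $line =~ m/^gene/ ? 1 : 0;
}

my @lines = (
    'gene            410..1750',
    '                     /gene="dnaA"',
);

for my $line (@lines) {
    # The unanchored pattern also matches the /gene="dnaA" qualifier line.
    my $unanchored = $line =~ m/gene/ ? 'yes' : 'no';
    my $anchored   = is_gene_feature($line) ? 'yes' : 'no';
    print "unanchored: $unanchored, anchored: $anchored\n";
}
```

The unanchored regex matches both lines, while the anchored one matches only the feature line that carries the range.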

foreach $line1(@new){
 if($line1=~m/(\d)+\.\./)
  {   print $line1;print "hello";
  }
}


You may also want to replace those foreach loops with:

    while ( $line = <FILE> ) {
        if ( $line =~ /(gene|CDS)\s+(\d+\.\.\d+)/ ) {
            print "$1: $2\n";
        }
    }
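
Since the goal was to compare the gene and CDS ranges, the loop above could be extended roughly as follows. This is only a sketch under a few assumptions: the helper name extract_ranges is made up, the sample data is read from an in-memory filehandle instead of the real file, the pattern allows optional leading whitespace, and only the first range of each kind is compared.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the start and end of every gene and CDS
# feature line, keyed by feature type, in file order.
sub extract_ranges {
    my ($fh) = @_;
    my %ranges;
    while ( my $line = <$fh> ) {
        # Optional leading whitespace, feature key, then "start..end".
        if ( $line =~ /^\s*(gene|CDS)\s+(\d+)\.\.(\d+)/ ) {
            push @{ $ranges{$1} }, [ $2, $3 ];
        }
    }
    return \%ranges;
}

# Sample data from the question; an in-memory filehandle stands in for the file.
my $sample = <<'END';
gene            410..1750
                     /gene="dnaA"
     CDS             410..1750
                     /gene="dnaA"
END

open my $fh, '<', \$sample or die "cannot open in-memory file: $!\n";
my $ranges = extract_ranges($fh);
close $fh;

# Compare the first gene range with the first CDS range.
my ( $gene, $cds ) = ( $ranges->{gene}[0], $ranges->{CDS}[0] );
if ( $gene->[0] == $cds->[0] && $gene->[1] == $cds->[1] ) {
    print "gene and CDS ranges match: $gene->[0]..$gene->[1]\n";
}
else {
    print "ranges differ\n";
}
```

Capturing the start and end separately makes a numeric comparison possible. With a real file holding many features you would pair each gene with its CDS rather than looking only at the first of each.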

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
