Hi,
See my replies inline in the code below.
On Tue, 18 May 2004 13:16:37 -0400, Michael Robeson <[EMAIL PROTECTED]> wrote:
Ok great. Most of what you show does make sense. However, there are some bits of code that I need further clarification on. For some bits I can tell what they are doing, but I do not quite know how or why they work the way they do. I'll point out these areas in the code we've put together so far.
Hopefully, I have copied over the bits you wrote correctly. I find this is like learning Spanish: I can read the code and (roughly) get the gist of it, but I have trouble when it comes to writing original code on my own. I am sure this will go away as I practice more. :-)
I didn't finish everything because I just need some code explained / clarified.
>>>Start PERL code<<<<<
#!/usr/bin/perl -w
use strict;
use FileHandle;
# I am unsure of what this module is. I've tried looking it up
# in the Camel and Llama book to no avail, not enough description.
# I guess I have to figure out the whole object thing?
# Type 'perldoc FileHandle' on the command line to see its documentation
# (you can do this with (hopefully) all new modules you come across).
my %organisms;
print "Enter in a list of files to be processed:\n";
# For example:
# Cytb.fasta
# NADH1.fasta
# ...
# chomp (my @infiles = <STDIN>);
# TODO we should make this nicer later
my @infiles = ('genetics.txt');
foreach my $infile (@infiles) {
    my $FASTA = new FileHandle;
    # Does the above statement tell PERL to create a new
    # filehandle for each file it finds? I guess I need to understand
    # what "new" and the module "FileHandle" are doing.
Right on.
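As a rough sketch of what "new" and FileHandle are doing: each call creates a fresh handle object that can be passed to open like any other filehandle. The helper name first_line is made up for this illustration and is not part of the script in this thread.

```perl
use strict;
use warnings;
use FileHandle;

# Hypothetical helper (not from the original script): returns the first
# line of the file at $path, without its newline, or undef if it is empty.
sub first_line {
    my ($path) = @_;

    # FileHandle->new is the same call as "new FileHandle";
    # every call returns a brand-new, unopened handle object.
    my $fh = FileHandle->new;

    open($fh, '<', $path) or die "Can't open $path: $!";
    my $line = <$fh>;
    close($fh) or die "Could not close $path: $!";

    chomp $line if defined $line;
    return $line;
}
```

Because the handle lives in a "my" variable inside the loop body, each input file gets its own handle rather than sharing one global name.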
    open ($FASTA, $infile) or die "Can't open INFILE:$!";
    #$/='>' #Set input operator
    my $orgID;
    while (defined($_ = <$FASTA>)) {
        # Above I am unsure of why the "defined" function
        # helps us here? I know it has something to do with an
        # expression containing a valid string, but I am unsure
        # of its function here. This is something I would have
        # never thought to do. :-)
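It's what while (<$FASTA>) actually does behind the scenes.
The defined function checks whether $_ was set, that is, whether a line was actually read. Without it, a line that counts as false (such as a final "0" with no trailing newline) would end the loop too early.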
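A small sketch of the difference, assuming input whose last line is a bare "0" with no newline. The helper count_lines and the sample data are invented for this illustration; it reads from an in-memory string instead of a real file.

```perl
use strict;
use warnings;

# Hypothetical helper: reads $text line by line and counts the lines for
# which the supplied test returns true; stops at the first failing test.
sub count_lines {
    my ($text, $keep_going) = @_;
    open(my $fh, '<', \$text) or die "Can't open in-memory file: $!";
    my $count = 0;
    while (1) {
        my $line = <$fh>;
        last unless $keep_going->($line);
        $count++;
    }
    close($fh);
    return $count;
}

# Made-up sample input: the final line is "0" without a trailing newline.
my $data = "cat\ndog\n0";

# Testing with defined() sees every line, including the last one.
my $by_defined = count_lines($data, sub { defined $_[0] });

# Testing plain truth stops early, because the string "0" is false.
my $by_truth = count_lines($data, sub { $_[0] });

print "defined: $by_defined lines, truth: $by_truth lines\n";
```

Under these assumptions the defined test counts three lines and the plain truth test only two.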
        chomp;
        print "\nworking on >>$_<<\n";
        if (/^\s*>(\w+)/) {
            $orgID=$1;
            print "Found a new organism start line ('$orgID')\n";
            # The above regex makes complete sense. Actually, I was going to put
            # something similar to that in my original post but wasn't sure
            # if this was appropriate at the time. I guess it was!
        } else {
            print "This is just some data: $_\n";
            print "This data needs to be appended to the hash entry for $orgID\n";
            # okay, in the above you are taking the left over
            # sequence ($_) and linking it as a "value" to "$orgID" ?
The if/else statement below should do what you want, but I would do it like this instead:
$organisms{$orgID} .= $_;
No if and no else, just that single line. Perl will make it work the way it's supposed to: if the hash key doesn't exist, it gets created and the contents of $_ are stored in it (as a string); if it already exists, $_ is appended to the existing value.
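A minimal sketch of that single-line approach on its own, assuming the lines have already been chomped. The helper collect_sequences and the way the sample sequence is split across lines are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: groups sequence lines under the most recent
# ">name" header line and returns the resulting hash.
sub collect_sequences {
    my (@lines) = @_;
    my %organisms;
    my $orgID;
    foreach my $line (@lines) {
        if ($line =~ /^\s*>(\w+)/) {
            $orgID = $1;                   # start of a new organism
        } elsif (defined $orgID) {
            # No exists() check needed: .= creates the key on first use
            # and appends on every later use.
            $organisms{$orgID} .= $line;
        }
    }
    return %organisms;
}

# Made-up input: the "cat" sequence is split over two lines.
my %seen = collect_sequences('>cat', 'actgac---cgatc', '-ag-cttag---acg',
                             '>dog', 'actatc---actat-at-accta---atc');
print ">$_\n$seen{$_}\n" for sort keys %seen;
```

The .= operator is one of the operators Perl exempts from "uninitialized value" warnings, so the first append to a new key stays quiet even with warnings enabled.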
            if (exists ($organisms{$orgID})) {
                #TODO append the data to the hash here
                # I guess I would put the following to append to
                # the already existing hash:
                $organisms{$orgID} .= $_;
            } else {
                #create new hash entry for this data
                $organisms{$orgID} = $_;
            }
        }
    }
    # Do not forget to close the input file
    close ($FASTA) or die "Could not close INFILE: $!";
}
# We've processed all input files... print the resulting hash
print "\n*****************************************************\n";
while (my($orgID, $sequence) = each(%organisms)) {
    # since I want the output as:
    # >cat
    # actgac---cgatc-ag-cttag---acg
    # >dog
    # actatc---actat-at-accta---atc
    # I would change the print statement to:
    # print "> . $orgID\n $sequence\n";
Hmm, you're trying to do string concatenation here but in that case it should be:
print ">" . $orgID . "\n" . $sequence . "\n";
but it's much easier to just do it like:
print ">$orgID\n$sequence\n";
}
exit;
>>>end PERL code<<<
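For the print statement in the output loop above, here is a short sketch comparing the three spellings. The function names are invented for this illustration; each one returns the string that the corresponding print would emit.

```perl
use strict;
use warnings;

# Hypothetical helpers, one per spelling of the print statement.

# Explicit concatenation with the "." operator.
sub record_concat {
    my ($orgID, $sequence) = @_;
    return ">" . $orgID . "\n" . $sequence . "\n";
}

# Variable interpolation inside a double-quoted string.
sub record_interp {
    my ($orgID, $sequence) = @_;
    return ">$orgID\n$sequence\n";
}

# The attempted version: inside the quotes the " . " is just literal
# text, and the space after \n ends up in front of the sequence.
sub record_attempt {
    my ($orgID, $sequence) = @_;
    return "> . $orgID\n $sequence\n";
}

print record_interp('cat', 'actgac---cgatc-ag-cttag---acg');
```

The first two produce identical output; the third keeps the dot and the extra spaces as literal characters, which is why the header line would come out as "> . cat".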
Thanks for all your help so far! Most of this is starting to help my thinking. I will be doing a lot more of this multi-file parsing, as most of my work entails manipulating data in several files or folders at once.
-Mike
/Johan