Andrej Kastrin wrote:
Hello dears,
I have a file in row data format, which stores different terms (e.g.
genes) and looks like this:
------------
ABH
HD
HDD
etc.
------------
Then I have a second file, which looks like:
--------------------------------------------------------------
ID- 001 #ID number
TI- analysis of HD patients. #title of article
AB- The present article deals with HD patients. #abstract
ID- 002 #ID number
TI- In reply to analysis of HD patients. #title of article
AB- The present article deals with HDD patients. #abstract
--------------------------------------------------------------
etc., where the separator between records is a blank line.
Now I have to extract the ID, TI and AB fields of those records in the
second file that contain any term from the first file.
A colleague from the BioPerl mailing list helped me with the following code:
#!/usr/bin/perl
use strict;
use warnings;
my $file_terms = shift;
my $file_medline = shift;
open (TERM, $file_term) or die "Can't open TERM"; #open list of terms
open (MEDL, $file_medline) or die "Can't open MEDL"; #open records file
my @terms = <TERM>;
while (my ($pmid, $ti, $ab) = split <MEDL>) {
for my $term (@terms) {
if (/$term/ for ($pmid, $ti, $ab)) {
print "$pmid\t$ti\t$ab";
}
}
}
I'm a little confused now, because the above example doesn't work and I
don't know why (compilation errors on the 15th and 19th lines).
I'm still learning...
So are the folks at BioPerl.
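As far as I can tell from the quoted code (this is my reading of it, not something taken from the actual error messages), two things stop it from compiling: `$file_term` is never declared (the variable is `$file_terms`), which `use strict` rejects, and `if (/$term/ for (...))` puts a statement-modifier `for` inside a condition, which Perl does not allow. Also, the terms are never chomped, so each one would still carry its trailing newline. Here is a minimal sketch of the test that `if` was presumably aiming for, using `grep`; the sub name `record_matches` and the sample values are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if at least one of the fields contains
# the term as a whole word. \Q...\E keeps any regex metacharacters
# in the term from being interpreted.
sub record_matches {
    my ($term, @fields) = @_;
    return scalar grep { /\b\Q$term\E\b/ } @fields;
}

# Made-up sample values standing in for one record.
my ($pmid, $ti, $ab) = (
    '001',
    'analysis of HD patients.',
    'The present article deals with HD patients.',
);

if ( record_matches('HD', $pmid, $ti, $ab) ) {
    print "$pmid\t$ti\t$ab\n";
}
```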
Question: Do you want to extract just the fields, or the full record, if a
field contains a term from file 1? The following will print the entire record.
#!/usr/bin/perl
use strict;
use warnings;
my $file_terms = shift;
my $file_medline = shift;
open (TERM, $file_terms) or die "Can't open $file_terms: $!"; # open list of terms
open (MEDL, $file_medline) or die "Can't open $file_medline: $!"; # open records file
chomp( my @terms = <TERM> );
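{
    # read one blank-line-separated record at a time
    local $/ = "\n\n";
    while( my $record = <MEDL> ){
        # \Q...\E stops regex metacharacters in a term from being interpreted
        print $record if grep { $record =~ /\b\Q$_\E\b/ } @terms;
    }
}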
__END__
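If only the fields are wanted rather than the whole record, something along these lines might do. It is a sketch under assumptions: each field sits on a single line starting with `ID-`, `TI-` or `AB-`, the trailing `#...` notes in the sample are annotations and not part of the real file, and `extract_fields` is a name invented here. It works on a string so it can be tried without the files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return one "ID\tTI\tAB" string for every record
# in $text that mentions at least one of the terms as a whole word.
sub extract_fields {
    my ($text, @terms) = @_;
    my @out;
    # Records are separated by one or more blank lines.
    for my $record ( split /\n{2,}/, $text ) {
        next unless grep { $record =~ /\b\Q$_\E\b/ } @terms;
        # Pull each tagged field from its own line; empty string if absent.
        my %field = ( ID => '', TI => '', AB => '' );
        for my $tag ( keys %field ) {
            $field{$tag} = $1 if $record =~ /^$tag-[ \t]*(.*)$/m;
        }
        push @out, join "\t", @field{qw(ID TI AB)};
    }
    return @out;
}

# Sample data modelled on the records in the question.
my $sample = "ID- 001\n"
           . "TI- analysis of HD patients.\n"
           . "AB- The present article deals with HD patients.\n"
           . "\n"
           . "ID- 002\n"
           . "TI- In reply to analysis of HD patients.\n"
           . "AB- The present article deals with HDD patients.\n";

print "$_\n" for extract_fields($sample, qw(ABH HD HDD));
```

To run it against the real file, the records could be slurped into one string first, e.g. `my $text = do { local $/; <MEDL> };`, with the chomped `@terms` passed as the term list.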
--
Just my 0.00000002 million dollars worth,
--- Shawn
"Probability is now one. Any problems that are left are your own."
SS Heart of Gold, _The Hitchhiker's Guide to the Galaxy_