Andrej Kastrin wrote:
Hi
I want to count words in the following file:
------------------------------
ID- some number
TI- some text BB
AB- some text
AU- some text

ID- some number
TI- some GGG text
AB- some text
AU- some text

ID- some number
TI- some text
AB- some text Z
AU- some text
------------------------------
So, the separators between records are blank lines ("\n\n"). I wrote a
script that counts the words defined in the @list array. So, I
need the frequency of (A, BB, GGG and Z) in the lines that begin with
TI (or AB or AU, but this is not a problem at the moment). And here is
the catch: TI is not at the beginning of each record!
-----------------------------------------------------------------
@list = qw(A BB GGG Z); #define term list
foreach $member (@list) {
    $words{$member} = 0; #create hash from array
}
{ # begin block
    $/ = "\n\n"; #set input separator to 1 blank line
    while (<>) {
        chomp;
        if (/^TI.+/) {
            foreach $w (split) {
                $wds++ if defined($words{$w})
            }
        }
    }
} #end block
print "\n$wds words"; #print frequency of words, defined in @list
----------------------------------------------------------------
How should I proceed? Thanks in advance, Andrej
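A note on why the first script never matches: with $/ set to "\n\n", each read returns a whole record, and ^ without the /m modifier matches only at the very start of that record, where the ID line sits. The sketch below is one possible way around that, not a definitive fix. It assumes records are separated by blank lines and that tags look like "TI-" as in the sample; count_ti_terms is a hypothetical helper name, and it takes the text as a string instead of reading <> so that it runs as-is.

```perl
use strict;
use warnings;

# Count how many words on TI lines appear in the term list.
# $text holds the whole input; records are assumed to be
# separated by one or more blank lines.
sub count_ti_terms {
    my ($text, @terms) = @_;
    my %is_term = map { $_ => 1 } @terms;
    my $count = 0;
    for my $record (split /\n{2,}/, $text) {
        # /m lets ^ and $ match at every line inside the record,
        # not only at the start and end of the whole record.
        while ($record =~ /^TI-[ \t]*(.*)$/mg) {
            my $title = $1;
            # grep in scalar context returns the number of matching words
            $count += grep { $is_term{$_} } split ' ', $title;
        }
    }
    return $count;
}

my $sample = "ID- 1\nTI- some text BB\nAB- some text\n\n"
           . "ID- 2\nTI- some GGG text\nAB- some text Z\n";
print count_ti_terms($sample, qw(A BB GGG Z)), " words\n";   # prints "2 words"
```

The Z on the AB line is not counted because only TI lines are inspected.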
Me again. I modified it in the following way: look just below the while
statement. Is that OK? (It's working, btw.)
-----------------------------------------------------------------
@list = qw(A BB GGG Z); #define term list
foreach $member (@list) {
    $words{$member} = 0; #create hash from array
}
while (<>) {
    $/ = "\n\n"; #set input separator to read record
    $/ = "\n"; #set input separator to parse within a record
    chomp;
    if (/^TI.+/) {
        foreach $w (split) {
            $wds++ if defined($words{$w})
        }
    }
}
print "\n$wds words"; #print frequency of words, defined in @list
----------------------------------------------------------------
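In the modified script the two assignments to $/ cancel out: the second one restores the default "\n", so the loop simply reads line by line, and that is why /^TI.+/ now matches. If line-by-line reading is acceptable, a per-term tally could look roughly like the sketch below. It is only an illustration under assumptions: term_frequencies is a hypothetical name, tags are assumed to be followed by "-" as in the sample, and the lines are passed in as an array so the example is self-contained.

```perl
use strict;
use warnings;

# Tally each listed term on lines that start with one of the given tags.
# Arguments: array ref of lines, array ref of tags, then the term list.
sub term_frequencies {
    my ($lines, $tags, @terms) = @_;
    my %freq = map { $_ => 0 } @terms;
    my $tag_re = join '|', map { quotemeta($_) } @$tags;
    for my $line (@$lines) {
        # skip lines whose tag is not in the requested set
        next unless $line =~ /^(?:$tag_re)-\s*(.*)/;
        my $text = $1;
        for my $word (split ' ', $text) {
            $freq{$word}++ if exists $freq{$word};
        }
    }
    return %freq;
}

my @lines = ("ID- 1", "TI- some text BB", "AB- some text Z", "TI- BB and GGG");
my %freq  = term_frequencies(\@lines, ['TI'], qw(A BB GGG Z));
print "$_: $freq{$_}\n" for sort keys %freq;
# prints:
# A: 0
# BB: 2
# GGG: 1
# Z: 0
```

Passing ['TI', 'AB', 'AU'] as the tag list would extend the count to the other fields.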