Ing. Branislav Gerzo wrote:
I did this by hand... but does anyone know how to do this effectively in Perl?
I think I have to build a hash of all possible two-word phrases (the
input text allows only [0-9a-z ]). I would keep the lines of the input
text in a list, then check every key of the hash against that list,
storing the number of occurrences as the value ("foo bar" => 5)... hm?
Note that this program reports "foo bar" twice for line 8, because the pair occurs twice on that line.
"bar bar" - 2 times (lines: 4, 6)
"bar foo" - 3 times (lines: 5, 6, 8)
"foo bar" - 6 times (lines: 3, 4, 5, 7, 8, 8)
"foo bars" - 1 time (lines: 10)
"foo foo" - 1 time (lines: 7)
"foob bar" - 1 time (lines: 9)
To see the structure of %Pairs, uncomment the print Dumper line.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

# $Pairs{first}{second} holds the line numbers where "first second" occurs.
my %Pairs = ();

while( <DATA> ){
    chomp;
    # split ' ' also skips leading whitespace, unlike split /\s+/
    my @words = split ' ';
    # Pair each word with the word that follows it on the same line.
    for( my $word = shift @words;
         @words;
         $word = shift @words
    ){
        push @{ $Pairs{$word}{$words[0]} }, $.;
    }
}

# print Dumper( \%Pairs );

# Report each pair in sorted order with its count and line numbers.
for my $first ( sort keys %Pairs ){
    for my $second ( sort keys %{ $Pairs{$first} } ){
        my @lines = @{ $Pairs{$first}{$second} };
        my $count = scalar( @lines ) . " time";
        $count .= 's' unless scalar( @lines ) == 1;
        print "\"$first $second\" - $count (lines: ", join( ', ', @lines ),
              ")\n";
    }
}
__END__
foo
bar
foo bar
foo bar bar
bar foo bar
bar bar foo
foo foo bar
foo bar foo bar
foob bar
foo bars
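
If only the counts are needed, the flat hash described in the question ("foo bar" => 5) may be enough. The following is a minimal sketch of that idea, not a replacement for the program above; the name count_pairs and the three sample lines are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# count_pairs() is a hypothetical helper: it takes a list of lines and
# returns a hash mapping each adjacent two-word pair to its total count.
sub count_pairs {
    my @lines = @_;
    my %count;
    for my $line ( @lines ){
        # split ' ' ignores leading and repeated whitespace
        my @words = split ' ', $line;
        # Lines with fewer than two words give an empty range here.
        for my $i ( 0 .. $#words - 1 ){
            my $pair = join ' ', @words[ $i, $i + 1 ];
            $count{$pair}++;
        }
    }
    return %count;
}

# Illustrative input only; real input would come from a file.
my %count = count_pairs( 'foo bar', 'foo bar bar', 'bar foo bar' );
for my $pair ( sort keys %count ){
    print "\"$pair\" => $count{$pair}\n";
}
```

This builds the hash in one pass over the input, so there is no need to generate every possible pair first and then search for each one. It does not record line numbers; the nested %Pairs structure is the better fit when those are wanted.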
--
Just my 0.00000002 million dollars worth,
--- Shawn
"Probability is now one. Any problems that are left are your own."
SS Heart of Gold, _The Hitchhiker's Guide to the Galaxy_