Re: Repeated Words

Steven . Spears Wed, 25 Apr 2001 07:22:05 -0700

I believe Paul's line of code may work if you add the (g)lobal modifier
and, optionally, the case-(i)nsensitive modifier:

 $line =~ /(\b\w+\b).*\1/ogi;

This is also addressed a little differently in Programming Perl, page 149.
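
For what it's worth, here is a minimal sketch of one way the regex approach could report every repeated word on a line. It is a variation, not Paul's exact line: it swaps the greedy .*\1 for a lookahead so that /g can keep scanning after each word, and it wraps the backreference in \b so only whole words count. The helper name repeated_words is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the repeated words
# on a line, lowercased, in order of first appearance.
sub repeated_words {
    my ($line) = @_;
    my (%seen, @found);
    # (?=...) checks for a later whole-word copy without consuming it,
    # so the next /g match starts right after the current word.
    while ($line =~ /\b(\w+)\b(?=.*\b\1\b)/gi) {
        my $word = lc $1;
        push @found, $word unless $seen{$word}++;
    }
    return @found;
}

print join(", ", repeated_words("and foo and foo and foo.")), "\n";
```

With the plain .*\1 form, the greedy .* swallows everything up to the last copy of the first word, so a /g loop would likely skip other repeats in between. That is the reason for the lookahead here.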

Steven Spears
(905)-405-0955
[EMAIL PROTECTED]



                                                                                       
                  
Paul <ydbxmhc@yahoo.com>
04/25/01 10:31 AM
Please respond to Hodges

To:      "Helio S. Junior" <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
cc:
Subject: Re: Repeated Words

--- "Helio S. Junior" <[EMAIL PROTECTED]> wrote:
> Hello,

Hi =o)

> How do i read a simple file and look for "repeated
> words" in each line, writing out the words i have
> found and the numbers of line they were found?
>
> eg:
> File ==> Test.Dat
> sample line of text.
> this line follows another line.
> This is the last line.
>
> The program should report:
>
> Repeated Word(s): 'line'  on Line 2.

I was tempted to use something like
 $line =~ /(\b\w+\b).*\1/o;

to find repeats, but don't -- it only finds the first repeated word on
the line, and fixing that gets complicated quickly.
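
A quick sketch of that limitation, using the "and foo" line from the test data. The helper name first_repeat is hypothetical, and the second call shows a related catch I am fairly sure of: the bare \1 can also match inside a longer word.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the single-match regex and returns
# whatever ended up in $1, or undef if nothing matched.
sub first_repeat {
    my ($line) = @_;
    return $line =~ /(\b\w+\b).*\1/ ? $1 : undef;
}

# Both 'and' and 'foo' repeat, but only the first one is captured.
print "Found: '", first_repeat("and foo and foo and foo."), "'\n";

# 'is' appears once as a word, yet \1 also matches the 'is' in 'history'.
print "Found: '", first_repeat("is the history"), "'\n";
```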

Instead, try doing it manually:

====================================
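# Open the data file read-only; die with the system error on failure.
open my $dat, '<', "Test.dat" or die "Cannot open Test.dat: $!";
my $ln = 0;                           # current line number
while (my $line = <$dat>) {
   chomp $line;
   $ln++;
   my %hit = ();                      # word => occurrences on this line
   foreach my $word (split /\W+/, $line) { $hit{$word}++ }
   # Report every word seen more than once on this line.
   foreach my $word (keys %hit) {
      print "Repeated Word(s): '$word'  on Line $ln.\n"
          if $hit{$word} > 1;
   }
}
close $dat;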

=====Test.dat====================
sample line of text.
this line follows another line.
This is the last line.
and foo and foo and foo.
=================================

You could even use this to tell you how many times the word appeared on
the line by adding $hit{$word} to the printed line, etc.

This could be condensed, but it works.
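
As a rough sketch of the counting idea, assuming the same split-on-non-word-characters approach: the helper name repeat_counts and the exact wording of the printed message are made up here, not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: counts each word on a line and returns a
# word => count list containing only words seen more than once.
sub repeat_counts {
    my ($line) = @_;
    my %hit;
    # grep { length } drops the empty leading field that split can
    # produce when the line starts with a non-word character.
    $hit{$_}++ for grep { length } split /\W+/, $line;
    return map { $_ => $hit{$_} } grep { $hit{$_} > 1 } keys %hit;
}

my %count = repeat_counts("and foo and foo and foo.");
for my $word (sort keys %count) {
    print "Repeated Word(s): '$word' ($count{$word} times)\n";
}
```

Sorting the keys is only there to make the output order predictable, since keys %hit comes back in no particular order.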
