What you need is Regexp::List
(http://search.cpan.org/dist/Regexp-Optimizer/lib/Regexp/List.pm).  It
takes a list of strings and produces an optimized regex that matches
any of them.  You can then use a hash in the replacement to map each
matched name to its ID.  For example, the list Chas Owens, John Smith,
John Smith Hardy, Doug Burman, John Smitty, and John Smit produces the
following regex:

(?-xism:(?=[CDJ])(?:John\ Smit(?:h(?:\ Hardy)?|ty)?|Chas\ Owens|Doug\ Burman))

#!/usr/bin/perl

use strict;
use warnings;

use Regexp::List;

my $text = qq(
This is a passage that contains names like Chas
Owens, John Smith, John Smith Hardy, Doug
Burman, John Smitty, and John Smit.
);

my %names_to_ids = (
        'Chas Owens'       => "id0001",
        'Doug Burman'      => "id0002",
        'John Smit'        => "id0003",
        'John Smith'       => "id0004",
        'John Smith Hardy' => "id0005",
        'John Smitty'      => "id0006"
);

my $re = Regexp::List->new->list2re(keys %names_to_ids);

print $text;

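# Join wrapped lines so names split across a line break still match.
$text =~ s/\n/ /g;
# Append the matching ID after each name found in the text.
$text =~ s/($re)/"$1 $names_to_ids{$1}"/ge;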

print "$text\n";

print "$_ => $names_to_ids{$_}\n" for sort keys %names_to_ids;
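
If you cannot install Regexp::List, a rough fallback is to build the
alternation yourself with core Perl only.  This is just a sketch, not
what Regexp::List does internally: names_to_re and tag_names are
hypothetical helper names I made up for illustration.  The names are
sorted longest first so that "John Smith Hardy" is tried before "John
Smith", and quotemeta escapes any regex metacharacters.  It assumes the
newlines in the text have already been turned into spaces, as in the
script above.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper (not part of Regexp::List): build a plain
# alternation from literal strings, longest first, so a longer name
# always wins over a shorter name that is its prefix.
sub names_to_re {
    my @names = sort { length($b) <=> length($a) || $a cmp $b } @_;
    my $alternation = join '|', map { quotemeta($_) } @names;
    return qr/(?:$alternation)/;
}

# Hypothetical helper: append the ID after every name found in $text.
# $ids is a reference to a hash mapping names to IDs.
sub tag_names {
    my ($text, $ids) = @_;
    my $re = names_to_re(keys %$ids);
    $text =~ s/($re)/$1 $ids->{$1}/g;
    return $text;
}

my %names_to_ids = (
        'Chas Owens'       => "id0001",
        'Doug Burman'      => "id0002",
        'John Smit'        => "id0003",
        'John Smith'       => "id0004",
        'John Smith Hardy' => "id0005",
        'John Smitty'      => "id0006"
);

print tag_names("John Smith Hardy met John Smit.", \%names_to_ids), "\n";
```

The resulting regex is not optimized the way the Regexp::List one is,
so it will probably be slower on a long list of names, but the
replacement step works the same way.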
