> Is there a good method to do this? I need to remove the stop
> words from the comment field of every record. There are about
> 20,000 records. The comments look like this:
>
> Yersinia pestis strain Nepal (aka CDC 516 or 369 isolated
> from human) 16S-23S intergenic region amplified with 16UNIX
> and 23UNII primers. Sequencing primers were UNI1 and UNI2 5/25/99^^
>
> I should remove 'and' 'in' 'with' 'The', etc. I have set up
> the stop words array. Is there an efficient way to do this?

How about:

----code----
#!perl -w
use strict;

my $input   = 'blah srand and spin in with within the their';
my @s_words = qw(and in with the);

# Join the stop words into one alternation. quotemeta escapes any
# regex metacharacters, and \b keeps 'in' from matching inside 'within'.
my $tmp = join '|', map { quotemeta($_) } @s_words;

# \s* also swallows the whitespace in front of each removed word;
# /i makes the match case-insensitive, so 'The' is caught as well.
my $r = qr/\s*\b(?:$tmp)\b/i;

print "$r\n\n$input\n\n";
$input =~ s/$r//g;
print "$input\n";
----end----

It builds a single regex from your stop words and then applies it to
the string with a global substitution. The word boundaries mean that
only whole words are removed, so 'within' and 'their' are left alone.

HTH,

-dave
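
For the 20,000-record case in the question, the same idea can be applied by compiling the regex once and reusing it for every record. The following is only a sketch: the record layout and the names `@records`, `comment`, and `strip_stop_words` are assumptions made for illustration, since the thread does not show how the records are stored.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stop-word list and records; the real field names
# are not given in the thread.
my @stop_words = qw(and in with the);
my @records = (
    { id => 1, comment => 'The primers were UNI1 and UNI2' },
    { id => 2, comment => 'amplified with 16UNIX in the intergenic region' },
);

# Compile the alternation once, outside the loop over records.
my $alternation = join '|', map { quotemeta($_) } @stop_words;
my $stop_re     = qr/\b(?:$alternation)\b/i;

sub strip_stop_words {
    my ($text) = @_;
    $text =~ s/$stop_re//g;    # drop whole-word stop words
    $text =~ s/\s+/ /g;        # collapse the gaps they leave behind
    $text =~ s/^ | $//g;       # trim a leading or trailing space
    return $text;
}

$_->{comment} = strip_stop_words($_->{comment}) for @records;
print "$_->{id}: $_->{comment}\n" for @records;
```

Because the pattern is built with `qr//` before the loop, Perl does not have to recompile it for each comment, which should matter more as the stop-word list grows.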