On Jun 23, [EMAIL PROTECTED] said:

>SRED. SREDNE
>SEV. SEVERN
>
># Match it at beginning of line
>$cgname =~ s/^SRED\.(?=[\W\s\-\d]+)/SREDNE:/g ;

Three things -- the + quantifier on the [...] isn't needed, you don't need
to put \s and - in a character class you've already put \W in, and the /g
modifier is totally worthless here... there's only ONE beginning of the
line!

  $cgname =~ s/^SRED\.(?=[\W\d])/SREDNE:/;

># Match it within the line
>$cgname =~ s/[\W\s\-]+SRED\.(?=[\W\s\-\d]+)/:SREDNE:/g ;

I have a feeling you want to use \b instead of [\W\s-].  It's cleaner and
doesn't actually absorb a character.

  $cgname =~ s/\bSRED\.(?=[\W\d])/:SREDNE:/g;

># Match it at end of line
>$cgname =~ s/[\W\s\-]+SRED\.$/:SREDNE:/g ;

Again, use \b, but there's no need for /g here.

  $cgname =~ s/\bSRED\.$/:SREDNE:/;

># Match if it begins & ends line
>$cgname =~ s/^SRED\.$/:SREDNE:/g ;

Ah, here's an interesting case.  This is actually already handled by my
modifications.  The problem is that you were using /[\W\s\-]+SRED\.$/ but
if the string is "SRED.", then [\W\s\-] can't match anything.  So that's
why using a word boundary (\b) is smarter.

Also, we can change the look-aheads to go from positive to negative.
Instead of saying "and I am followed by a non-letter", why not say "and I
am NOT followed by a letter"?

  $cgname =~ s/^SRED\.(?![A-Za-z])/SREDNE:/;      # front
  $cgname =~ s/\bSRED\.(?![A-Za-z])/:SREDNE:/g;   # middle
  $cgname =~ s/\bSRED\.$/:SREDNE:/;               # end

If you're worried about hardcoding the letter set (A-Za-z), then you can
use this character class instead:  [^\W\d_].  It means "match anything
that's not: a non-word character, a digit, or an underscore".  It's a
sneaky way of matching anything that would be matched by \w WITHOUT
matching \d or _.

  $cgname =~ s/^SRED\.(?![^\W\d_])/SREDNE:/;      # front
  $cgname =~ s/\bSRED\.(?![^\W\d_])/:SREDNE:/g;   # middle
  $cgname =~ s/\bSRED\.$/:SREDNE:/;               # end

>Right now I'm generating the regexes in a standalone script, then inserting
>the output code into the subroutine that processes names into a "matchable"
>form.
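
To see what that last set of substitutions does, here is a minimal sketch.
The expand_sred wrapper and the sample names are made up for illustration;
only the three s/// lines come from above.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the three [^\W\d_] substitutions.
sub expand_sred {
    my ($cgname) = @_;
    $cgname =~ s/^SRED\.(?![^\W\d_])/SREDNE:/;      # front
    $cgname =~ s/\bSRED\.(?![^\W\d_])/:SREDNE:/g;   # middle
    $cgname =~ s/\bSRED\.$/:SREDNE:/;               # end
    return $cgname;
}

# Made-up sample names, not taken from the real data.
for my $name ('SRED. VOLGA', 'UST SRED. 2', 'UST SRED.', 'SRED.X') {
    print "$name => ", expand_sred($name), "\n";
}
# SRED. VOLGA => SREDNE: VOLGA
# UST SRED. 2 => UST :SREDNE: 2
# UST SRED.   => UST :SREDNE:
# SRED.X      => SRED.X   (a letter follows, so nothing is replaced)
```

One side effect of the negative look-ahead worth knowing: a bare "SRED."
is now caught by the front rule and becomes "SREDNE:", where the positive
look-ahead version left it for the end rule, which gives ":SREDNE:".  For
the same reason the middle rule already matches at the end of the string,
so the end rule has nothing left to do.
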
>
>What I'd like to be able to do is take a *set* of abbreviation
>"dictionaries," concatenate them together and dynamically generate the
>regex code in the routine that is going to execute it.

So you want to take the dictionary files, and use them to create a
function that does all the regexes on its input?

-- 
Jeff "japhy" Pinyan      [EMAIL PROTECTED]      http://www.pobox.com/~japhy/
RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
<stu> what does y/// stand for?  <tenderpuss> why, yansliterate of course.
[  I'm looking for programming work.  If you like my work, let me know.  ]
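
P.S.  If that is the goal, one possible shape for it is a closure built
from the dictionary lines.  This is only a rough sketch under assumptions:
make_expander is a made-up name, and I'm guessing each dictionary line is
"ABBR. EXPANSION" with a single-word expansion, going by the two sample
entries quoted at the top.

```perl
use strict;
use warnings;

# Hypothetical: turn dictionary lines into one substitution function.
sub make_expander {
    my @lines = @_;
    my %full;
    for my $line (@lines) {
        # Assumed layout: abbreviation, whitespace, one-word expansion.
        my ($abbr, $expansion) = split ' ', $line;
        next unless defined $expansion;
        $full{$abbr} = $expansion;
    }
    # With no entries there is nothing to substitute.
    return sub { return $_[0] } unless %full;

    # Longest abbreviation first, so a short one cannot shadow a long one.
    my $alt = join '|', map { quotemeta }
              sort { length($b) <=> length($a) or $a cmp $b } keys %full;
    my $front  = qr/^($alt)(?![^\W\d_])/;
    my $middle = qr/\b($alt)(?![^\W\d_])/;

    return sub {
        my ($name) = @_;
        $name =~ s/$front/$full{$1}:/;      # front: no leading colon
        $name =~ s/$middle/:$full{$1}:/g;   # everywhere else
        return $name;
    };
}

my $expand = make_expander('SRED. SREDNE', 'SEV. SEVERN');
print $expand->('UST SRED. 2'), "\n";   # UST :SREDNE: 2
print $expand->('SEV. DVINA'), "\n";    # SEVERN: DVINA
```

The patterns are compiled once with qr// when the function is built, so
concatenating several dictionaries just means passing more lines in.
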