On 6/3/05, Jeff 'japhy' Pinyan <[EMAIL PROTECTED]> wrote:
> On Jun 2, Siegfried Heintze said:
> 
> > How do I write a pattern for removing roman numerals? The first 10 is
> > enough.
> 
> Well, the first ten roman numerals are:
> 
>    I, II, III, IV, V, VI, VII, VIII, IX, X
> 
> Just put those in a regex.
> 
>    s/\b(I|II|...)\b//g;
> 
> would remove roman numerals, provided they aren't touching any word
> characters.
> 
> --
> Jeff "japhy" Pinyan         %  How can we ever be the sold short or


This isn't going to get them all; it says to match (between word
boundaries) "I" or "II" or any three non-newlines.  So it will catch
"I", "II", "III", and "VII".  It will also catch "I" where it's a
pronoun (assuming this is an English text file), and any three-letter
words/constructs.
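
A quick way to check this, as a rough sketch (the sample words are made up for illustration, and the three dots are taken literally as "any character" wildcards):

```perl
use strict;
use warnings;

# The pattern as written, with "..." read literally as three wildcard dots.
my $literal = qr/\b(I|II|...)\b/;

# Made-up sample words, not taken from any real input file.
my @words = qw(I II III IV VII IX the cat);

# Keep only the words that the pattern matches in full.
my @matched = grep { /\A$literal\z/ } @words;

print "matched: @matched\n";   # matched: I II III VII the cat
```

IV and IX fall through because they are only two characters long, while ordinary three-letter words get caught.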

I would try something like this:

s/\b(?:I(?:I{1,2}|V|X)?|VI{0,3}|X)\b//g;

Note that this will also remove "I".  You may want to go through and get those by
hand instead if there is any chance of "I" having another function. 
If you can identify the context where the numerals appear, you can
make it easier on yourself.
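
Here is a minimal sketch of both suggestions, assuming the goal is exactly the numerals I through X as standalone tokens. The pattern keeps every alternative inside one non-capturing group so that both \b anchors apply to each branch, and it bounds the repetitions so things like "IIII" are left alone. The sub names (strip_roman, strip_heading_numeral), the sample strings, and the "numeral followed by a period at the start of a line" context are all made up for illustration; your data may call for a different context.

```perl
use strict;
use warnings;

# Matches I..X only: I, II, III, IV, IX | V, VI, VII, VIII | X.
my $roman = qr/\b(?:I(?:I{1,2}|V|X)?|VI{0,3}|X)\b/;

# Hypothetical helper: remove every standalone numeral, including a lone "I".
sub strip_roman {
    my ($text) = @_;
    $text =~ s/$roman//g;
    return $text;
}

# Hypothetical context-restricted variant: only strip a numeral that starts
# a line and is followed by a period (e.g. "IV. Title"). This assumes that
# layout; it leaves the pronoun "I" elsewhere in the text untouched.
sub strip_heading_numeral {
    my ($text) = @_;
    $text =~ s/^$roman\.\s*//mg;
    return $text;
}

print strip_roman("Chapter VII and Part IX"), "\n";
print strip_heading_numeral("IV. I went home"), "\n";
```

The first call removes both numerals (the surrounding spaces stay behind); the second removes only the leading "IV. " and keeps the pronoun.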


HTH,

-- jay 
--------------------
daggerquill [at] gmail [dot] com
http://www.engatiki.org
