Here's a regex problem that's been keeping me up at night. I'm looking for
currency amounts in various formats. Both the currency symbol and the
character codes of the digits may vary depending on the file I'm looking
at, as we're working in Unicode and looking at languages from all over the
world. I'm also using utf8.
Basically valid formats would be:
$123
$ 123
123$
123 $
and up to here, I'm okay:
# single quotes, so that $123 is not interpolated as a capture variable
my $sText = 'Foo $ 123 bar 123$ hello $123 world 123 $';
# these may be different depending on the language, but let's work with English
my $reNumber = "\x{0030}-\x{0039}";
# just to keep it simple; no "|" here, since inside [...] it would match a literal pipe
my $reCurrency = "\\x{0024}\\x{00a3}";
# one or more digits to the left of the symbol, or to the right of it
my @asCurrencies = $sText =~
/[$reNumber]+\x{0020}?[$reCurrency]|[$reCurrency]\x{0020}?[$reNumber]+/g;
foreach my $currency (@asCurrencies)
{
print "$currency\n";
}
ok.
I want to add in the text for currency symbols, like "dollars" and "pounds",
so that I match on either a currency symbol or a currency word and grab the
numbers to the left or to the right.
Here are the strings for those already formatted for utf8:
my $string =
"(\x{0064}\x{006f}\x{006c}\x{006c}\x{0061}\x{0072}\x{0073})|(\x{0070}\x{006f}\x{0075}\x{006e}\x{0064}\x{0073})";
# (dollars)|(pounds)
I've tried all sorts of variations on parentheses etc. in the regex, to
no avail. I've checked the docs on lookahead and lookbehind assertions, and
messed around with them some, but either that's not what I need, or I haven't
completely grasped the concept yet.
Any ideas?
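
For reference, here is a rough sketch of the kind of thing I'm after, in case it helps frame the question. It assumes the symbols and words can be put side by side in one non-capturing group, (?: ... ), so that a symbol class and the words are alternatives of each other, with the number allowed on either side. The sample text and the name $reAmount are made up for illustration, and the words are spelled out in plain ASCII rather than as \x{} escapes just for readability.

```perl
use strict;
use warnings;

# Hypothetical sample text; single-quoted so that "$123" is not interpolated.
my $sText = 'Foo $ 123 bar 123$ hello $123 world 123 $ and 45 dollars or pounds 67';

# Digit range as regex escapes (English digits only, to keep it simple).
my $reNumber = '\x{0030}-\x{0039}';

# Either a currency symbol ($ or pound sign) or a currency word.
# (?: ... ) groups the alternatives without creating a capture.
my $reCurrency = qr/(?:[\x{0024}\x{00a3}]|dollars|pounds)/;

# Number to the left of the currency, or number to the right of it.
my $reAmount = qr/[$reNumber]+\x{0020}?$reCurrency|$reCurrency\x{0020}?[$reNumber]+/;

# One capture group around the whole thing, so /g returns each full match.
my @asCurrencies = $sText =~ /($reAmount)/g;

print "$_\n" for @asCurrencies;
```

The point of the non-capturing group is that /g in list context returns captures when there are any, so extra capturing parentheses around each word would change what ends up in @asCurrencies.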
Aaron Craig
Programming
iSoftitler.com