On 3/10/06, Graeme McLaren <[EMAIL PROTECTED]> wrote:

> I've checked my XML file and it contains:

> <school_name>St. Patrick<92>s R.C. P.S.</school_name>
>
> This is because St. Patrick's contains an apostrophe.

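I'm guessing that where I see the four characters "<92>", the actual file
has a single character with hex value 92. Some tools render unusual
characters that way. In Windows-1252, that value is the curly right single
quotation mark, which word processors often substitute for an apostrophe.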
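
If you want to confirm what is really in the file, here is a rough sketch of one way to make such characters visible. The sub name show_bytes and the sample string are made up for illustration, and it assumes the data is being handled as raw bytes rather than decoded text:

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite every byte outside printable ASCII
# as <hex>, roughly the way some viewers display it.
sub show_bytes {
    my ($text) = @_;
    $text =~ s/([^\x20-\x7e])/sprintf("<%02x>", ord($1))/ge;
    return $text;
}

# Made-up sample; in practice this would be a line read from the XML file.
print show_bytes("St. Patrick\x92s R.C. P.S."), "\n";
```

If the output shows <92> where the apostrophe should be, the file holds a single byte there and not the four literal characters.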

> I have a couple of
> regexes to handle ampersands and apostrophes, however the apostrophe regex
> doesn't appear to be working correctly:
>
>
> ampersand regex works:
>
> $data->[$i] =~ s/&/&/g;

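I'm not sure I know what you mean by "works". It seems to be replacing
every ampersand with an ampersand in the target string, which would be
a no-op if it didn't have side effects. If the goal is to escape
ampersands for XML, the replacement would normally be the entity &amp;
rather than a bare ampersand.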

> apostrophe regex doesn't work:
>
> $data->[$i] =~ s/'/&apos;/g;

It doesn't? It's probably matching any true apostrophes (hex 27). The
character in your data just isn't one of them.

> I've worked out that the character is a type of apostrophe which has
> a hex value of 92.  How would I write my regex to substitute this character
> for a normal apostrophe?

> I've tried: s/92/'/g;

> and it didn't work.

I think you're looking for one of these:

    s/\x92/'/g
    s/\x92/&apos;/g
    tr/\x92/'/

Backslash escapes are documented in perlop. Hope this helps!
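
Putting those pieces together, a minimal sketch might look like the following. The sub name escape_name and the sample data are hypothetical, and it assumes the string holds raw bytes; if the text has been decoded into Perl characters, the curly quote would more likely show up as \x{2019} than as \x92:

```perl
use strict;
use warnings;

# Hypothetical helper, not from the original code.
sub escape_name {
    my ($text) = @_;

    # Escape ampersands first, so the entity added in the next step
    # is not escaped a second time.
    $text =~ s/&/&amp;/g;

    # Treat both the 0x92 byte and a plain apostrophe as an apostrophe
    # and replace either one with the XML entity.
    $text =~ s/[\x92']/&apos;/g;

    return $text;
}

# Made-up sample standing in for $data->[$i].
print escape_name("St. Patrick\x92s R.C. P.S."), "\n";
```

Doing the ampersand pass first matters: run the other way round, the ampersand in each new entity would itself be escaped.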

--Tom Phoenix
Stonehenge Perl Training
