Rob Dixon wrote:
Dave Cardwell wrote:

Hello there, I'm having trouble constructing a regular expression that would do the following:

FOO...
...followed by anything but BAR (non-greedy)...
...followed by BAZ (captured)...
...followed by anything but BAR (greedy)...
...followed by BAR

I've been looking at zero-width negative look-ahead, but I haven't used this area of regular expressions before so I'm struggling. A solution or prod in the right direction would be lovely.

Please show us the real problem. I know you mean to clarify, but your
summary is so ambiguous that understanding it becomes the most difficult
part of providing a solution.

Thanks,

Rob


I was afraid of that, sorry. I'm using HTML::Parser to scan through a document, but first I need to do one quick manipulation that depends on seeing the document as a whole (rather than token by token, as HTML::Parser sees it). Rather than attempting to fit all of the real work into a regular expression, I thought it best to simply mark the element with a custom attribute that HTML::Parser could pick up later.

To that end, I need to find an <a> (BAZ) that contains just plain text, somewhere between an opening <td> (FOO) and the closest closing </td> (BAR), i.e. something along the lines of:

s%
    <td([^>]*>
        {not </td>}*?
            <a[^>]*>[\w\s]+</a>
        {not </td>}*?
    </td>)
%<td foo="1"$1%gismx;

It's the {not </td>} bits I'm having difficulty with.
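
For reference, one common way to write "anything but </td>" is to put a negative look-ahead in front of every character consumed: (?:(?!</td>).)*. The sketch below is only an illustration under stated assumptions: the sample HTML, the $not_td_close name and the foo="1" marker are placeholders, and it assumes cells are not nested (a <td> containing another table would still confuse it).

```perl
use strict;
use warnings;

# Hypothetical sample input. The first cell holds a plain-text link;
# the second cell's link wraps an <img>, so it should be left alone.
my $html = <<'HTML';
<table><tr>
<td class="a">See <a href="/x">plain text</a> here</td>
<td><a href="/y"><img src="y.png"></a></td>
</tr></table>
HTML

# One character, allowed only when </td> does not start at this position.
# Repeating this token can never run past the end of the current cell.
my $not_td_close = qr{(?:(?!</td>).)}is;

(my $marked = $html) =~ s{
    <td\b ( [^>]* >                  # capture from the rest of the opening tag...
        $not_td_close*?              # anything but the closing tag, non-greedy
        <a\b [^>]* > [\w\s]+ </a>    # a link containing only plain text
        $not_td_close*               # anything but the closing tag, greedy
    </td> )                          # ...through the closest closing tag
}{<td foo="1"$1}gisx;

print $marked;
```

Because the look-ahead is checked before each character, the greedy and non-greedy versions of the second run end up at the same place here: both must stop at the first </td>.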


--
Best wishes,
Dave Cardwell.

http://perlprogrammer.co.uk/

