On Nov 29, 2003, at 1:15 PM, Dan Anderson wrote:

I have a regular expression that looks like:

$foo =~ s[class.*?=.*?'.*?'][]sgi;

We're just looking for whitespace with most of those .*?s, right? Why don't we say that? And between the quotes we're looking for non-quote characters, right?


s/class\s*=\s*'[^']*'//sgi
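As a quick check, here is a minimal sketch that runs that substitution over the sample line from the question. The sub name strip_class_attrs is made up for illustration, and the sketch assumes single-quoted attribute values only, as in the pattern itself. The /s flag is left out because the pattern no longer contains a dot.

```perl
use strict;
use warnings;

# Hypothetical helper: remove single-quoted class attributes from a string.
sub strip_class_attrs {
    my ($html) = @_;

    # \s* allows optional whitespace around '='; [^']* cannot run past the
    # closing quote, so each match stays inside a single attribute.
    $html =~ s/class\s*=\s*'[^']*'//gi;

    return $html;
}

my $line = "<table class='foo'><tr class='baz'><td class='bar'>";
print strip_class_attrs($line), "\n";    # prints: <table ><tr ><td >
```

Note that the space in front of each attribute is left behind. If that matters, a leading \s* could be added to the pattern.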

The problem I run into is that if the following is presented to match:

<table class='foo'><tr class='baz'><td class='bar'>

The regular expression will match:

class='foo'><tr class='baz'><td class='bar'

And I'll get:

<table >

Is there any way I can tell the .*? to match "" as well as "."?

I don't understand this part of the question. What are you wanting to match, instead of the above?


And of course, I should mention the many excellent HTML parsing modules on the CPAN, which handle many more cases than your own quick and dirty approach. Do you have a good reason for not using them?
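For example, here is a rough sketch of the same job done with HTML::TreeBuilder, which is just one of those modules and has to be installed from the CPAN first. The sub name strip_class_with_parser and the sample markup are assumptions for illustration. Because the input is parsed as a whole document, the output also carries the html, head and body elements that the parser adds.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: parse the markup, delete every class attribute,
# and return the re-serialized HTML.
sub strip_class_with_parser {
    my ($html) = @_;

    my $tree = HTML::TreeBuilder->new_from_content($html);

    # look_down with a code ref returns every element the ref accepts.
    for my $el ( $tree->look_down( sub { defined $_[0]->attr('class') } ) ) {
        $el->attr( 'class', undef );    # an undef value deletes the attribute
    }

    my $out = $tree->as_HTML;
    $tree->delete;                      # release the parse tree
    return $out;
}

print strip_class_with_parser(
    "<table class='foo'><tr class='baz'><td class='bar'>cell</td></tr></table>"
), "\n";
```

Unlike the regex, this does not care how the attribute is quoted or spaced, at the cost of a dependency and of reformatted output.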

James

