I have a regular expression that looks like:
$foo =~ s[class.*?=.*?'.*?'][]sgi;
We're just looking for whitespace with most of those .*?s, right? Why don't we say that? And between the quotes we're looking for non-quote characters, right?
s/class\s*=\s*'[^']*'//sgi
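Here's a quick sketch of that substitution run against a sample line like the one in your message. The variable names are only for illustration, and I'm assuming the attribute values are always single-quoted:

```perl
use strict;
use warnings;

# Sample input, as given in the question.
my $foo = "<table class='foo'><tr class='baz'><td class='bar'>";

# \s* allows optional whitespace around the "=", and [^']* cannot run
# past the closing quote, so each match stays inside one attribute.
# Assumption: class values are always wrapped in single quotes.
(my $stripped = $foo) =~ s/class\s*=\s*'[^']*'//sgi;

print "$stripped\n";   # prints: <table ><tr ><td >
```

Note that the space in front of each attribute is left behind, since the pattern starts at "class".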
The problem I run into is that if the following is presented to match:
<table class='foo'><tr class='baz'><td class='bar'>
The regular expression will match:
class='foo'><tr class='baz'><td class='bar'
And I'll get:
<table >
Is there any way I can tell the .*? to match "" as well as "."?
I don't understand this part of the question. What are you wanting to match, instead of the above?
And of course, I should mention the many excellent HTML parsing modules on the CPAN, which handle many more cases than your own quick-and-dirty approach. Do you have a good reason for not using them?
James