Hi folks.  I'm having a little trouble with a regular expression and I'm
hoping someone can point out what I'm doing wrong.

From a large number of HTML files, I want to extract everything between
the following two comments, including the comments themselves:

<!--Begin CMS Content-->...<!-- End CMS Content-->

The string I'm testing the expression against is:

'Some code that will be ignored.
<!--Begin CMS Content-->
<span class="headline">Breadth Requirement</span>
<hr class="under" />
<!-- End CMS Content-->
This is some more content that will not be matched.
<!--Begin CMS Content-->
<strong>More Matched Content!</strong>
<!-- End CMS Content-->
Some more ignored code.'

And the regular expression I've got is:

'/[<!--Begin CMS Content\-\->].+[<!-- End CMS Content\-\->]/s'

I expected that when I ran this using preg_match_all I would get two
matches, each made up of the two comments and the content between them,
but instead I get the following:

Array
(
    [0] => Array
        (
            [0] => Some code that will be ignored.
<!--Begin CMS Content-->
<span class="headline">Breadth Requirement</span>
<hr class="under" />
<!-- End CMS Content-->
This is some more content that will not be matched.
<!--Begin CMS Content-->
<strong>More Matched Content!</strong>
<!-- End CMS Content-->
Some more ignored code
        )

)

which is a single match covering the whole string, minus the period at
the very end.

Can anybody point out where I'm going wrong here?

Cheers and TIA,

Pablo.
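
A note on the likely cause, with a minimal sketch rather than a definitive
fix: in PCRE, square brackets define a character class, so each bracketed
group in the pattern above matches a single character from that set
instead of the literal comment text. The greedy .+ then runs from the
first character in the first set to the last character in the second set,
which would explain the one long match. The sketch below assumes the
comments always appear exactly as written (same spacing and case); the
variable names $html, $pattern, $count and $matches are illustrative only.

```php
<?php
// Test string from the post above.
$html = 'Some code that will be ignored.
<!--Begin CMS Content-->
<span class="headline">Breadth Requirement</span>
<hr class="under" />
<!-- End CMS Content-->
This is some more content that will not be matched.
<!--Begin CMS Content-->
<strong>More Matched Content!</strong>
<!-- End CMS Content-->
Some more ignored code.';

// No square brackets: both comments are matched as literal text.
// .+? is lazy, so each match stops at the nearest closing comment.
// The s modifier lets . match newlines as well.
$pattern = '/<!--Begin CMS Content-->.+?<!-- End CMS Content-->/s';

// $count is the number of full matches; $matches[0] holds their text.
$count = preg_match_all($pattern, $html, $matches);

print_r($matches[0]);
```

With the test string above, $matches[0] should hold two entries, one per
comment pair. If the spacing inside the comments varies between files, the
literal parts of the pattern would need to be loosened accordingly.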

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
