Travis Low wrote:
Break it up a little. Try something like this:
$title = preg_replace( "/.*<title>/i", "", $file_text );
$title = preg_replace( "/<\/title>.*/i", "", $title );
My title one worked, but I like your idea. I started to write that you were all flip-flopped on your title example, until I realized you were using preg_replace instead of preg_match.
quoting Mr. Spock:
An ancestor of mine maintained that if you eliminate the impossible, whatever remains, however improbable, must be the truth.
Ditto for body.
Now I have:
// strip everything above and including <body...>
$body = preg_replace("/.*<body.*>/iU",'',$file_text);
// strip everything after and including </body>
$body = preg_replace("/<\/body>.*/i",'',$body);
Works like a charm!
Thanks :)
Tim
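For reference, the two-step strip discussed above can be sketched as a self-contained script. The sample document is hypothetical, and two details differ from the snippets in the thread and are assumptions of this sketch: the `s` modifier is added so that `.` also matches newlines, and `#` is used as the delimiter so the slash in the closing tags needs no escaping.

```php
<?php
// Hypothetical sample document; in the thread this comes from the output buffer.
$file_text = "<html>\n<head><title>My Page</title></head>\n"
           . "<body onload=\"init()\">\n<p>Hello</p>\n</body>\n</html>";

// Strip everything up to and including <title>, then everything from </title> on.
// "s" lets "." match newlines, so text on earlier and later lines is removed too.
$title = preg_replace('#.*<title>#is', '', $file_text);
$title = preg_replace('#</title>.*#is', '', $title);

// Same idea for the body; [^>]* skips any attributes on the opening tag.
$body = preg_replace('#.*<body[^>]*>#is', '', $file_text);
$body = preg_replace('#</body>.*#is', '', $body);

echo $title, "\n";       // My Page
echo trim($body), "\n";  // <p>Hello</p>
```

Because the first pattern is greedy, it strips up to the last opening tag in the document, which is fine as long as the page has only one `<title>` and one `<body>`.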
Since you're looking for certain tags, it stands to reason that you're primarily interested in characters after "<" characters. So this might be faster:
$chunks = explode( "<", $file_text );
for( $i = 0; $i < count( $chunks ); $i++ )
{
    if( 0 === strpos( $chunks[$i], "title" ) )
    {
        # Title starts after the next ">", so get it.
        # If you're careful, you can modify $i here
        # as you gather up the title.
    }
    if( 0 === strpos( $chunks[$i], "body" ) )
    {
        # Body starts after the next ">"
        # Modify $i, but be vewwwwwwy caweful.
    }
}
The de-commenting is left to the reader.
cheers,
Travis
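A minimal sketch of the chunk-splitting idea for the title case follows. The function name `extract_title_from_chunks` is hypothetical, the type declarations assume PHP 7.1 or later, and the sketch assumes the title contains no markup of its own. It uses `stripos()` so that `<TITLE>` is matched as well, which is a small departure from the `strpos()` outline above.

```php
<?php
// Hypothetical helper, not from the thread: returns the text of the first
// <title> element, or null when none is found.
function extract_title_from_chunks(string $file_text): ?string
{
    // Every chunk after the first begins just past a "<" character.
    $chunks = explode('<', $file_text);

    foreach ($chunks as $chunk) {
        // stripos() returns 0 for a match at the start and false for no match,
        // so the comparison has to be strict (===).
        if (0 === stripos($chunk, 'title')) {
            $close = strpos($chunk, '>');
            if ($close === false) {
                continue; // malformed tag, keep looking
            }
            // Text between ">" and the next "<" (the start of the closing tag).
            return substr($chunk, $close + 1);
        }
    }
    return null;
}

$file_text = "<html><head><TITLE>My Page</TITLE></head><body><p>Hi</p></body></html>";
echo extract_title_from_chunks($file_text), "\n"; // My Page
```

The prefix test would also accept any other tag whose name starts with "title", so a stricter check on the character after the name may be worth adding for real pages.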
Tim wrote:
This script worked on one server, but started choking after I moved it to a new server.
Summary: I use the auto_prepend_file script to start output buffering, then use the auto_append_file script to extract the content between the title tags and between the body tags. When the size of the content between the body tags reaches around 11,500 characters, the ereg function stops working correctly.
What governs the number of characters that ereg can process?
I looked at the phpinfo from both servers but didn't find any clues...
Thanks for the help :)
FYI sample code:
prepend--
<?php
ob_start();
?>
append--
<?php
// load document into $file_text
$file_text = ob_get_contents();
ob_end_clean();
// extract title
unset($regs);
// use preg_match because it's supposed to be faster...
//eregi("<title>(.*)</title>",$file_text,$regs);
preg_match("|<title>(.*)</title>|i",$file_text,$regs);
$document_title = $regs[1];
// extract body of document (need to add onload statement in <body> tag)
unset($regs);
// I don't have the foggiest why preg_match doesn't seem to work here...
//preg_match("|\<body(.*)</body>$|i",$file_text,$regs);
//preg_match("|>(.*)|i",$regs[1],$temp);
eregi("<body(.*)</body>",$file_text,$regs);
ereg(">(.*)",$regs[1],$temp);
$template_body = $temp[1];
// stuff $document_title and $template_body into the "official" template
/* snip */
?>
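One likely reason the commented-out preg_match calls in the append script found nothing: in PCRE, `.` does not match a newline unless the `s` modifier is set, so `(.*)` cannot span a multi-line body. The sketch below is a guess at a fix under that assumption, not a diagnosis of the ereg size limit. The sample document is hypothetical, and `preg_last_error()` requires PHP 5.2 or later.

```php
<?php
// Hypothetical stand-in for the buffered page output.
$file_text = "<html><head><title>Demo</title></head>\n"
           . "<body onload=\"init()\">\n<p>Line one</p>\n<p>Line two</p>\n</body></html>";

// "s" lets "." cross newlines; [^>]* consumes the attributes of the opening tag,
// so a second pass to drop everything up to ">" is not needed.
$matched = preg_match('|<body[^>]*>(.*)</body>|is', $file_text, $regs);

if ($matched === 1) {
    $template_body = $regs[1];
} elseif ($matched === false) {
    // false signals an engine error, e.g. PREG_BACKTRACK_LIMIT_ERROR on large input.
    $template_body = '';
    echo 'PCRE error code: ', preg_last_error(), "\n";
} else {
    $template_body = ''; // no <body> element found
}

echo $template_body;
```

Checking for a `false` return separately from `0` distinguishes "the pattern did not match" from "the engine gave up", which is the distinction that matters when a script starts failing only on large pages.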
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php