On Thursday, August 15, 2002, at 08:04, John W. Krahn wrote:

> Batch M wrote:
[..]
>>  Oreilly's book says..." To suppress this warning,
>> assign an initial value to your variables."
>>
>> what value should I attached to:
>>
>> $content = $header_html . $1 . $footer_html;

the problem is on the right-hand side of the assignment - so you
need to have initialized

        $header_html
        $footer_html

to some reasonable value...
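
a minimal sketch of what 'initialized' could look like. The default
values below are my assumption (the original post never shows where
the two variables are meant to come from), and the %title% placeholder
in the header is a guess based on the later s|%title%|$title| line:

```perl
use strict;
use warnings;

# Assumed defaults, for illustration only: the original script does not
# show what $header_html and $footer_html are supposed to contain.
my $header_html = "<html>\n<head><title>%title%</title></head>\n<body>\n";
my $footer_html = "</body>\n</html>\n";

# Stand-in for the text captured from the page (the $1 in the question).
my $body  = "<p>hello</p>\n";
my $title = 'My Page';

# Both operands are defined now, so the concatenation no longer
# triggers a "Use of uninitialized value" warning.
my $content = $header_html . $body . $footer_html;
$content =~ s|%title%|$title|;

print $content;
```

even an empty string ('') would be enough to quiet the warning; the
boilerplate above just keeps the result a complete page.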

Since you are Ripping Out the 'guts' of a webPage,
the stuff inside the 'BODY' tags - you might
want to decide what to do with all the stuff that was inside
the <head>...</head> as well...

or step back and look at the basic schema of what
a webPage is composed of:

        <!DOCTYPE ....>
        <html ..>
                <head>
                        .....
                </head>
                <body...>
                        .....
                </body>
        </html>

so it would seem that if you are trying to 'retain'
that 'form' - then you should set

        $header_html to everything up to and including the opening <body...> tag
and
        $footer_html to the closing </body> tag and everything after it

before trying to grot out the $content....
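
the split described above can be sketched with one regex. This is an
illustration under assumptions, not a robust parser: split_page() is a
name I made up, and it assumes the page has exactly one <body>...</body>
pair and no stray '<body' text in comments or scripts:

```perl
use strict;
use warnings;

# split_page() is a hypothetical helper, not part of the original script.
# It returns (header, body, footer), or an empty list if no body is found.
sub split_page {
    my ($page) = @_;
    if ($page =~ m|\A(.*?<body\b[^>]*>)(.*?)(</body>.*)\z|si) {
        # $1: everything up to and including the opening <body...> tag
        # $2: the 'guts' between the body tags
        # $3: the closing </body> tag and everything after it
        return ($1, $2, $3);
    }
    return;
}

my $page = '<html><head><title>t</title></head>'
         . '<BODY bgcolor="#fff">guts</BODY></html>';

my ($header_html, $body, $footer_html) = split_page($page);

# Gluing the three pieces back together reproduces the original page.
print $header_html . $body . $footer_html, "\n";
```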

there are several really good HTML parsing modules on CPAN,
my fave is the HTML::TreeBuilder stuff....


>> from:
>>
>>      if($content=~m|<BODY.*?>(.*?)</BODY>|si) {
>>             $content = $header_html . $1 . $footer_html;
>>             $content =~ s|%title%|$title|;
>>             &save_file("$fullpath",$content);
>>             print "Completed\n";
>>          } else{
>>             print "Couldn't parse: $!\n";
>>            }
>>
>> What does $1 point to?
>
>
> $1 is assigned to by the regular expression match inside the parentheses
> (.*?)  Because the variable $1 is used after a successful match it will
> contain a value that can be used safely in concatenation.
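
to make John's point concrete, here is a small sketch. extract_body()
is a hypothetical name of mine; the regex is the one from the quoted
code. The idea is simply that $1 is only read inside the branch where
the match succeeded:

```perl
use strict;
use warnings;

# extract_body() is a hypothetical name used only for this illustration.
sub extract_body {
    my ($html) = @_;
    # Test the match first; $1 is only meaningful when it succeeded.
    if ($html =~ m|<BODY.*?>(.*?)</BODY>|si) {
        return $1;
    }
    return;    # undef in scalar context signals "no body found"
}

my $good = extract_body('<html><body class="x">guts</body></html>');
my $bad  = extract_body('<html>no body tags here</html>');

print defined $good ? "got: $good\n" : "no match\n";
print defined $bad  ? "got: $bad\n"  : "no match\n";
```

the first call prints "got: guts" and the second prints "no match".
Note also that $! holds the last system error and is not set by a
failed regex match, so it will not explain why the parse failed.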

john always gets the cool answers....


ciao
drieux
