Thank you very much,
This has been very helpful.
I have got the script running now, and it works. It's at
http://serverinfo.veldemanvalls.com/cgi-bin/linkpagegen.pl , but I still
have to modify some things.
Thanks again for all the help; subscribing to the list has been very
useful.
Bruno.
------------------------------------------------------------
----- Original Message -----
From: "Chris Lott" <[EMAIL PROTECTED]>
To: "'Bruno Veldeman'" <[EMAIL PROTECTED]>;
<[EMAIL PROTECTED]>
Sent: Wednesday, June 06, 2001 8:45 PM
Subject: RE: Link check problem.
> > As I only want to know if the page is available for browsing,
> > what are the result codes I should look for?
>
> 1) Explanation of response codes here:
> http://kbs.cs.tu-berlin.de/~jutta/ht/responses.html
>
> 2) The problem is that you need to set the user agent in your code so
> that it masquerades as a browser that the site understands. This will
> get rid of the error code you are seeing at http://encarta.msn.es/
> I have snipped from a program of mine to show you how this works. Note
> that I also set a cookies file up so that I don't get rejected by sites
> that need cookies.
>
> 3) To return the text of the page, look at the "content", i.e.
> print $response->content;
>
> (NOTE: I'm a perl newbie, so I'm sure there are more elegant ways to do
> this!)
>
> # Normal strictures to make me do a *little* better
> use strict;
> use warnings;
> use diagnostics;
>
> # LibWWW, used to actually get to the pages and check response and/or
> # content
> use LWP::UserAgent;
>
> # To handle sites that need cookies
> use HTTP::Cookies;
>
> # All da variables
> my ($ua, $i, @site, $request, $response);
>
> # Unbuffer output or else it gets real boring waiting for the whole
> # program to finish before seeing any results
> $|++;
>
> # Create a new user agent that masquerades as a cookie eating MSIE 4
> $ua = LWP::UserAgent->new;
> $ua->cookie_jar(HTTP::Cookies->new(file => "lwpcookies.txt", autosave => 1));
> $ua->agent('Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)');
> $ua->timeout(20);
>
> # Sites I am checking
> @site = qw
> (
> http://www.grn.es/
> http://encarta.msn.es/
> );
>
> # For each site, check the response and print either the status message
> # or the error (with the URL, so it is clear which site failed)
> foreach $i (@site) {
>     $request = HTTP::Request->new(GET => $i);
>     $response = $ua->request($request);
>     if ($response->is_success) {
>         print $i, " ", $response->message, "\n";
>     } else {
>         print "Error: ", $i, " ", $response->status_line, "\n";
>     }
> }
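
On the original question of which result codes to look for, here is a
minimal sketch of one way to sort HTTP status codes by class. The helper
name classify_status is hypothetical (it is not part of LWP), and the
verdict strings are just illustrative. In the script above, the numeric
code would come from $response->code. Since LWP::UserAgent follows
redirects for GET requests by default, the final response is usually
either a 2xx or an error.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP): map a numeric HTTP status code
# to a rough verdict for a link checker, based on the standard classes.
sub classify_status {
    my ($code) = @_;
    return 'available'    if $code >= 200 && $code < 300;   # 2xx: success
    return 'redirect'     if $code >= 300 && $code < 400;   # 3xx: moved elsewhere
    return 'client error' if $code >= 400 && $code < 500;   # 4xx: e.g. 404 Not Found
    return 'server error' if $code >= 500 && $code < 600;   # 5xx: server-side failure
    return 'other';                                         # 1xx or out of range
}

# Print the verdict for a few sample codes
for my $code (200, 302, 404, 500) {
    print "$code: ", classify_status($code), "\n";
}
```

For a plain "can this page be browsed" check, treating only the 2xx range
as available (which is what $response->is_success tests) is probably
enough.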