I'm forwarding this here to pose a few questions:
Is there a right way to set up a 3-process test within the Test::* framework?
I borrowed the approach used in LWP's t/robot/ua.t, but added a 2nd fork/spawn
to get the 3 layers.
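For concreteness, the shape I'm after is roughly this (a condensed sketch of the all3.t in the patch below; layer script names per the patch):

    #!perl
    # condensed sketch of t/all3.t: the master spawns the 2 lower
    # layers on pipes, then runs the mech-layer tests itself
    use strict;
    use warnings;
    use Config;

    sub spawn {
        my ( $prog, @args ) = @_;
        my $perl = $Config{'perlpath'};
        open( my $handle, "$perl -Ilib $prog @args |" )
            or die "Can't spawn $prog: $!";
        return $handle;    # child greets us on this pipe when ready
    }

    # 3rd layer: content daemon, hardwired to localhost:1024
    my $dout = spawn( 't/htdaemon.pl',
        qw( daemon LocalAddr localhost:1024 Timeout 300 Reuse 1 ) );
    print "from daemon layer: ", scalar <$dout>;

    # 2nd layer: the HTTP::Recorder proxy, on localhost:1025;
    # htrec.pl prints no greeting, so don't block on its pipe
    my $htout = spawn('t/htrec.pl');
    sleep 1;    # crude: let the proxy bind its port

    # 1st layer: the WWW::Mechanize tests, in this process
    do 't/find_link.pl';
    die "find_link.pl failed: $@" if $@;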
Is there a good way to borrow tests from another distribution? I'm using several pieces of the W3-Mech tests.
tia, jimc
-------- Original Message --------
Subject: a 3 process test for HTTP::Recorder
Date: Sat, 13 Dec 2003 22:43:10 -0700
From: Jim Cromie <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Hi Linda, HT-Proxy folks.
I've built a test setup for HTTP::Recorder, a young module with high potential usefulness. For easy experimentation, I've supplied it as a patch against HTTP-Recorder-0.01. I'm sending it here because it's mentioned on htproxy's home page.
The test setup is still quite rough, but I've tweaked it for a while now and would appreciate a fresh set of eyes. Please send any suggestions for improvements, or actual fixes, here; I trust BooK won't mind, and I imagine that Linda will watch this for progress.
More detail is in the t/README, but here are the highlights.
3 process layers:

    client layer:   uses WWW::Mechanize to conduct tests.
    htrec layer:    runs an HTTP::Recorder/Proxy in a separate process.
    htdaemon layer: serves up content to support the tests.
Tests basically live in the mech layer, but must be written to exercise the HTREC/HTPROXY layer, probably by checking that the expected HTML transforms are done in the htrec layer. Further complicating things, the daemon must supply the expected pages, etc., a tedious coordination job unless good decisions are made.
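For example, one hypothetical shape for such a transform check (not in the patch; ports per the setup below, and it assumes the recorder actually rewrites the page):

    #!perl
    # hypothetical: detect the recorder's rewriting by diffing a
    # direct fetch against a proxied fetch of the same daemon page
    use strict;
    use warnings;
    use Test::More tests => 2;
    use LWP::UserAgent;

    my $direct = LWP::UserAgent->new;                   # straight to daemon
    my $via    = LWP::UserAgent->new;
    $via->proxy( http => 'http://localhost:1025/' );    # through htrec

    my $url = 'http://localhost:1024/find_link';
    my $raw = $direct->get($url)->content;
    my $rec = $via->get($url)->content;

    ok( length $raw, 'daemon served the page directly' );
    isnt( $rec, $raw, 'htrec layer transformed the HTML' );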
    htdaemon slings pages foreach t/htdocs/*.html
    htrec/htproxy env-proxies to the daemon.
    mech layer talks to localhost:1025 (not 'proxied')
    mech-layer scripts can retest thru the htrec layer.  !instant tests!
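In code terms the wiring condenses to this (from t/htrec.pl below; the mech side is sketched in comments since the proxy's start() blocks):

    #!perl
    # condensed from t/htrec.pl: proxy on :1025, env-proxied to :1024
    use strict;
    use warnings;
    use HTTP::Proxy qw(:log);
    use HTTP::Recorder;

    $ENV{HTTP_PROXY} = "localhost:1024";    # upstream: the content daemon

    my $proxy = HTTP::Proxy->new( port => 1025, maxchild => 0, logmask => ALL );
    $proxy->agent( HTTP::Recorder->new( file => "recording" ) );
    $proxy->start;                          # blocks; run in its own shell

    # mech layer, in yet another process:
    #   my $mech = WWW::Mechanize->new( cookie_jar => undef );
    #   $mech->get("http://localhost:1025/find_link");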
The setup borrows from W3Mech and LWP; I copied W3Mech's t/*.html to t/htdocs/, and t/find_link.t will attempt to repeat the w3mech tests here, thru the htrec layer, and will eventually be followed by other t/*.t pilferings ;-). The daemon layer is borrowed and adapted (with added cruft!) from LWP's t/robot/ua.t.
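The daemon end boils down to an AUTOLOAD that maps a GET /foo onto t/htdocs/foo.html; condensed from t/htdaemon.pl in the patch that follows:

    #!perl
    # condensed from t/htdaemon.pl: AUTOLOAD maps GET /foo -> t/htdocs/foo.html
    package Server;
    use strict;
    use warnings;
    our $AUTOLOAD;

    sub AUTOLOAD {
        my ( $c, $r ) = @_;    # HTTP::Daemon::ClientConn, HTTP::Request
        ( my $meth = $AUTOLOAD ) =~ s/.*:://;
        return if $meth eq 'DESTROY';
        $meth =~ s/httpd_get_//;

        my $file = "t/htdocs/$meth.html";
        return $c->send_error( 404, "dont know $meth.html" )
            unless -f $file;

        open( my $fh, '<', $file ) or die "open $file: $!";
        local $/;              # slurp mode
        $c->send_basic_header;
        $c->print("Content-Type: text/html");
        $c->send_crlf;
        $c->send_crlf;
        $c->print(<$fh>);
    }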
diff -Nru HTTP-Recorder-0.01/MANIFEST HTTP-Recorder-0.01-test/MANIFEST --- HTTP-Recorder-0.01/MANIFEST Sun Jul 27 21:33:31 2003 +++ HTTP-Recorder-0.01-test/MANIFEST Sat Dec 13 16:34:14 2003 @@ -4,5 +4,19 @@ Makefile.PL t/load.t t/pod.t +t/README explains new stuff +t/all3.pl 3 process test, to be renamed to all3.t +t/find_link.pl 1. w3mech driver +t/htrec.pl 2. htproxy/htrec layer UNDER TEST +t/htdaemon.pl 3a. daemon test support services +t/daemon-loop.pl 3b. daemon boilerplate +t/mechtest.t borrowed, will delete +# +t/htdocs/area_link.html borrowed, served by daemon-layer +t/htdocs/field.html +t/htdocs/find_link.html +t/htdocs/frames.html +t/htdocs/google.html +t/htdocs/tick.html MANIFEST README diff -Nru HTTP-Recorder-0.01/recording HTTP-Recorder-0.01-test/recording --- HTTP-Recorder-0.01/recording Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/recording Sat Dec 13 15:32:30 2003 @@ -0,0 +1 @@ +$agent->get("http:/find_link"); diff -Nru HTTP-Recorder-0.01/t/README HTTP-Recorder-0.01-test/t/README --- HTTP-Recorder-0.01/t/README Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/README Sat Dec 13 17:50:07 2003 @@ -0,0 +1,136 @@ + +=head1 new test strategy: 3 layer harness, 3 process stack. (A Dagwood) + + client layer: uses WWW::Mechanize to conduct tests. + htrec layer: runs an HTTP::Recorder/Proxy in separate process + daemon layer: serves up content to support tests. + +Tests are basically in mech layer, but must be written to test the +HTREC/HTPROXY layer; probably by testing that the expected HTML +transforms are done in the htrec layer. Further complicating things; +daemon must supply the expected pages, etc.. a tedious coordinating +job unless good decisions are made. + + Daemon slings pages foreach t/htdocs/*.html + htrec/htproxy env-proxys to daemon. + mech-layer talks to localhost:1025 (not 'proxied') + mech-layer scripts can retest thru htrec-layer. + !instant tests! + +The setup borrows from W3Mech and LWP; I copied W3Mech /t/*.html to +t/htdocs/, and t/find_link.t will attempt to repeat w3mech tests here, +thru the htrec layer, and will eventually be followed by other t/*.t +pilferings ;-). The daemon layer is borrowed and adapted (with added +cruft!) from LWP t/robot/ua.t + +=head1 client layer + +t/all3.t attempts to integrate all 3 processes in 1 file, and +successfully spawns the daemon-layer, but has some probs spawning the +htrec layer. Could also be called master layer, esp wrt spawning. + +t/find_link.pl actually runs the 1st tests, and is invoked by all3.t +via do "file". + +On its 1st get(), the script fails, as browser shows. + + Scheme is not supported by this proxy. + +The extra space there is a clue, but I havent isolated it. Im pretty +certain however that theres some URI usage problem that eventually +causes loss of scheme. + + LWP::UserAgent::is_protocol_supported('HTTP::Recorder=HASH(0x88bbcd0)','undef') + called at /usr/local/lib/perl5/site_perl/5.8.2/HTTP/Proxy.pm line 508 + ie: + if ( !$self->agent->is_protocol_supported( my $s = $req->uri->scheme ) ) + +=head1 htrec layer + +htrec.pl is run as a separate process, its hardwired to +localhost:1025, and is env_proxyd over to localhost:1024 + +the 1st W3Mech::get() causes 501s from htrec-layer, as seen in log; + + [Sat Dec 13 14:48:17 2003] (13549) Request: GET /find_link + Use of uninitialized value in pattern match (m//) + at /usr/local/lib/perl5/site_perl/5.8.2/LWP/UserAgent.pm line 491. + Use of uninitialized value in concatenation (.) 
or string + at /usr/local/lib/perl5/site_perl/5.8.2/HTTP/Proxy.pm line 511. + Use of uninitialized value in pattern match (m//) + at /usr/local/lib/perl5/site_perl/5.8.2/HTTP/Proxy.pm line 624. + + [Sat Dec 13 13:29:55 2003] (13549) Response: 501 Not Implemented + [Sat Dec 13 13:29:55 2003] (13549) Response: Content-Length: 39 + +=head1 A ROOT CAUSE ?? + +A recording made by htrec-layer suggests that the url is broken by the +time this is commited to the log. + ++$agent->get("http:/find_link"); + + + +=head1 daemon layer + +consists of 2 scripts + +=head2 daemon-loop.pl + +is a boiler-plate HTTP::Daemon based on LWP t/robot/ua.t. +It still uses the open-pipe approach to demonizing itself, and still +communicates its url back to the master-process that way. The script +now accepts arguments, which are used in this setup to hardwire it to +localhost:1024. + +Its arguable that I should have modernized it a bit more, and +jettisoned the do "file" mechanics, but I wanted the comfort of a +legacy process setup. Its already crufty; it cannot use Test::More, +cuz that prints 1..NumTests, which fouls the pipe-read and url +extraction. I could read-pipe till URL found, but down that way lays +madness. + +=head2 htdaemon.pl + +This invokes above via do "daemon-loop.pl", and provides +Server::AUTOLOAD to handle requests received by the daemon loop. +AUTOLOAD matches requests against files in t/htdocs/*.html, and will +have some sort of non-match default soon. + + +=head1 STATUS + +Ive used it most successfully by running each layer in a separate +shell, tho backgrounding them works too + + perl t/daemon-layer + perl t/htrec.pl + perl t/find_link.pl + +after starting 1,2, you can test each by browsing to +/localhost:102[45]/find_link, or other htdoc/(\w+).html + +=head1 CURRENT PROBLEMS + +I try to split up for 3 audiences + +=head2 HTTP::Recorder + +501s, as mentioned above. A trace shows a suspicious 'undef' + + LWP::UserAgent::is_protocol_supported('HTTP::Recorder=HASH(0x88bbcd0)','undef') + called at /usr/local/lib/perl5/site_perl/5.8.2/HTTP/Proxy.pm line 508 + + HTTP::Proxy::serve_connections('HTTP::Proxy=HASH(0x88bbc70)', + 'HTTP::Daemon::ClientConn=GLOB(0x88c6cc8)', .... 
+ called at /usr/local/lib/perl5/site_perl/5.8.2/HTTP/Proxy.pm line 348 + + as seen in extra space in browser: + 'Scheme is not supported by this proxy', + +My best guess at this point is that + +=head + + diff -Nru HTTP-Recorder-0.01/t/all3.pl HTTP-Recorder-0.01-test/t/all3.pl --- HTTP-Recorder-0.01/t/all3.pl Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/all3.pl Sat Dec 13 16:31:27 2003 @@ -0,0 +1,27 @@ +#!perl + +use Config; + +sub spawn { + my ($prog,@args) = @_; + my $perl = $Config{'perlpath'}; + $perl = $^X if $^O eq 'VMS' or -x $^X and $^X =~ m,^([a-z]:)?/,i; + print "$perl -I../lib $prog @args |"; + open(my $handle , "$perl -I../lib $prog @args |") + or die "Can't exec daemon: $!"; + + return( $handle); +} + +my $dout = spawn('t/daemon-layer', + qw( daemon LocalAddr localhost:1024 Timeout 300 Reuse 1 )); + +print "from daemon-layer: ",<$dout>,"\n"; + + +my $htout = spawn('t/htrec.pl'); +print "from htrec-layer: ",<$htout>,"\n"; + +print "ready to run w3mech tests\n"; + +do "t/find_link.pl"; diff -Nru HTTP-Recorder-0.01/t/all3.t HTTP-Recorder-0.01-test/t/all3.t --- HTTP-Recorder-0.01/t/all3.t Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/all3.t Sat Dec 13 12:28:45 2003 @@ -0,0 +1,27 @@ +#!perl + +use Config; + +sub spawn { + my ($prog,@args) = @_; + my $perl = $Config{'perlpath'}; + $perl = $^X if $^O eq 'VMS' or -x $^X and $^X =~ m,^([a-z]:)?/,i; + print "$perl -I../lib $prog @args |"; + open(my $handle , "$perl -I../lib $prog @args |") + or die "Can't exec daemon: $!"; + + return( $handle); +} + +my $dout = spawn('t/daemon-layer', + qw( daemon LocalAddr localhost:1024 Timeout 300 Reuse 1 )); + +print "from daemon-layer: ",<$dout>,"\n"; + + +my $htout = spawn('t/htrec.pl'); +print "from htrec-layer: ",<$htout>,"\n"; + +print "ready to run w3mech tests\n"; + +do "t/find_link.pl"; diff -Nru HTTP-Recorder-0.01/t/daemon-loop.pl HTTP-Recorder-0.01-test/t/daemon-loop.pl --- HTTP-Recorder-0.01/t/daemon-loop.pl Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/daemon-loop.pl Sat Dec 13 02:46:57 2003 @@ -0,0 +1,68 @@ +#!perl + +# daemon.pl is do $file'd by tests which need an HTTP::Daemon. It +# gets its service routines from the caller script. + +if($^O eq "MacOS") { + print "1..0\n"; + exit(0); +} + +$| = 1; # autoflush +require IO::Socket; # make sure this work before we try to make a HTTP::Daemon + +# First we make ourself a daemon in another process +my $D = shift || ''; +if ($D eq 'daemon') { + + require HTTP::Daemon; + #print "daemon: $0 @ARGV "; # no newline! dont foul up pipe read + + # pass cmdln args, allowing ex: LocalPort 1024 + my $d = new HTTP::Daemon (Timeout => 10, @ARGV); + + print "Please to meet you at: <URL:", $d->url, "> @ARGV\n"; + open(STDOUT, $^O eq 'MSWin32' ? ">nul" : $^O eq 'VMS' ? ">NL:" : ">/dev/null"); + + while ($c = $d->accept) { + $r = $c->get_request; + if ($r) { + my $p = ($r->url->path_segments)[1]; + $p =~ s/\W//g; + my $func = lc("httpd_" . $r->method . "_$p"); + print STDERR "Calling $func...\n"; + if (defined &$func) { + &$func($c, $r); + } + else { + eval { &{"Server::$func"}($c, $r) }; + $c->send_error(404, "dont know Server$func") + if $@; + + if(defined &{"Server::$func"}) { + &{"Server::$func"}($c, $r); + } else { + #$c->send_error(404, "dont know $func"); + } + } + } + $c = undef; # close connection + } + + print STDERR "HTTP Server terminated $! 
[EMAIL PROTECTED]"; # a bit extra info + exit; +} + + +else { + use Config; + my $perl = $Config{'perlpath'}; + $perl = $^X if $^O eq 'VMS' or -x $^X and $^X =~ m,^([a-z]:)?/,i; + #print "$perl -I../lib $0 daemon $D @ARGV |\n"; + open(DAEMON , "$perl -I../lib $0 daemon $D @ARGV |") + or die "Can't exec daemon: $!"; +} + +$greating = <DAEMON>; +#print "daemon: $greating"; + diff -Nru HTTP-Recorder-0.01/t/find_link.pl HTTP-Recorder-0.01-test/t/find_link.pl --- HTTP-Recorder-0.01/t/find_link.pl Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/find_link.pl Sat Dec 13 16:13:15 2003 @@ -0,0 +1,145 @@ +#!/usr/bin/perl + +use warnings; +use strict; +use Test::More tests => 55; +use URI::file; + +BEGIN { + use_ok( 'WWW::Mechanize' ); +} + +$ENV{HTTP_PROXY} = "http_proxy=http://localhost:1025/"; #"localhost:1025"; + +my $t = WWW::Mechanize->new( cookie_jar => undef, + env_proxy => 1 ); +isa_ok( $t, 'WWW::Mechanize' ); +is ($t->env_proxy, 1, "check env_proxy"); +#is ($t->env_proxy(1), 0, "set env_proxy"); +is ($t->env_proxy, 1, "check again env_proxy"); + +my $uri = URI->new( "http://localhost:1025/find_link", 'http' ); +isa_ok($uri, 'URI'); +is ($uri->scheme, 'http', "correct scheme"); +is ($uri->port, '1025', "correct port"); +is ($uri->host, 'localhost', "correct host"); + +#->as_string; +#my $uri = "URL:http://localhost:1025/find_link"; +diag "trying ", $uri; + +$t->get( $uri ); +ok( $t->success, "Fetched $uri" ) or die "Can't get test page $! $@ $t"; + +my $x; +$x = $t->find_link(); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://blargle.com/", "First link on the page" ); +is( $x->url, "http://blargle.com/", "First link on the page" ); + +$x = $t->find_link( text => "CPAN A" ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://a.cpan.org/", "First CPAN link" ); +is( $x->url, "http://a.cpan.org/", "First CPAN link" ); + +$x = $t->find_link( url => "CPAN" ); +ok( !defined $x, "No url matching CPAN" ); + +$x = $t->find_link( text_regex => qr/CPAN/, n=>3 ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://c.cpan.org/", "3rd CPAN text" ); +is( $x->url, "http://c.cpan.org/", "3rd CPAN text" ); + +$x = $t->find_link( text => "CPAN", n=>34 ); +ok( !defined $x, "No 34th CPAN text" ); + +$x = $t->find_link( text_regex => qr/(?i:cpan)/ ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://a.cpan.org/", "Got 1st cpan via regex" ); +is( $x->url, "http://a.cpan.org/", "Got 1st cpan via regex" ); + +$x = $t->find_link( text_regex => qr/cpan/i ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://a.cpan.org/", "Got 1st cpan via regex" ); +is( $x->url, "http://a.cpan.org/", "Got 1st cpan via regex" ); + +$x = $t->find_link( text_regex => qr/cpan/i, n=>153 ); +ok( !defined $x, "No 153rd cpan link" ); + +$x = $t->find_link( url => "http://b.cpan.org/" ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://b.cpan.org/", "Got b.cpan.org" ); +is( $x->url, "http://b.cpan.org/", "Got b.cpan.org" ); + +$x = $t->find_link( url => "http://b.cpan.org", n=>2 ); +ok( !defined $x, "Not a second b.cpan.org" ); + +$x = $t->find_link( url_regex => qr/[b-d]\.cpan\.org/, n=>2 ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://c.cpan.org/", "Got c.cpan.org" ); +is( $x->url, "http://c.cpan.org/", "Got c.cpan.org" ); + +my @wanted_links= ( + [ "http://a.cpan.org/", "CPAN A", undef, "a" ], + [ "http://b.cpan.org/", "CPAN B", undef, "a" ], + [ "http://c.cpan.org/", "CPAN C", "bongo", "a" ], + [ "http://d.cpan.org/", "CPAN D", undef, "a" ], +); +my 
@links = $t->find_all_links( text_regex => qr/CPAN/ ); +is_deeply( [EMAIL PROTECTED], [EMAIL PROTECTED], "Correct links came back" ); + +my $linkref = $t->find_all_links( text_regex => qr/CPAN/ ); +is_deeply( $linkref, [EMAIL PROTECTED], "Correct links came back" ); + +# Check combinations of links +$x = $t->find_link( text => "News" ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://www.msnbc.com/", "First News is MSNBC" ); +is( $x->url, "http://www.msnbc.com/", "First News is MSNBC" ); + +$x = $t->find_link( text => "News", url_regex => qr/bbc/ ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://www.bbc.co.uk/", "First BBC news link" ); +is( $x->url, "http://www.bbc.co.uk/", "First BBC news link" ); +is( $x->[1], "News", "First BBC news text" ); +is( $x->text, "News", "First BBC news text" ); + +$x = $t->find_link( text => "News", url_regex => qr/cnn/ ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://www.cnn.com/", "First CNN news link" ); +is( $x->url, "http://www.cnn.com/", "First CNN news link" ); +is( $x->[1], "News", "First CNN news text" ); +is( $x->text, "News", "First CNN news text" ); + +AREA_CHECKS: { + my @wanted_links = ( + [ "http://www.cnn.com/", "CNN", undef, "a" ], + [ "http://www.cnn.com/", "News", "Fred", "a" ], + [ "http://www.cnn.com/area", undef, undef, "area" ], + ); + my @links = $t->find_all_links( url_regex => qr/cnn\.com/ ); + is_deeply( [EMAIL PROTECTED], [EMAIL PROTECTED], "Correct links came back" ); + + my $linkref = $t->find_all_links( url_regex => qr/cnn\.com/ ); + is_deeply( $linkref, [EMAIL PROTECTED], "Correct links came back" ); +} + +$x = $t->find_link( name => "bongo" ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is_deeply( $x, [ "http://c.cpan.org/", "CPAN C", "bongo", "a" ], 'Got the CPAN C link' ); + +$x = $t->find_link( name_regex => qr/^[A-Z]/, n => 2 ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is_deeply( $x, [ "http://www.cnn.com/", "News", "Fred", "a" ], 'Got 2nd link that begins with a capital' ); + +$x = $t->find_link( tag => 'a', n => 3 ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is_deeply( $x, [ "http://b.cpan.org/", "CPAN B", undef, "a" ], 'Got 3rd <A> tag' ); + +$x = $t->find_link( tag_regex => qr/^(a|frame)$/, n => 7 ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is_deeply( $x, [ "http://d.cpan.org/", "CPAN D", undef, "a" ], 'Got 7th <A> or <FRAME> tag' ); + +$x = $t->find_link( text => "Rebuild Index" ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is_deeply( $x, [ "/cgi-bin/MT/mt.cgi", "Rebuild Index", undef, "a" ], 'Got the JavaScript link' ); diff -Nru HTTP-Recorder-0.01/t/htdaemon.pl HTTP-Recorder-0.01-test/t/htdaemon.pl --- HTTP-Recorder-0.01/t/htdaemon.pl Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htdaemon.pl Sat Dec 13 11:53:13 2003 @@ -0,0 +1,71 @@ +#!perl + +use warnings; +use strict; +use URI::file; + +################### +# set up the Serving daemon using LWP derived setup + +our $greating; [EMAIL PROTECTED](qw( daemon LocalAddr localhost:1024 Timeout 300 Reuse 1 )) unless @ARGV; # daemon +do "t/daemon.pl"; +print "server said: $greating"; + +package Server; +our $AUTOLOAD; +use UNIVERSAL::canAUTOLOAD; +sub httpd_get_find_link; +sub httpd_get_field; + +#---------------------------------------------------------------- +sub httpd_get_find_link #serve all calls +{ + my($c,$r) = @_; + + #(my $meth = $AUTOLOAD) =~ s/.*:://; + #return if $meth eq 'DESTROY'; + #$meth =~ s/httpd_get_//; + my $meth = 'find_link'; + + if (not -f "t/htdocs/$meth.html") { + #print "no $meth\n"; + $c->send_error(404, 
"dont know $meth.html"); + } else { + open(my $foo, "t/htdocs/$meth.html") or die; + $/ = undef; + my $buf = <$foo>; + $c->send_basic_header; + $c->print("Content-Type: text/html"); + $c->send_crlf; + $c->send_crlf; + $c->print($buf); + } +} + +use Data::Dumper::EasyOO; # ('singleton' => my $ezdd); +my $ezdd = Data::Dumper::EasyOO->new(); + +sub AUTOLOAD #serve all calls +{ + my($c,$r) = @_; + + (my $meth = $AUTOLOAD) =~ s/.*:://; + return if $meth eq 'DESTROY'; + $meth =~ s/httpd_get_//; + if (not -f "t/htdocs/$meth.html") { + print "no $meth\n"; + $c->send_error(404, "dont know $meth.html"); + } else { + open(my $foo, "t/htdocs/$meth.html") or die; + $/ = undef; + #my $buf = <$foo>; + $c->send_basic_header; + $c->print("Content-Type: text/html"); + $c->send_crlf; + $c->send_crlf; + $c->print(<$foo>); + #$c->print($ezdd->($r)); + } +} + diff -Nru HTTP-Recorder-0.01/t/htdocs/area_link.html HTTP-Recorder-0.01-test/t/htdocs/area_link.html --- HTTP-Recorder-0.01/t/htdocs/area_link.html Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htdocs/area_link.html Sat Dec 13 02:03:15 2003 @@ -0,0 +1,19 @@ +<html> + <head> + <TITLE>Testing AREA tag handling</TITLE> + </head> + <body> + <MAP NAME="SOME_MAP"> + <AREA HREF="http://www.msnbc.com/area" COORDS="1,2,3,4"></AREA> + <AREA HREF="http://www.cnn.com/area" COORDS="5,6,7,8"> + <AREA HREF="http://www.cpan.org/area" COORDS="10,11,12,13" /> + </MAP> + <MAP NAME="OTHER_MAP"> + <AREA NOHREF COORDS="1,2,3,4"> + <AREA HREF="http://www.slashdot.org"> + </MAP> + <IMG SRC="SOME_IMAGE" USEMAP="#SOME_MAP"> + <IMG SRC="SOME_IMAGE" USEMAP="#OTHER_MAP"> + </body> +</html> + diff -Nru HTTP-Recorder-0.01/t/htdocs/field.html HTTP-Recorder-0.01-test/t/htdocs/field.html --- HTTP-Recorder-0.01/t/htdocs/field.html Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htdocs/field.html Sat Dec 13 02:03:15 2003 @@ -0,0 +1,13 @@ +<HTML> +<HEAD> + Like a hole +</HEAD> +<BODY BGCOLOR="puce"> +<FORM ACTION="/shake-some/"> +<INPUT TYPE="text" NAME="dingo" VALUE="dingo1"> +<INPUT TYPE="text" NAME="bongo" VALUE="bongo!"> +<INPUT TYPE="radio" NAME="wango" VALUE="wango!"> +<INPUT TYPE="text" NAME="dingo" VALUE="dingo2"> +</FORM> +</BODY> +</HTML> diff -Nru HTTP-Recorder-0.01/t/htdocs/find_link.html HTTP-Recorder-0.01-test/t/htdocs/find_link.html --- HTTP-Recorder-0.01/t/htdocs/find_link.html Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htdocs/find_link.html Sat Dec 13 11:25:32 2003 @@ -0,0 +1,36 @@ +<html> + <head> + <TITLE>Testing the links</TITLE> + </head> + <body> + <UL> + <LI> <A HREF="http://blargle.com/">blargle</A> + <LI> <A HREF="http://a.cpan.org/">CPAN A</A> + <LI> <A HREF="http://b.cpan.org/">CPAN B</A> + <FRAME SRC="foo.html"> + <FRAME SRC="bar.html"> + <LI> <A HREF="http://c.cpan.org/" NAME="bongo">CPAN C</A> + <LI> <A HREF="http://d.cpan.org/">CPAN D</A> + + <LI> <A HREF="http://www.msnbc.com/">MSNBC</A> + <FRAME SRC="http://www.oreilly.com/" NAME="wongo"> + <LI> <A HREF="http://www.cnn.com/">CNN</A> + <LI> <A HREF="http://www.bbc.co.uk/" NAME="Wilma">BBC</A> + <LI> <A HREF="http://www.msnbc.com/">News</A> + <LI> <A HREF="http://www.cnn.com/" NAME="Fred">News</A> + <LI> <A HREF="http://www.bbc.co.uk/">News</A> + <LI> <A onmouseover="window.status='Rebuild Files'; return true" href="#" onClick="window.open( '/cgi-bin/MT/mt.cgi', 'rebuild', 'width=400,height=200,resizable=yes')">Rebuild Index</A> + + <MAP NAME="SOME_MAP"> + <AREA HREF="http://www.msnbc.com/area" COORDS="1,2,3,4"></AREA> + <AREA HREF="http://www.cnn.com/area" COORDS="5,6,7,8" 
NAME="Marty"> + <AREA HREF="http://www.cpan.org/area" COORDS="10,11,12,13" /> + </MAP> + <IMG SRC="SOME_IMAGE" USEMAP="#SOME_MAP"> + + <!-- new stuff --> + <LI> <A HREF="http://nowhere.org/" Name="Here">NoWhere</A> + <LI> <A HREF="http://nowhere.org/padded" Name=" Here "> NoWhere </A> + </body> +</html> + diff -Nru HTTP-Recorder-0.01/t/htdocs/find_link.html.orig HTTP-Recorder-0.01-test/t/htdocs/find_link.html.orig --- HTTP-Recorder-0.01/t/htdocs/find_link.html.orig Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htdocs/find_link.html.orig Sat Dec 13 11:24:16 2003 @@ -0,0 +1,35 @@ +<html> + <head> + <TITLE>Testing the links</TITLE> + </head> + <body> + <A HREF="http://blargle.com/">blargle</A> + <A HREF="http://a.cpan.org/">CPAN A</A> + <A HREF="http://b.cpan.org/">CPAN B</A> + <FRAME SRC="foo.html"> + <FRAME SRC="bar.html"> + <A HREF="http://c.cpan.org/" NAME="bongo">CPAN C</A> + <A HREF="http://d.cpan.org/">CPAN D</A> + + <A HREF="http://www.msnbc.com/">MSNBC</A> + <FRAME SRC="http://www.oreilly.com/" NAME="wongo"> + <A HREF="http://www.cnn.com/">CNN</A> + <A HREF="http://www.bbc.co.uk/" NAME="Wilma">BBC</A> + <A HREF="http://www.msnbc.com/">News</A> + <A HREF="http://www.cnn.com/" NAME="Fred">News</A> + <A HREF="http://www.bbc.co.uk/">News</A> + <A onmouseover="window.status='Rebuild Files'; return true" href="#" onClick="window.open( '/cgi-bin/MT/mt.cgi', 'rebuild', 'width=400,height=200,resizable=yes')">Rebuild Index</A> + + <MAP NAME="SOME_MAP"> + <AREA HREF="http://www.msnbc.com/area" COORDS="1,2,3,4"></AREA> + <AREA HREF="http://www.cnn.com/area" COORDS="5,6,7,8" NAME="Marty"> + <AREA HREF="http://www.cpan.org/area" COORDS="10,11,12,13" /> + </MAP> + <IMG SRC="SOME_IMAGE" USEMAP="#SOME_MAP"> + + <!-- new stuff --> + <A HREF="http://nowhere.org/" Name="Here">NoWhere</A> + <A HREF="http://nowhere.org/padded" Name=" Here "> NoWhere </A> + </body> +</html> + diff -Nru HTTP-Recorder-0.01/t/htdocs/frames.html HTTP-Recorder-0.01-test/t/htdocs/frames.html --- HTTP-Recorder-0.01/t/htdocs/frames.html Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htdocs/frames.html Sat Dec 13 02:03:15 2003 @@ -0,0 +1,13 @@ +<html> + <head> + <title></title> + </head> + + <frameset rows="*,*" frameborder="1" framespacing="0" border="1"> + <frame name="top" src="find_link.html" marginwidth="8" +marginheight="8" scrolling="auto" frameborder="no"> + <frame name="bottom" src="google.html" marginwidth="0" +marginheight="0" scrolling="no" frameborder="no" noresize> + </frameset> + +</html> diff -Nru HTTP-Recorder-0.01/t/htdocs/google.html HTTP-Recorder-0.01-test/t/htdocs/google.html --- HTTP-Recorder-0.01/t/htdocs/google.html Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htdocs/google.html Sat Dec 13 02:03:15 2003 @@ -0,0 +1,14 @@ +<html><head><meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"><title>Google</title><style><!-- +body,td,a,p,.h{font-family:arial,sans-serif;} +.h{font-size: 20px;} +.q{text-decoration:none; color:#0000cc;} +//--> +</style> +<script> +<!-- +function sf(){document.f.q.focus();} +// --> +</script> +</head><body bgcolor=#ffffff text=#000000 link=#0000cc vlink=#551a8b alink=#ff0000 onLoad=sf()><center><table border=0 cellspacing=0 cellpadding=0><tr><td><img src="/images/logo.gif" width=276 height=110 alt="Google"></td></tr></table><br> +<table border=0 cellspacing=0 cellpadding=0><tr><td width=15> </td><td id=0 bgcolor=#3366cc align=center width=95 nowrap><font color=#ffffff size=-1><b>Web</b></font></td><td width=15> </td><td id=1 bgcolor=#efefef 
align=center width=95 nowrap onClick="" style=cursor:pointer;cursor:hand;><a id=1a class=q href="/imghp?hl=en&tab=wi&ie=UTF-8"><font size=-1>Images</font></a></td><td width=15> </td><td id=2 bgcolor=#efefef align=center width=95 nowrap onClick="" style=cursor:pointer;cursor:hand;><a id=2a class=q href="/grphp?hl=en&tab=wg&ie=UTF-8"><font size=-1>Groups</font></a></td><td width=15> </td><td id=3 bgcolor=#efefef align=center width=95 nowrap onClick="" style=cursor:pointer;cursor:hand;><a id=3a class=q href="/dirhp?hl=en&tab=wd&ie=UTF-8"><font size=-1>Directory</font></a></td><td width=15> </td><td id=4 bgcolor=#efefef align=center width=95 nowrap onClick="" style=cursor:pointer;cursor:hand;><a id=4a class=q href="/nwshp?hl=en&tab=wn&ie=UTF-8"><font size=-1>News</font></a></td><td width=15> </td></tr><tr><td colspan=12 bgcolor=#3366cc><img width=1 height=1 alt=""></td></tr></table><br><form action="/target-page" name="bob-the-form"><table cellspacing=0 cellpadding=0><tr><td width=75> </td><td align=center><input type=hidden name=hl value=en><span id=hf></span><input type=hidden name=ie value="ISO-8859-1"><input maxLength=256 size=55 name=q value=""><br><input type=submit value="Google Search" name=btnG><input type=submit value="I'm Feeling Lucky" name=btnI></td><td valign=top nowrap><font size=-2> • <a href=/advanced_search?hl=en>Advanced Search</a><br> • <a href=/preferences?hl=en>Preferences</a><br> • <a href=/language_tools?hl=en>Language Tools</a></font></td></tr></table></form><br><p><font size=-1>Want more from Google? Try these <a href="/tour/services/query.html">expert search tips</a></font><p> +<br><font size=-1><a href="/ads/">Advertise with Us</a> - <a href="/services/">Business Solutions</a> - <a href="/options/">Services & Tools</a> - <a href=/about.html>Jobs, Press, & Help</a></font><p><font size=-2>©2003 Google - Searching 3,083,324,652 web pages</font></p></center></body></html> diff -Nru HTTP-Recorder-0.01/t/htdocs/tick.html HTTP-Recorder-0.01-test/t/htdocs/tick.html --- HTTP-Recorder-0.01/t/htdocs/tick.html Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htdocs/tick.html Sat Dec 13 02:03:15 2003 @@ -0,0 +1,14 @@ +<html> +<body> +<form action="http://localhost/" method="POST"> + +<input type="checkbox" name="foo" value="hello" /> Hello<br /> +<input type="checkbox" name="foo" value="bye" /> Bye<br /> +<input type="checkbox" name="foo" value="arse" /> Arse<br /> +<input type="checkbox" name="foo" value="wibble" /> Wibble<br /> +<input type="checkbox" name="foo" value="foo" /> Foo<br /> + +<input type="Submit" name="submit" value="Submit" label="Sumbit" /> +</form> +</body> + diff -Nru HTTP-Recorder-0.01/t/htrec.pl HTTP-Recorder-0.01-test/t/htrec.pl --- HTTP-Recorder-0.01/t/htrec.pl Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htrec.pl Sat Dec 13 16:04:58 2003 @@ -0,0 +1,17 @@ +#!/usr/local/bin/perl -w + +#use lib '/home/jimc/perl.cpan/htrec/HTTP-Recorder-0.01-mod/lib'; +use HTTP::Proxy qw(:log); +use HTTP::Recorder; +$ENV{HTTP_PROXY} = "localhost:1024"; +my $proxy = HTTP::Proxy->new(port => 1025, maxchild => 0, + logmask => ALL ); + +# set HTTP::Recorder as the agent +my $agent = HTTP::Recorder->new( file => shift || "recording"); #, showwindow => 1); #/tmp/tmpfile" ); + +$proxy->agent( $agent ); + +$proxy->start(); + + diff -Nru HTTP-Recorder-0.01/t/mechtest.t HTTP-Recorder-0.01-test/t/mechtest.t --- HTTP-Recorder-0.01/t/mechtest.t Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/mechtest.t Sat Dec 13 01:31:25 2003 @@ -0,0 +1,145 @@ +#!perl + +use 
warnings; +use strict; +#use Test::More tests => 6; +use URI::file; + +################### +# set up the Serving daemon using LWP derived setup + +our $greating; [EMAIL PROTECTED](qw( LocalAddr localhost:1024 Timeout 300 )) unless @ARGV; # daemon +do "t/daemon.pl"; +#print "server said: $greating"; +#$greating = <DAEMON>; +$greating =~ /(<[^>]+>)/; + +require URI; +my $base = URI->new($1); +sub url { + my $u = URI->new(@_); + $u = $u->abs($_[1]) if @_ > 1; + $u->as_string; +} + +print "Will access HTTP server at $base\n"; + + +################### +# The Module under test, and support +use HTTP::Recorder; +use HTTP::Proxy; + +my $proxy = HTTP::Proxy->new; + +# set HTTP::Recorder as the agent +my $agent = HTTP::Recorder->new( file => "/tmp/tmpfile" ); +$proxy->agent( $agent ); + +#$proxy->start(); + +################### +# the Test Driver side + + +#BEGIN { use_ok( 'WWW::Mechanize' );} +use WWW::Mechanize; + +my $t = WWW::Mechanize->new( cookie_jar => undef ); +#isa_ok( $t, 'WWW::Mechanize' ); + +# my $uri = URI::file->new_abs( "t/find_link.html" )->as_string; + + +print "1..7\n"; + +################### +# the borrowed test TBadapted + +my ($ua, $req, $res); +require LWP::RobotUA; +require HTTP::Request; +$ua = new LWP::RobotUA 'lwp-spider/0.1', '[EMAIL PROTECTED]'; +$ua->delay(0.05); # rather quick robot + +#---------------------------------------------------------------- +sub httpd_get_robotstxt +{ + my($c,$r) = @_; + $c->send_basic_header; + $c->print("Content-Type: text/plain"); + $c->send_crlf; + $c->send_crlf; + $c->print("User-Agent: * +Disallow: /private + +"); +} + +sub httpd_get_someplace +{ + my($c,$r) = @_; + $c->send_basic_header; + $c->print("Content-Type: text/plain"); + $c->send_crlf; + $c->send_crlf; + $c->print("Okidok\n"); +} + +$req = new HTTP::Request GET => url("/someplace", $base); +$res = $ua->request($req); +#print $res->as_string; +print "not " unless $res->is_success; +print "ok 1\n"; + +$req = new HTTP::Request GET => url("/private/place", $base); +$res = $ua->request($req); +#print $res->as_string; +print "not " unless $res->code == 403 + and $res->message =~ /robots.txt/; +print "ok 2\n"; + +$req = new HTTP::Request GET => url("/foo", $base); +$res = $ua->request($req); +#print $res->as_string; +print "not " unless $res->code == 404; # not found +print "ok 3\n"; + +# Let the robotua generate "Service unavailable/Retry After response"; +$ua->delay(1); +$ua->use_sleep(0); +$req = new HTTP::Request GET => url("/foo", $base); +$res = $ua->request($req); +#print $res->as_string; +print "not " unless $res->code == 503 # Unavailable + and $res->header("Retry-After"); +print "ok 4\n"; + +#---------------------------------------------------------------- +print "Terminating server...\n"; +sub httpd_get_quit +{ + my($c) = @_; + $c->send_error(503, "Bye, bye"); + exit; # terminate HTTP server +} + +$ua->delay(0); +$req = new HTTP::Request GET => url("/quit", $base); +$res = $ua->request($req); + +print "not " unless $res->code == 503 and $res->content =~ /Bye, bye/; +print "ok 5\n"; + +#--------------------------------------------------------------- +$ua->delay(1); + +# host_wait() should be around 60s now +print "not " unless abs($ua->host_wait($base->host_port) - 60) < 5; +print "ok 6\n"; + +# Number of visits to this place should be +print "not " unless $ua->no_visits($base->host_port) == 4; +print "ok 7\n"; + diff -Nru HTTP-Recorder-0.01/t/ua.t HTTP-Recorder-0.01-test/t/ua.t --- HTTP-Recorder-0.01/t/ua.t Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/ua.t Fri 
Dec 12 13:45:30 2003 @@ -0,0 +1,103 @@ +#!perl + +our $greating; +do "robot/daemon.pl"; +$greating =~ /(<[^>]+>)/; + +print "1..7\n"; + +require URI; +my $base = URI->new($1); +sub url { + my $u = URI->new(@_); + $u = $u->abs($_[1]) if @_ > 1; + $u->as_string; +} + +print "Will access HTTP server at $base\n"; + +require LWP::RobotUA; +require HTTP::Request; +$ua = new LWP::RobotUA 'lwp-spider/0.1', '[EMAIL PROTECTED]'; +$ua->delay(0.05); # rather quick robot + +#---------------------------------------------------------------- +sub httpd_get_robotstxt +{ + my($c,$r) = @_; + $c->send_basic_header; + $c->print("Content-Type: text/plain"); + $c->send_crlf; + $c->send_crlf; + $c->print("User-Agent: * +Disallow: /private + +"); +} + +sub httpd_get_someplace +{ + my($c,$r) = @_; + $c->send_basic_header; + $c->print("Content-Type: text/plain"); + $c->send_crlf; + $c->send_crlf; + $c->print("Okidok\n"); +} + +$req = new HTTP::Request GET => url("/someplace", $base); +$res = $ua->request($req); +#print $res->as_string; +print "not " unless $res->is_success; +print "ok 1\n"; + +$req = new HTTP::Request GET => url("/private/place", $base); +$res = $ua->request($req); +#print $res->as_string; +print "not " unless $res->code == 403 + and $res->message =~ /robots.txt/; +print "ok 2\n"; + +$req = new HTTP::Request GET => url("/foo", $base); +$res = $ua->request($req); +#print $res->as_string; +print "not " unless $res->code == 404; # not found +print "ok 3\n"; + +# Let the robotua generate "Service unavailable/Retry After response"; +$ua->delay(1); +$ua->use_sleep(0); +$req = new HTTP::Request GET => url("/foo", $base); +$res = $ua->request($req); +#print $res->as_string; +print "not " unless $res->code == 503 # Unavailable + and $res->header("Retry-After"); +print "ok 4\n"; + +#---------------------------------------------------------------- +print "Terminating server...\n"; +sub httpd_get_quit +{ + my($c) = @_; + $c->send_error(503, "Bye, bye"); + exit; # terminate HTTP server +} + +$ua->delay(0); +$req = new HTTP::Request GET => url("/quit", $base); +$res = $ua->request($req); + +print "not " unless $res->code == 503 and $res->content =~ /Bye, bye/; +print "ok 5\n"; + +#--------------------------------------------------------------- +$ua->delay(1); + +# host_wait() should be around 60s now +print "not " unless abs($ua->host_wait($base->host_port) - 60) < 5; +print "ok 6\n"; + +# Number of visits to this place should be +print "not " unless $ua->no_visits($base->host_port) == 4; +print "ok 7\n"; +