I'm forwarding this here to pose a few questions:
Is there a right way to set up a 3-process test within the Test::* framework?
I borrowed the approach used in LWP's t/robot/ua.t, but added a 2nd fork/spawn
to get the 3 layers.
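For concreteness, the shape I'm after is roughly this (a condensed sketch of the all3.t in the patch below; layer script names per the patch):

    #!perl
    # condensed sketch of t/all3.t: the master spawns the 2 lower
    # layers on pipes, then runs the mech-layer tests itself
    use strict;
    use warnings;
    use Config;

    sub spawn {
        my ( $prog, @args ) = @_;
        my $perl = $Config{'perlpath'};
        open( my $handle, "$perl -Ilib $prog @args |" )
            or die "Can't spawn $prog: $!";
        return $handle;    # child greets us on this pipe when ready
    }

    # 3rd layer: content daemon, hardwired to localhost:1024
    my $dout = spawn( 't/htdaemon.pl',
        qw( daemon LocalAddr localhost:1024 Timeout 300 Reuse 1 ) );
    print "from daemon layer: ", scalar <$dout>;

    # 2nd layer: the HTTP::Recorder proxy, on localhost:1025;
    # htrec.pl prints no greeting, so don't block on its pipe
    my $htout = spawn('t/htrec.pl');
    sleep 1;    # crude: let the proxy bind its port

    # 1st layer: the WWW::Mechanize tests, in this process
    do 't/find_link.pl';
    die "find_link.pl failed: $@" if $@;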
Is there a good way to borrow tests from another distribution? I'm using several pieces of the W3-Mech tests.
tia, jimc
-------- Original Message --------
Subject: a 3 process test for HTTP::Recorder
Date: Sat, 13 Dec 2003 22:43:10 -0700
From: Jim Cromie <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Hi Linda, HT-Proxy folks.
I've built a test setup for HTTP::Recorder, a young module with high potential usefulness. For easy experimentation, I've supplied it as a patch against HTTP-Recorder-0.01. I'm sending it here because it's mentioned on htproxy's home page.
The test setup is still quite rough, but I've tweaked it for a while now and would appreciate a fresh set of eyes. Please send any suggestions for improvements, or actual fixes, here; I trust BooK won't mind, and I imagine that Linda will watch this for progress.
More detail is in the t/README, but here are the highlights.
3 process layers:

    client layer:   uses WWW::Mechanize to conduct tests.
    htrec layer:    runs an HTTP::Recorder/Proxy in a separate process.
    htdaemon layer: serves up content to support the tests.
Tests basically live in the mech layer, but must be written to exercise the HTREC/HTPROXY layer, probably by checking that the expected HTML transforms are done in the htrec layer. Further complicating things, the daemon must supply the expected pages, etc., a tedious coordination job unless good decisions are made.
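For example, one hypothetical shape for such a transform check (not in the patch; ports per the setup below, and it assumes the recorder actually rewrites the page):

    #!perl
    # hypothetical: detect the recorder's rewriting by diffing a
    # direct fetch against a proxied fetch of the same daemon page
    use strict;
    use warnings;
    use Test::More tests => 2;
    use LWP::UserAgent;

    my $direct = LWP::UserAgent->new;                   # straight to daemon
    my $via    = LWP::UserAgent->new;
    $via->proxy( http => 'http://localhost:1025/' );    # through htrec

    my $url = 'http://localhost:1024/find_link';
    my $raw = $direct->get($url)->content;
    my $rec = $via->get($url)->content;

    ok( length $raw, 'daemon served the page directly' );
    isnt( $rec, $raw, 'htrec layer transformed the HTML' );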
    htdaemon slings pages foreach t/htdocs/*.html
    htrec/htproxy env-proxies to the daemon.
    mech layer talks to localhost:1025 (not 'proxied')
    mech-layer scripts can retest thru the htrec layer.  !instant tests!
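In code terms the wiring condenses to this (from t/htrec.pl below; the mech side is sketched in comments since the proxy's start() blocks):

    #!perl
    # condensed from t/htrec.pl: proxy on :1025, env-proxied to :1024
    use strict;
    use warnings;
    use HTTP::Proxy qw(:log);
    use HTTP::Recorder;

    $ENV{HTTP_PROXY} = "localhost:1024";    # upstream: the content daemon

    my $proxy = HTTP::Proxy->new( port => 1025, maxchild => 0, logmask => ALL );
    $proxy->agent( HTTP::Recorder->new( file => "recording" ) );
    $proxy->start;                          # blocks; run in its own shell

    # mech layer, in yet another process:
    #   my $mech = WWW::Mechanize->new( cookie_jar => undef );
    #   $mech->get("http://localhost:1025/find_link");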
The setup borrows from W3Mech and LWP; I copied W3Mech's t/*.html to t/htdocs/, and t/find_link.t will attempt to repeat the w3mech tests here, thru the htrec layer, and will eventually be followed by other t/*.t pilferings ;-). The daemon layer is borrowed and adapted (with added cruft!) from LWP's t/robot/ua.t.
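The daemon end boils down to an AUTOLOAD that maps a GET /foo onto t/htdocs/foo.html; condensed from t/htdaemon.pl in the patch that follows:

    #!perl
    # condensed from t/htdaemon.pl: AUTOLOAD maps GET /foo -> t/htdocs/foo.html
    package Server;
    use strict;
    use warnings;
    our $AUTOLOAD;

    sub AUTOLOAD {
        my ( $c, $r ) = @_;    # HTTP::Daemon::ClientConn, HTTP::Request
        ( my $meth = $AUTOLOAD ) =~ s/.*:://;
        return if $meth eq 'DESTROY';
        $meth =~ s/httpd_get_//;

        my $file = "t/htdocs/$meth.html";
        return $c->send_error( 404, "dont know $meth.html" )
            unless -f $file;

        open( my $fh, '<', $file ) or die "open $file: $!";
        local $/;              # slurp mode
        $c->send_basic_header;
        $c->print("Content-Type: text/html");
        $c->send_crlf;
        $c->send_crlf;
        $c->print(<$fh>);
    }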
diff -Nru HTTP-Recorder-0.01/MANIFEST HTTP-Recorder-0.01-test/MANIFEST --- HTTP-Recorder-0.01/MANIFEST Sun Jul 27 21:33:31 2003 +++ HTTP-Recorder-0.01-test/MANIFEST Sat Dec 13 16:34:14 2003 @@ -4,5 +4,19 @@ Makefile.PL t/load.t t/pod.t +t/README explains new stuff +t/all3.pl 3 process test, to be renamed to all3.t +t/find_link.pl 1. w3mech driver +t/htrec.pl 2. htproxy/htrec layer UNDER TEST +t/htdaemon.pl 3a. daemon test support services +t/daemon-loop.pl 3b. daemon boilerplate +t/mechtest.t borrowed, will delete +# +t/htdocs/area_link.html borrowed, served by daemon-layer +t/htdocs/field.html +t/htdocs/find_link.html +t/htdocs/frames.html +t/htdocs/google.html +t/htdocs/tick.html MANIFEST README diff -Nru HTTP-Recorder-0.01/recording HTTP-Recorder-0.01-test/recording --- HTTP-Recorder-0.01/recording Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/recording Sat Dec 13 15:32:30 2003 @@ -0,0 +1 @@ +$agent->get("http:/find_link"); diff -Nru HTTP-Recorder-0.01/t/README HTTP-Recorder-0.01-test/t/README --- HTTP-Recorder-0.01/t/README Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/README Sat Dec 13 17:50:07 2003 @@ -0,0 +1,136 @@ + +=head1 new test strategy: 3 layer harness, 3 process stack. (A Dagwood) + + client layer: uses WWW::Mechanize to conduct tests. + htrec layer: runs an HTTP::Recorder/Proxy in separate process + daemon layer: serves up content to support tests. + +Tests are basically in mech layer, but must be written to test the +HTREC/HTPROXY layer; probably by testing that the expected HTML +transforms are done in the htrec layer. Further complicating things; +daemon must supply the expected pages, etc.. a tedious coordinating +job unless good decisions are made. + + Daemon slings pages foreach t/htdocs/*.html + htrec/htproxy env-proxys to daemon. + mech-layer talks to localhost:1025 (not 'proxied') + mech-layer scripts can retest thru htrec-layer. + !instant tests! + +The setup borrows from W3Mech and LWP; I copied W3Mech /t/*.html to +t/htdocs/, and t/find_link.t will attempt to repeat w3mech tests here, +thru the htrec layer, and will eventually be followed by other t/*.t +pilferings ;-). The daemon layer is borrowed and adapted (with added +cruft!) from LWP t/robot/ua.t + +=head1 client layer + +t/all3.t attempts to integrate all 3 processes in 1 file, and +successfully spawns the daemon-layer, but has some probs spawning the +htrec layer. Could also be called master layer, esp wrt spawning. + +t/find_link.pl actually runs the 1st tests, and is invoked by all3.t +via do "file". + +On its 1st get(), the script fails, as browser shows. + + Scheme is not supported by this proxy. + +The extra space there is a clue, but I havent isolated it. Im pretty +certain however that theres some URI usage problem that eventually +causes loss of scheme. + + LWP::UserAgent::is_protocol_supported('HTTP::Recorder=HASH(0x88bbcd0)','undef') + called at /usr/local/lib/perl5/site_perl/5.8.2/HTTP/Proxy.pm line 508 + ie: + if ( !$self->agent->is_protocol_supported( my $s = $req->uri->scheme ) ) + +=head1 htrec layer + +htrec.pl is run as a separate process, its hardwired to +localhost:1025, and is env_proxyd over to localhost:1024 + +the 1st W3Mech::get() causes 501s from htrec-layer, as seen in log; + + [Sat Dec 13 14:48:17 2003] (13549) Request: GET /find_link + Use of uninitialized value in pattern match (m//) + at /usr/local/lib/perl5/site_perl/5.8.2/LWP/UserAgent.pm line 491. + Use of uninitialized value in concatenation (.) 
or string + at /usr/local/lib/perl5/site_perl/5.8.2/HTTP/Proxy.pm line 511. + Use of uninitialized value in pattern match (m//) + at /usr/local/lib/perl5/site_perl/5.8.2/HTTP/Proxy.pm line 624. + + [Sat Dec 13 13:29:55 2003] (13549) Response: 501 Not Implemented + [Sat Dec 13 13:29:55 2003] (13549) Response: Content-Length: 39 + +=head1 A ROOT CAUSE ?? + +A recording made by htrec-layer suggests that the url is broken by the +time this is commited to the log. + ++$agent->get("http:/find_link"); + + + +=head1 daemon layer + +consists of 2 scripts + +=head2 daemon-loop.pl + +is a boiler-plate HTTP::Daemon based on LWP t/robot/ua.t. +It still uses the open-pipe approach to demonizing itself, and still +communicates its url back to the master-process that way. The script +now accepts arguments, which are used in this setup to hardwire it to +localhost:1024. + +Its arguable that I should have modernized it a bit more, and +jettisoned the do "file" mechanics, but I wanted the comfort of a +legacy process setup. Its already crufty; it cannot use Test::More, +cuz that prints 1..NumTests, which fouls the pipe-read and url +extraction. I could read-pipe till URL found, but down that way lays +madness. + +=head2 htdaemon.pl + +This invokes above via do "daemon-loop.pl", and provides +Server::AUTOLOAD to handle requests received by the daemon loop. +AUTOLOAD matches requests against files in t/htdocs/*.html, and will +have some sort of non-match default soon. + + +=head1 STATUS + +Ive used it most successfully by running each layer in a separate +shell, tho backgrounding them works too + + perl t/daemon-layer + perl t/htrec.pl + perl t/find_link.pl + +after starting 1,2, you can test each by browsing to +/localhost:102[45]/find_link, or other htdoc/(\w+).html + +=head1 CURRENT PROBLEMS + +I try to split up for 3 audiences + +=head2 HTTP::Recorder + +501s, as mentioned above. A trace shows a suspicious 'undef' + + LWP::UserAgent::is_protocol_supported('HTTP::Recorder=HASH(0x88bbcd0)','undef') + called at /usr/local/lib/perl5/site_perl/5.8.2/HTTP/Proxy.pm line 508 + + HTTP::Proxy::serve_connections('HTTP::Proxy=HASH(0x88bbc70)', + 'HTTP::Daemon::ClientConn=GLOB(0x88c6cc8)', .... 
+ called at /usr/local/lib/perl5/site_perl/5.8.2/HTTP/Proxy.pm line 348 + + as seen in extra space in browser: + 'Scheme is not supported by this proxy', + +My best guess at this point is that + +=head + + diff -Nru HTTP-Recorder-0.01/t/all3.pl HTTP-Recorder-0.01-test/t/all3.pl --- HTTP-Recorder-0.01/t/all3.pl Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/all3.pl Sat Dec 13 16:31:27 2003 @@ -0,0 +1,27 @@ +#!perl + +use Config; + +sub spawn { + my ($prog,@args) = @_; + my $perl = $Config{'perlpath'}; + $perl = $^X if $^O eq 'VMS' or -x $^X and $^X =~ m,^([a-z]:)?/,i; + print "$perl -I../lib $prog @args |"; + open(my $handle , "$perl -I../lib $prog @args |") + or die "Can't exec daemon: $!"; + + return( $handle); +} + +my $dout = spawn('t/daemon-layer', + qw( daemon LocalAddr localhost:1024 Timeout 300 Reuse 1 )); + +print "from daemon-layer: ",<$dout>,"\n"; + + +my $htout = spawn('t/htrec.pl'); +print "from htrec-layer: ",<$htout>,"\n"; + +print "ready to run w3mech tests\n"; + +do "t/find_link.pl"; diff -Nru HTTP-Recorder-0.01/t/all3.t HTTP-Recorder-0.01-test/t/all3.t --- HTTP-Recorder-0.01/t/all3.t Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/all3.t Sat Dec 13 12:28:45 2003 @@ -0,0 +1,27 @@ +#!perl + +use Config; + +sub spawn { + my ($prog,@args) = @_; + my $perl = $Config{'perlpath'}; + $perl = $^X if $^O eq 'VMS' or -x $^X and $^X =~ m,^([a-z]:)?/,i; + print "$perl -I../lib $prog @args |"; + open(my $handle , "$perl -I../lib $prog @args |") + or die "Can't exec daemon: $!"; + + return( $handle); +} + +my $dout = spawn('t/daemon-layer', + qw( daemon LocalAddr localhost:1024 Timeout 300 Reuse 1 )); + +print "from daemon-layer: ",<$dout>,"\n"; + + +my $htout = spawn('t/htrec.pl'); +print "from htrec-layer: ",<$htout>,"\n"; + +print "ready to run w3mech tests\n"; + +do "t/find_link.pl"; diff -Nru HTTP-Recorder-0.01/t/daemon-loop.pl HTTP-Recorder-0.01-test/t/daemon-loop.pl --- HTTP-Recorder-0.01/t/daemon-loop.pl Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/daemon-loop.pl Sat Dec 13 02:46:57 2003 @@ -0,0 +1,68 @@ +#!perl + +# daemon.pl is do $file'd by tests which need an HTTP::Daemon. It +# gets its service routines from the caller script. + +if($^O eq "MacOS") { + print "1..0\n"; + exit(0); +} + +$| = 1; # autoflush +require IO::Socket; # make sure this work before we try to make a HTTP::Daemon + +# First we make ourself a daemon in another process +my $D = shift || ''; +if ($D eq 'daemon') { + + require HTTP::Daemon; + #print "daemon: $0 @ARGV "; # no newline! dont foul up pipe read + + # pass cmdln args, allowing ex: LocalPort 1024 + my $d = new HTTP::Daemon (Timeout => 10, @ARGV); + + print "Please to meet you at: <URL:", $d->url, "> @ARGV\n"; + open(STDOUT, $^O eq 'MSWin32' ? ">nul" : $^O eq 'VMS' ? ">NL:" : ">/dev/null"); + + while ($c = $d->accept) { + $r = $c->get_request; + if ($r) { + my $p = ($r->url->path_segments)[1]; + $p =~ s/\W//g; + my $func = lc("httpd_" . $r->method . "_$p"); + print STDERR "Calling $func...\n"; + if (defined &$func) { + &$func($c, $r); + } + else { + eval { &{"Server::$func"}($c, $r) }; + $c->send_error(404, "dont know Server$func") + if $@; + + if(defined &{"Server::$func"}) { + &{"Server::$func"}($c, $r); + } else { + #$c->send_error(404, "dont know $func"); + } + } + } + $c = undef; # close connection + } + + print STDERR "HTTP Server terminated $! 
[EMAIL PROTECTED]"; # a bit extra info + exit; +} + + +else { + use Config; + my $perl = $Config{'perlpath'}; + $perl = $^X if $^O eq 'VMS' or -x $^X and $^X =~ m,^([a-z]:)?/,i; + #print "$perl -I../lib $0 daemon $D @ARGV |\n"; + open(DAEMON , "$perl -I../lib $0 daemon $D @ARGV |") + or die "Can't exec daemon: $!"; +} + +$greating = <DAEMON>; +#print "daemon: $greating"; + diff -Nru HTTP-Recorder-0.01/t/find_link.pl HTTP-Recorder-0.01-test/t/find_link.pl --- HTTP-Recorder-0.01/t/find_link.pl Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/find_link.pl Sat Dec 13 16:13:15 2003 @@ -0,0 +1,145 @@ +#!/usr/bin/perl + +use warnings; +use strict; +use Test::More tests => 55; +use URI::file; + +BEGIN { + use_ok( 'WWW::Mechanize' ); +} + +$ENV{HTTP_PROXY} = "http_proxy=http://localhost:1025/"; #"localhost:1025"; + +my $t = WWW::Mechanize->new( cookie_jar => undef, + env_proxy => 1 ); +isa_ok( $t, 'WWW::Mechanize' ); +is ($t->env_proxy, 1, "check env_proxy"); +#is ($t->env_proxy(1), 0, "set env_proxy"); +is ($t->env_proxy, 1, "check again env_proxy"); + +my $uri = URI->new( "http://localhost:1025/find_link", 'http' ); +isa_ok($uri, 'URI'); +is ($uri->scheme, 'http', "correct scheme"); +is ($uri->port, '1025', "correct port"); +is ($uri->host, 'localhost', "correct host"); + +#->as_string; +#my $uri = "URL:http://localhost:1025/find_link"; +diag "trying ", $uri; + +$t->get( $uri ); +ok( $t->success, "Fetched $uri" ) or die "Can't get test page $! $@ $t"; + +my $x; +$x = $t->find_link(); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://blargle.com/", "First link on the page" ); +is( $x->url, "http://blargle.com/", "First link on the page" ); + +$x = $t->find_link( text => "CPAN A" ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://a.cpan.org/", "First CPAN link" ); +is( $x->url, "http://a.cpan.org/", "First CPAN link" ); + +$x = $t->find_link( url => "CPAN" ); +ok( !defined $x, "No url matching CPAN" ); + +$x = $t->find_link( text_regex => qr/CPAN/, n=>3 ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://c.cpan.org/", "3rd CPAN text" ); +is( $x->url, "http://c.cpan.org/", "3rd CPAN text" ); + +$x = $t->find_link( text => "CPAN", n=>34 ); +ok( !defined $x, "No 34th CPAN text" ); + +$x = $t->find_link( text_regex => qr/(?i:cpan)/ ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://a.cpan.org/", "Got 1st cpan via regex" ); +is( $x->url, "http://a.cpan.org/", "Got 1st cpan via regex" ); + +$x = $t->find_link( text_regex => qr/cpan/i ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://a.cpan.org/", "Got 1st cpan via regex" ); +is( $x->url, "http://a.cpan.org/", "Got 1st cpan via regex" ); + +$x = $t->find_link( text_regex => qr/cpan/i, n=>153 ); +ok( !defined $x, "No 153rd cpan link" ); + +$x = $t->find_link( url => "http://b.cpan.org/" ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://b.cpan.org/", "Got b.cpan.org" ); +is( $x->url, "http://b.cpan.org/", "Got b.cpan.org" ); + +$x = $t->find_link( url => "http://b.cpan.org", n=>2 ); +ok( !defined $x, "Not a second b.cpan.org" ); + +$x = $t->find_link( url_regex => qr/[b-d]\.cpan\.org/, n=>2 ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://c.cpan.org/", "Got c.cpan.org" ); +is( $x->url, "http://c.cpan.org/", "Got c.cpan.org" ); + +my @wanted_links= ( + [ "http://a.cpan.org/", "CPAN A", undef, "a" ], + [ "http://b.cpan.org/", "CPAN B", undef, "a" ], + [ "http://c.cpan.org/", "CPAN C", "bongo", "a" ], + [ "http://d.cpan.org/", "CPAN D", undef, "a" ], +); +my 
@links = $t->find_all_links( text_regex => qr/CPAN/ ); +is_deeply( [EMAIL PROTECTED], [EMAIL PROTECTED], "Correct links came back" ); + +my $linkref = $t->find_all_links( text_regex => qr/CPAN/ ); +is_deeply( $linkref, [EMAIL PROTECTED], "Correct links came back" ); + +# Check combinations of links +$x = $t->find_link( text => "News" ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://www.msnbc.com/", "First News is MSNBC" ); +is( $x->url, "http://www.msnbc.com/", "First News is MSNBC" ); + +$x = $t->find_link( text => "News", url_regex => qr/bbc/ ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://www.bbc.co.uk/", "First BBC news link" ); +is( $x->url, "http://www.bbc.co.uk/", "First BBC news link" ); +is( $x->[1], "News", "First BBC news text" ); +is( $x->text, "News", "First BBC news text" ); + +$x = $t->find_link( text => "News", url_regex => qr/cnn/ ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is( $x->[0], "http://www.cnn.com/", "First CNN news link" ); +is( $x->url, "http://www.cnn.com/", "First CNN news link" ); +is( $x->[1], "News", "First CNN news text" ); +is( $x->text, "News", "First CNN news text" ); + +AREA_CHECKS: { + my @wanted_links = ( + [ "http://www.cnn.com/", "CNN", undef, "a" ], + [ "http://www.cnn.com/", "News", "Fred", "a" ], + [ "http://www.cnn.com/area", undef, undef, "area" ], + ); + my @links = $t->find_all_links( url_regex => qr/cnn\.com/ ); + is_deeply( [EMAIL PROTECTED], [EMAIL PROTECTED], "Correct links came back" ); + + my $linkref = $t->find_all_links( url_regex => qr/cnn\.com/ ); + is_deeply( $linkref, [EMAIL PROTECTED], "Correct links came back" ); +} + +$x = $t->find_link( name => "bongo" ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is_deeply( $x, [ "http://c.cpan.org/", "CPAN C", "bongo", "a" ], 'Got the CPAN C link' ); + +$x = $t->find_link( name_regex => qr/^[A-Z]/, n => 2 ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is_deeply( $x, [ "http://www.cnn.com/", "News", "Fred", "a" ], 'Got 2nd link that begins with a capital' ); + +$x = $t->find_link( tag => 'a', n => 3 ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is_deeply( $x, [ "http://b.cpan.org/", "CPAN B", undef, "a" ], 'Got 3rd <A> tag' ); + +$x = $t->find_link( tag_regex => qr/^(a|frame)$/, n => 7 ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is_deeply( $x, [ "http://d.cpan.org/", "CPAN D", undef, "a" ], 'Got 7th <A> or <FRAME> tag' ); + +$x = $t->find_link( text => "Rebuild Index" ); +isa_ok( $x, 'WWW::Mechanize::Link' ); +is_deeply( $x, [ "/cgi-bin/MT/mt.cgi", "Rebuild Index", undef, "a" ], 'Got the JavaScript link' ); diff -Nru HTTP-Recorder-0.01/t/htdaemon.pl HTTP-Recorder-0.01-test/t/htdaemon.pl --- HTTP-Recorder-0.01/t/htdaemon.pl Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htdaemon.pl Sat Dec 13 11:53:13 2003 @@ -0,0 +1,71 @@ +#!perl + +use warnings; +use strict; +use URI::file; + +################### +# set up the Serving daemon using LWP derived setup + +our $greating; [EMAIL PROTECTED](qw( daemon LocalAddr localhost:1024 Timeout 300 Reuse 1 )) unless @ARGV; # daemon +do "t/daemon.pl"; +print "server said: $greating"; + +package Server; +our $AUTOLOAD; +use UNIVERSAL::canAUTOLOAD; +sub httpd_get_find_link; +sub httpd_get_field; + +#---------------------------------------------------------------- +sub httpd_get_find_link #serve all calls +{ + my($c,$r) = @_; + + #(my $meth = $AUTOLOAD) =~ s/.*:://; + #return if $meth eq 'DESTROY'; + #$meth =~ s/httpd_get_//; + my $meth = 'find_link'; + + if (not -f "t/htdocs/$meth.html") { + #print "no $meth\n"; + $c->send_error(404, 
"dont know $meth.html"); + } else { + open(my $foo, "t/htdocs/$meth.html") or die; + $/ = undef; + my $buf = <$foo>; + $c->send_basic_header; + $c->print("Content-Type: text/html"); + $c->send_crlf; + $c->send_crlf; + $c->print($buf); + } +} + +use Data::Dumper::EasyOO; # ('singleton' => my $ezdd); +my $ezdd = Data::Dumper::EasyOO->new(); + +sub AUTOLOAD #serve all calls +{ + my($c,$r) = @_; + + (my $meth = $AUTOLOAD) =~ s/.*:://; + return if $meth eq 'DESTROY'; + $meth =~ s/httpd_get_//; + if (not -f "t/htdocs/$meth.html") { + print "no $meth\n"; + $c->send_error(404, "dont know $meth.html"); + } else { + open(my $foo, "t/htdocs/$meth.html") or die; + $/ = undef; + #my $buf = <$foo>; + $c->send_basic_header; + $c->print("Content-Type: text/html"); + $c->send_crlf; + $c->send_crlf; + $c->print(<$foo>); + #$c->print($ezdd->($r)); + } +} + diff -Nru HTTP-Recorder-0.01/t/htdocs/area_link.html HTTP-Recorder-0.01-test/t/htdocs/area_link.html --- HTTP-Recorder-0.01/t/htdocs/area_link.html Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htdocs/area_link.html Sat Dec 13 02:03:15 2003 @@ -0,0 +1,19 @@ +<html> + <head> + <TITLE>Testing AREA tag handling</TITLE> + </head> + <body> + <MAP NAME="SOME_MAP"> + <AREA HREF="http://www.msnbc.com/area" COORDS="1,2,3,4"></AREA> + <AREA HREF="http://www.cnn.com/area" COORDS="5,6,7,8"> + <AREA HREF="http://www.cpan.org/area" COORDS="10,11,12,13" /> + </MAP> + <MAP NAME="OTHER_MAP"> + <AREA NOHREF COORDS="1,2,3,4"> + <AREA HREF="http://www.slashdot.org"> + </MAP> + <IMG SRC="SOME_IMAGE" USEMAP="#SOME_MAP"> + <IMG SRC="SOME_IMAGE" USEMAP="#OTHER_MAP"> + </body> +</html> + diff -Nru HTTP-Recorder-0.01/t/htdocs/field.html HTTP-Recorder-0.01-test/t/htdocs/field.html --- HTTP-Recorder-0.01/t/htdocs/field.html Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htdocs/field.html Sat Dec 13 02:03:15 2003 @@ -0,0 +1,13 @@ +<HTML> +<HEAD> + Like a hole +</HEAD> +<BODY BGCOLOR="puce"> +<FORM ACTION="/shake-some/"> +<INPUT TYPE="text" NAME="dingo" VALUE="dingo1"> +<INPUT TYPE="text" NAME="bongo" VALUE="bongo!"> +<INPUT TYPE="radio" NAME="wango" VALUE="wango!"> +<INPUT TYPE="text" NAME="dingo" VALUE="dingo2"> +</FORM> +</BODY> +</HTML> diff -Nru HTTP-Recorder-0.01/t/htdocs/find_link.html HTTP-Recorder-0.01-test/t/htdocs/find_link.html --- HTTP-Recorder-0.01/t/htdocs/find_link.html Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htdocs/find_link.html Sat Dec 13 11:25:32 2003 @@ -0,0 +1,36 @@ +<html> + <head> + <TITLE>Testing the links</TITLE> + </head> + <body> + <UL> + <LI> <A HREF="http://blargle.com/">blargle</A> + <LI> <A HREF="http://a.cpan.org/">CPAN A</A> + <LI> <A HREF="http://b.cpan.org/">CPAN B</A> + <FRAME SRC="foo.html"> + <FRAME SRC="bar.html"> + <LI> <A HREF="http://c.cpan.org/" NAME="bongo">CPAN C</A> + <LI> <A HREF="http://d.cpan.org/">CPAN D</A> + + <LI> <A HREF="http://www.msnbc.com/">MSNBC</A> + <FRAME SRC="http://www.oreilly.com/" NAME="wongo"> + <LI> <A HREF="http://www.cnn.com/">CNN</A> + <LI> <A HREF="http://www.bbc.co.uk/" NAME="Wilma">BBC</A> + <LI> <A HREF="http://www.msnbc.com/">News</A> + <LI> <A HREF="http://www.cnn.com/" NAME="Fred">News</A> + <LI> <A HREF="http://www.bbc.co.uk/">News</A> + <LI> <A onmouseover="window.status='Rebuild Files'; return true" href="#" onClick="window.open( '/cgi-bin/MT/mt.cgi', 'rebuild', 'width=400,height=200,resizable=yes')">Rebuild Index</A> + + <MAP NAME="SOME_MAP"> + <AREA HREF="http://www.msnbc.com/area" COORDS="1,2,3,4"></AREA> + <AREA HREF="http://www.cnn.com/area" COORDS="5,6,7,8" 
NAME="Marty"> + <AREA HREF="http://www.cpan.org/area" COORDS="10,11,12,13" /> + </MAP> + <IMG SRC="SOME_IMAGE" USEMAP="#SOME_MAP"> + + <!-- new stuff --> + <LI> <A HREF="http://nowhere.org/" Name="Here">NoWhere</A> + <LI> <A HREF="http://nowhere.org/padded" Name=" Here "> NoWhere </A> + </body> +</html> + diff -Nru HTTP-Recorder-0.01/t/htdocs/find_link.html.orig HTTP-Recorder-0.01-test/t/htdocs/find_link.html.orig --- HTTP-Recorder-0.01/t/htdocs/find_link.html.orig Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htdocs/find_link.html.orig Sat Dec 13 11:24:16 2003 @@ -0,0 +1,35 @@ +<html> + <head> + <TITLE>Testing the links</TITLE> + </head> + <body> + <A HREF="http://blargle.com/">blargle</A> + <A HREF="http://a.cpan.org/">CPAN A</A> + <A HREF="http://b.cpan.org/">CPAN B</A> + <FRAME SRC="foo.html"> + <FRAME SRC="bar.html"> + <A HREF="http://c.cpan.org/" NAME="bongo">CPAN C</A> + <A HREF="http://d.cpan.org/">CPAN D</A> + + <A HREF="http://www.msnbc.com/">MSNBC</A> + <FRAME SRC="http://www.oreilly.com/" NAME="wongo"> + <A HREF="http://www.cnn.com/">CNN</A> + <A HREF="http://www.bbc.co.uk/" NAME="Wilma">BBC</A> + <A HREF="http://www.msnbc.com/">News</A> + <A HREF="http://www.cnn.com/" NAME="Fred">News</A> + <A HREF="http://www.bbc.co.uk/">News</A> + <A onmouseover="window.status='Rebuild Files'; return true" href="#" onClick="window.open( '/cgi-bin/MT/mt.cgi', 'rebuild', 'width=400,height=200,resizable=yes')">Rebuild Index</A> + + <MAP NAME="SOME_MAP"> + <AREA HREF="http://www.msnbc.com/area" COORDS="1,2,3,4"></AREA> + <AREA HREF="http://www.cnn.com/area" COORDS="5,6,7,8" NAME="Marty"> + <AREA HREF="http://www.cpan.org/area" COORDS="10,11,12,13" /> + </MAP> + <IMG SRC="SOME_IMAGE" USEMAP="#SOME_MAP"> + + <!-- new stuff --> + <A HREF="http://nowhere.org/" Name="Here">NoWhere</A> + <A HREF="http://nowhere.org/padded" Name=" Here "> NoWhere </A> + </body> +</html> + diff -Nru HTTP-Recorder-0.01/t/htdocs/frames.html HTTP-Recorder-0.01-test/t/htdocs/frames.html --- HTTP-Recorder-0.01/t/htdocs/frames.html Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htdocs/frames.html Sat Dec 13 02:03:15 2003 @@ -0,0 +1,13 @@ +<html> + <head> + <title></title> + </head> + + <frameset rows="*,*" frameborder="1" framespacing="0" border="1"> + <frame name="top" src="find_link.html" marginwidth="8" +marginheight="8" scrolling="auto" frameborder="no"> + <frame name="bottom" src="google.html" marginwidth="0" +marginheight="0" scrolling="no" frameborder="no" noresize> + </frameset> + +</html> diff -Nru HTTP-Recorder-0.01/t/htdocs/google.html HTTP-Recorder-0.01-test/t/htdocs/google.html --- HTTP-Recorder-0.01/t/htdocs/google.html Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htdocs/google.html Sat Dec 13 02:03:15 2003 @@ -0,0 +1,14 @@ +<html><head><meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"><title>Google</title><style><!-- +body,td,a,p,.h{font-family:arial,sans-serif;} +.h{font-size: 20px;} +.q{text-decoration:none; color:#0000cc;} +//--> +</style> +<script> +<!-- +function sf(){document.f.q.focus();} +// --> +</script> +</head><body bgcolor=#ffffff text=#000000 link=#0000cc vlink=#551a8b alink=#ff0000 onLoad=sf()><center><table border=0 cellspacing=0 cellpadding=0><tr><td><img src="/images/logo.gif" width=276 height=110 alt="Google"></td></tr></table><br> +<table border=0 cellspacing=0 cellpadding=0><tr><td width=15> </td><td id=0 bgcolor=#3366cc align=center width=95 nowrap><font color=#ffffff size=-1><b>Web</b></font></td><td width=15> </td><td id=1 bgcolor=#efefef 
align=center width=95 nowrap onClick="" style=cursor:pointer;cursor:hand;><a id=1a class=q href="/imghp?hl=en&tab=wi&ie=UTF-8"><font size=-1>Images</font></a></td><td width=15> </td><td id=2 bgcolor=#efefef align=center width=95 nowrap onClick="" style=cursor:pointer;cursor:hand;><a id=2a class=q href="/grphp?hl=en&tab=wg&ie=UTF-8"><font size=-1>Groups</font></a></td><td width=15> </td><td id=3 bgcolor=#efefef align=center width=95 nowrap onClick="" style=cursor:pointer;cursor:hand;><a id=3a class=q href="/dirhp?hl=en&tab=wd&ie=UTF-8"><font size=-1>Directory</font></a></td><td width=15> </td><td id=4 bgcolor=#efefef align=center width=95 nowrap onClick="" style=cursor:pointer;cursor:hand;><a id=4a class=q href="/nwshp?hl=en&tab=wn&ie=UTF-8"><font size=-1>News</font></a></td><td width=15> </td></tr><tr><td colspan=12 bgcolor=#3366cc><img width=1 height=1 alt=""></td></tr></table><br><form action="/target-page" name="bob-the-form"><table cellspacing=0 cellpadding=0><tr><td width=75> </td><td align=center><input type=hidden name=hl value=en><span id=hf></span><input type=hidden name=ie value="ISO-8859-1"><input maxLength=256 size=55 name=q value=""><br><input type=submit value="Google Search" name=btnG><input type=submit value="I'm Feeling Lucky" name=btnI></td><td valign=top nowrap><font size=-2> • <a href=/advanced_search?hl=en>Advanced Search</a><br> • <a href=/preferences?hl=en>Preferences</a><br> • <a href=/language_tools?hl=en>Language Tools</a></font></td></tr></table></form><br><p><font size=-1>Want more from Google? Try these <a href="/tour/services/query.html">expert search tips</a></font><p> +<br><font size=-1><a href="/ads/">Advertise with Us</a> - <a href="/services/">Business Solutions</a> - <a href="/options/">Services & Tools</a> - <a href=/about.html>Jobs, Press, & Help</a></font><p><font size=-2>©2003 Google - Searching 3,083,324,652 web pages</font></p></center></body></html> diff -Nru HTTP-Recorder-0.01/t/htdocs/tick.html HTTP-Recorder-0.01-test/t/htdocs/tick.html --- HTTP-Recorder-0.01/t/htdocs/tick.html Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htdocs/tick.html Sat Dec 13 02:03:15 2003 @@ -0,0 +1,14 @@ +<html> +<body> +<form action="http://localhost/" method="POST"> + +<input type="checkbox" name="foo" value="hello" /> Hello<br /> +<input type="checkbox" name="foo" value="bye" /> Bye<br /> +<input type="checkbox" name="foo" value="arse" /> Arse<br /> +<input type="checkbox" name="foo" value="wibble" /> Wibble<br /> +<input type="checkbox" name="foo" value="foo" /> Foo<br /> + +<input type="Submit" name="submit" value="Submit" label="Sumbit" /> +</form> +</body> + diff -Nru HTTP-Recorder-0.01/t/htrec.pl HTTP-Recorder-0.01-test/t/htrec.pl --- HTTP-Recorder-0.01/t/htrec.pl Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/htrec.pl Sat Dec 13 16:04:58 2003 @@ -0,0 +1,17 @@ +#!/usr/local/bin/perl -w + +#use lib '/home/jimc/perl.cpan/htrec/HTTP-Recorder-0.01-mod/lib'; +use HTTP::Proxy qw(:log); +use HTTP::Recorder; +$ENV{HTTP_PROXY} = "localhost:1024"; +my $proxy = HTTP::Proxy->new(port => 1025, maxchild => 0, + logmask => ALL ); + +# set HTTP::Recorder as the agent +my $agent = HTTP::Recorder->new( file => shift || "recording"); #, showwindow => 1); #/tmp/tmpfile" ); + +$proxy->agent( $agent ); + +$proxy->start(); + + diff -Nru HTTP-Recorder-0.01/t/mechtest.t HTTP-Recorder-0.01-test/t/mechtest.t --- HTTP-Recorder-0.01/t/mechtest.t Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/mechtest.t Sat Dec 13 01:31:25 2003 @@ -0,0 +1,145 @@ +#!perl + +use 
warnings; +use strict; +#use Test::More tests => 6; +use URI::file; + +################### +# set up the Serving daemon using LWP derived setup + +our $greating; [EMAIL PROTECTED](qw( LocalAddr localhost:1024 Timeout 300 )) unless @ARGV; # daemon +do "t/daemon.pl"; +#print "server said: $greating"; +#$greating = <DAEMON>; +$greating =~ /(<[^>]+>)/; + +require URI; +my $base = URI->new($1); +sub url { + my $u = URI->new(@_); + $u = $u->abs($_[1]) if @_ > 1; + $u->as_string; +} + +print "Will access HTTP server at $base\n"; + + +################### +# The Module under test, and support +use HTTP::Recorder; +use HTTP::Proxy; + +my $proxy = HTTP::Proxy->new; + +# set HTTP::Recorder as the agent +my $agent = HTTP::Recorder->new( file => "/tmp/tmpfile" ); +$proxy->agent( $agent ); + +#$proxy->start(); + +################### +# the Test Driver side + + +#BEGIN { use_ok( 'WWW::Mechanize' );} +use WWW::Mechanize; + +my $t = WWW::Mechanize->new( cookie_jar => undef ); +#isa_ok( $t, 'WWW::Mechanize' ); + +# my $uri = URI::file->new_abs( "t/find_link.html" )->as_string; + + +print "1..7\n"; + +################### +# the borrowed test TBadapted + +my ($ua, $req, $res); +require LWP::RobotUA; +require HTTP::Request; +$ua = new LWP::RobotUA 'lwp-spider/0.1', '[EMAIL PROTECTED]'; +$ua->delay(0.05); # rather quick robot + +#---------------------------------------------------------------- +sub httpd_get_robotstxt +{ + my($c,$r) = @_; + $c->send_basic_header; + $c->print("Content-Type: text/plain"); + $c->send_crlf; + $c->send_crlf; + $c->print("User-Agent: * +Disallow: /private + +"); +} + +sub httpd_get_someplace +{ + my($c,$r) = @_; + $c->send_basic_header; + $c->print("Content-Type: text/plain"); + $c->send_crlf; + $c->send_crlf; + $c->print("Okidok\n"); +} + +$req = new HTTP::Request GET => url("/someplace", $base); +$res = $ua->request($req); +#print $res->as_string; +print "not " unless $res->is_success; +print "ok 1\n"; + +$req = new HTTP::Request GET => url("/private/place", $base); +$res = $ua->request($req); +#print $res->as_string; +print "not " unless $res->code == 403 + and $res->message =~ /robots.txt/; +print "ok 2\n"; + +$req = new HTTP::Request GET => url("/foo", $base); +$res = $ua->request($req); +#print $res->as_string; +print "not " unless $res->code == 404; # not found +print "ok 3\n"; + +# Let the robotua generate "Service unavailable/Retry After response"; +$ua->delay(1); +$ua->use_sleep(0); +$req = new HTTP::Request GET => url("/foo", $base); +$res = $ua->request($req); +#print $res->as_string; +print "not " unless $res->code == 503 # Unavailable + and $res->header("Retry-After"); +print "ok 4\n"; + +#---------------------------------------------------------------- +print "Terminating server...\n"; +sub httpd_get_quit +{ + my($c) = @_; + $c->send_error(503, "Bye, bye"); + exit; # terminate HTTP server +} + +$ua->delay(0); +$req = new HTTP::Request GET => url("/quit", $base); +$res = $ua->request($req); + +print "not " unless $res->code == 503 and $res->content =~ /Bye, bye/; +print "ok 5\n"; + +#--------------------------------------------------------------- +$ua->delay(1); + +# host_wait() should be around 60s now +print "not " unless abs($ua->host_wait($base->host_port) - 60) < 5; +print "ok 6\n"; + +# Number of visits to this place should be +print "not " unless $ua->no_visits($base->host_port) == 4; +print "ok 7\n"; + diff -Nru HTTP-Recorder-0.01/t/ua.t HTTP-Recorder-0.01-test/t/ua.t --- HTTP-Recorder-0.01/t/ua.t Wed Dec 31 17:00:00 1969 +++ HTTP-Recorder-0.01-test/t/ua.t Fri 
Dec 12 13:45:30 2003 @@ -0,0 +1,103 @@ +#!perl + +our $greating; +do "robot/daemon.pl"; +$greating =~ /(<[^>]+>)/; + +print "1..7\n"; + +require URI; +my $base = URI->new($1); +sub url { + my $u = URI->new(@_); + $u = $u->abs($_[1]) if @_ > 1; + $u->as_string; +} + +print "Will access HTTP server at $base\n"; + +require LWP::RobotUA; +require HTTP::Request; +$ua = new LWP::RobotUA 'lwp-spider/0.1', '[EMAIL PROTECTED]'; +$ua->delay(0.05); # rather quick robot + +#---------------------------------------------------------------- +sub httpd_get_robotstxt +{ + my($c,$r) = @_; + $c->send_basic_header; + $c->print("Content-Type: text/plain"); + $c->send_crlf; + $c->send_crlf; + $c->print("User-Agent: * +Disallow: /private + +"); +} + +sub httpd_get_someplace +{ + my($c,$r) = @_; + $c->send_basic_header; + $c->print("Content-Type: text/plain"); + $c->send_crlf; + $c->send_crlf; + $c->print("Okidok\n"); +} + +$req = new HTTP::Request GET => url("/someplace", $base); +$res = $ua->request($req); +#print $res->as_string; +print "not " unless $res->is_success; +print "ok 1\n"; + +$req = new HTTP::Request GET => url("/private/place", $base); +$res = $ua->request($req); +#print $res->as_string; +print "not " unless $res->code == 403 + and $res->message =~ /robots.txt/; +print "ok 2\n"; + +$req = new HTTP::Request GET => url("/foo", $base); +$res = $ua->request($req); +#print $res->as_string; +print "not " unless $res->code == 404; # not found +print "ok 3\n"; + +# Let the robotua generate "Service unavailable/Retry After response"; +$ua->delay(1); +$ua->use_sleep(0); +$req = new HTTP::Request GET => url("/foo", $base); +$res = $ua->request($req); +#print $res->as_string; +print "not " unless $res->code == 503 # Unavailable + and $res->header("Retry-After"); +print "ok 4\n"; + +#---------------------------------------------------------------- +print "Terminating server...\n"; +sub httpd_get_quit +{ + my($c) = @_; + $c->send_error(503, "Bye, bye"); + exit; # terminate HTTP server +} + +$ua->delay(0); +$req = new HTTP::Request GET => url("/quit", $base); +$res = $ua->request($req); + +print "not " unless $res->code == 503 and $res->content =~ /Bye, bye/; +print "ok 5\n"; + +#--------------------------------------------------------------- +$ua->delay(1); + +# host_wait() should be around 60s now +print "not " unless abs($ua->host_wait($base->host_port) - 60) < 5; +print "ok 6\n"; + +# Number of visits to this place should be +print "not " unless $ua->no_visits($base->host_port) == 4; +print "ok 7\n"; +