I have written a script which is very useful for me day-to-day. It checks
table structure in HTML files. The script is working, but I would
appreciate any comments, especially as to how this can be better written.
Thank you,
Shawn
Code follows:
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
#!/usr/bin/perl
#This script looks for bad table
#structure in HTML pages. When using
#server-side scripting, do not run this
#on the source page -- load the page in
#a browser, view source, and save the source
#as a file.
#This script does not take into account client-side
#scripting on the page. For example, if a JavaScript
#function writes out table elements in a loop, this
#script will report an error. This is just a helpful
#tool for the web programmer who already knows what
#they are doing.
#Created by Shawn Milochik, Oct 31 2002. Released
#free of any restrictions, and without any warranty.
#execution: checktables.pl filename.html
die "Usage: checktables.pl filename.html\n" unless @ARGV;
open(IN, '<', $ARGV[0]) || die "Could not open $ARGV[0]: $!\n";
$line = 0;
$last = undef;
#Read the file one line at a time.
while (<IN>){
#Keep track of the line number, so we
#can tell the user which line of the HTML has
#the problem.
$line++;
#Split the line on the > character.
@data = split />/, $_;
while (@data){
#Take the portion of the line containing an HTML tag
chomp($test = lc(shift(@data)));
if (($test =~ /<t/) || ($test =~ /<\/t/)){
#print "Tag found in $test.\n";
$curr = undef;
#\b keeps longer tag names such as <track from matching <tr
if ($test =~ /<table\b/){ $curr = "<table";}
if ($test =~ /<\/table\b/){ $curr = "<\/table";}
if ($test =~ /<tr\b/){ $curr = "<tr";}
if ($test =~ /<\/tr\b/){ $curr = "<\/tr";}
if ($test =~ /<td\b/){ $curr = "<td";}
if ($test =~ /<\/td\b/){ $curr = "<\/td";}
#if we found a valid table tag
if ($curr){
#If this is not the first
#iteration of this block
if ($last){
if ($last eq "<table"){
if ($curr ne "<tr"){
print "Line $line: found $curr instead of <tr after <table from line $lastline.\n";}
}
if ($last eq "<\/table"){
if (($curr ne "<table") && ($curr ne "<\/td")){
print "Line $line: found $curr instead of <\/td or <table after <\/table from line $lastline.\n";}
}
if ($last eq "<tr"){
if ($curr ne "<td"){
print "Line $line: found $curr instead of <td after $last from line $lastline.\n";}
}
if ($last eq "<\/tr"){
if (($curr ne "<\/table") && ($curr ne "<tr")){
print "Line $line: found $curr instead of <tr or <\/table after $last from line $lastline.\n";}
}
if ($last eq "<td"){
if (($curr ne "<table") && ($curr ne "<\/td")){
print "Line $line: found $curr instead of <table or <\/td after $last from line $lastline.\n";}
}
if ($last eq "<\/td"){
if (($curr ne "<td") && ($curr ne "<\/tr")){
print "Line $line: found $curr instead of <td or <\/tr after $last from line $lastline.\n";}
}
$last = $curr;
$lastline = $line;
}else{
#First iteration, initialize
#$last and $lastline
$last = $curr;
$lastline = $line;
}#close curly brace for if ($last) block
}#close curly brace for if ($curr) block
}#close curly brace for if (($test =~ /<t/) || ($test =~ /<\/t/)) block
}#close curly brace for while(@data) block
}#close curly brace for while(<IN>) block
print "Check complete.\n";
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
End of Code
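
The chain of if-blocks above encodes which table tag may follow which. One way this could be condensed is a lookup table of allowed transitions. The following is only a minimal sketch, not a replacement for the script: the names %allowed_next and check_sequence are hypothetical, and it assumes the tags have already been extracted into [tag, line number] pairs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical lookup table: for each table tag, the tags that may
# follow it, mirroring the rules in the if-blocks of the script above.
my %allowed_next = (
    '<table'  => ['<tr'],
    '</table' => ['<table', '</td'],
    '<tr'     => ['<td'],
    '</tr'    => ['</table', '<tr'],
    '<td'     => ['<table', '</td'],
    '</td'    => ['<td', '</tr'],
);

# check_sequence is a hypothetical helper. It takes a list of
# [tag, line_number] pairs and returns a list of error messages.
sub check_sequence {
    my @tags = @_;
    my @errors;
    my $prev;
    for my $item (@tags) {
        my ($tag, $line) = @$item;
        if ($prev) {
            my ($last, $lastline) = @$prev;
            my @ok = @{ $allowed_next{$last} };
            # Report the tag if it is not in the allowed list.
            unless (grep { $_ eq $tag } @ok) {
                push @errors, "Line $line: found $tag instead of "
                    . join(' or ', @ok)
                    . " after $last from line $lastline.";
            }
        }
        $prev = $item;
    }
    return @errors;
}

# Usage example: a <td directly after <table is reported.
my @errors = check_sequence(['<table', 1], ['<td', 2], ['</td', 2]);
print "$_\n" for @errors;
```

With a table like this, adding a rule for another tag would mean adding one entry rather than another if-block, though the sketch makes no attempt to handle tags the original script ignores.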