Thanks or the reply John. I have a couple of questions inline.
On Dec 22, 2006, at 10:53 PM, John W. Krahn wrote:
#! /usr/bin/perl -w
use strict;
use Date::Calc qw(Day_of_Year);
use DBI;
#MICIS climate data munger. Required input argument is the file to
process.
#Use > to redirect output to new file.
#Set the item delimiter to tabs instead of the default commas and
the line
The Output Field Separator ($,) has the default value of undef.
I guess I'm too new at this. I don't understand your point.
#delimiter to a newline character
$, = "\t";
$\ = "\n";
#Instantiate the global station ID variable
my $station_id = "";
#Initialize I/O variables
my ($year,$month,$day,$doy,$date,$precip,$tmin,$tmax,$snowfall,
$snowdepth,$tmean,$obstime,$datasource);
You don't really need to declare these variables in file scope, you
probably
should declare them inside the while loop.
Understood
#Part 1. Loop through the 11 header lines to identify the station id.
#The 7th line contains the station ID, and has the format of
#STATION: SOME_STATION, STATE (Station ID: ######)
for(my $i=1;$i<=6;$i++) {
Your comment says eleven lines but your code says six?
A mistake on my part not updating the comments. The earlier file
format had 11 lines.
my $header = <>;
#Remove the newline character
chomp $header;
if ($i == 2) {
#Split the line into an 3-item array based on the 2 colons.
my @line = split(":", $header);
#Extract everything after the 2nd colon.
$station_id = $line[2];
#Remove leading white spaces.
$station_id =~ s/^\s+//;
#Remove ending bracket.
$station_id =~ s/\)//;
}
}
#Connect to postgreql
my $dbh = DBI->connect( "DBI:Pg:dbname=met_data;host=localhost",
"pguser", "pguser" )
or die "Couldn't connect to PostgreSQL: $DBI::errstr ($DBI::err)
\n";
#Part 2. Loop through the records and prepare SQL statement.
while (my $line=<>) {
You are using the <> operator to read from the file(s) so this
*will* read all
the lines from all the files listed on the command line. The only
problem is
that you will not distinguish the headers from the second and
subsequent files
listed on the command line.
That will not do. I need to start fresh on each file. Just as if I
ran the program as:
./program.pl file1
./program.pl file2
./program.pl file3
ect....
chomp $line;
#Split the line on white spaces.
($year,$month,$day,$precip,$tmin,$tmax,$snowfall,$snowdepth,$tmean,
$obstime,$datasource) = split(/\s+/, $line);
#Stop reading data at the end of the file, when $year is empty.
This
#gets you out of the datafile before the program chokes on the
footer.
exit unless $year;
# Initialize and concatenate date as YYYMMDD.
$date = $year . $month . $day;
# Initialize and calculate day of the year (doy)
$doy = Day_of_Year($year, $month, $day);
#Switch T (trace) to 0.01 and M (missing) to -999
if ($precip eq "T") { $precip = 0.01; }
elsif ($precip eq "M") {$precip = -999; }
if ($tmin eq "M") { $tmin = -999; }
if ($tmax eq "M") { $tmax = -999; }
if ($snowfall eq "M") { $snowfall = -999; }
if ($snowdepth eq "M") { $snowdepth = -999 }
if ($tmean eq "M") { $tmean = -999 }
my $sth = $dbh->prepare("INSERT INTO weather (station_id, year,
month,
day, doy, date, precip, tmin, tmax, snowfall, snowdepth, tmean)
VALUES
(?,?,?,?,?,?,?,?,?,?,?,?)");
You shouldn't call $dbh->prepare() inside the while loop, you only
need to
call it once before the loop starts.
I follow
$sth->execute($station_id, $year, $month, $day, $doy, $date, $precip,
$tmin, $tmax, $snowfall, $snowdepth, $tmean);
#print $station_id, $year, $month, $day, $doy, $date, $precip, $tmin,
$tmax, $snowfall, $snowdepth, $tmean;
}
#$sth->finish();
#Disconntect from database
$dbh->disconnect();
This may work better for you:
#!/usr/bin/perl -w
use strict;
use Date::Calc qw(Day_of_Year);
use DBI;
my $dbh = DBI->connect( 'DBI:Pg:dbname=met_data;host=localhost',
'pguser',
'pguser' )
or die "Couldn't connect to PostgreSQL: $DBI::errstr ($DBI::err)
\n";
my $sth = $dbh->prepare( 'INSERT INTO weather (station_id, year,
month, day,
doy, date, precip, tmin, tmax, snowfall, snowdepth, tmean) VALUES
(?,?,?,?,?,?,?,?,?,?,?,?)' );
my $station_id = '';
while ( <> ) {
# Part 1. Loop through the 11 header lines to identify the
station id.
# The station ID has the format of:
# STATION: SOME_STATION, STATE (Station ID: ######)
if ( 1 .. 11 ) {
$station_id = $1 if /\(Station ID:\s*(\S+)\)/;
It seems that this is more flexable. ie not dependent upon a certine
number of header lines. Can you translate the f /\(Station ID:\s*(\S+)
\)/; part though?
next;
}
# At eof close the input filehandle to reset $.
if ( eof ) {
close ARGV;
next;
}
I think this is suppose to allow the script to jump to the next file.
Right?
However, this script also reads the first file into the database,
then stops.
# Part 2. Loop through the records and prepare SQL statement.
my ( $year, $month, $day, $precip, $tmin, $tmax, $snowfall,
$snowdepth,
$tmean, $obstime, $datasource ) = split;
# Initialize and concatenate date as YYYMMDD.
my $date = $year . $month . $day;
# Initialize and calculate day of the year (doy)
my $doy = Day_of_Year( $year, $month, $day );
# Switch T (trace) to 0.01 and M (missing) to -999
$precip = 0.01 if $precip eq 'T';
for ( $precip, $tmin, $tmax, $snowfall, $snowdepth, $tmean ) {
$_ = -999 if $_ eq 'M';
}
much more efficient. Thankyou.
$sth->execute( $station_id, $year, $month, $day, $doy, $date,
$precip,
$tmin, $tmax, $snowfall, $snowdepth, $tmean );
#print join( "\t", $station_id, $year, $month, $day, $doy,
$date, $precip,
$tmin, $tmax, $snowfall, $snowdepth, $tmean ), "\n";
}
#$sth->finish();
# Disconntect from database
$dbh->disconnect();
__END__
John
--
Perl isn't a toolbox, but a small machine shop where you can
special-order
certain sorts of tools at low cost and in short order. --
Larry Wall
--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
<http://learn.perl.org/> <http://learn.perl.org/first-response>
--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
<http://learn.perl.org/> <http://learn.perl.org/first-response>