Kirk Wythers wrote:
> 
> On Dec 22, 2006, at 8:33 PM, Chad Perrin wrote:
> 
>> On Fri, Dec 22, 2006 at 08:04:39PM -0600, Kirk Wythers wrote:
>>> I have written a short perl script that munges climate data and then
>>> loads it into a postgres database. It works fine on one file at a
>>> time... syntax is ./program.pl filename
>>>
>>> I would like to run it in a directory with multiple files. I have
>>> tried syntax ./program.pl file1 file2, but only the first file gets
>>> processed. Can anyone help me figure out how to run this script on a
>>> directory full of files that all need to be processed?
>>
>> Yes, some of us probably can help.  We'll probably need to see what
>> you're trying so far to be able to give the most helpful responses
>> possible, however, for solving your problem.  Right off the top of my
>> head, without any other information, I'm just inclined to say "Try  using
>> 'while (<>)' to access file contents."  That may not suit your  needs at
>> all, though, since I don't know exactly how you need your file  access to
>> fit into the program.
> 
> Thanks for the reply Chad. Here is my script (I'm not sure if I  should
> be modifying the script itself, or piping something on the CL:
> 
> #! /usr/bin/perl -w
> use strict;
> use Date::Calc qw(Day_of_Year);
> use DBI;
> 
> #MICIS climate data munger. Required input argument is the file to 
> process.
> #Use > to redirect output to new file.
> 
> #Set the item delimiter to tabs instead of the default commas and the  line

The output field separator ($,) is undef by default, not a comma.

> #delimiter to a newline character
> $, = "\t";
> $\ = "\n";
> 
> #Instantiate the global station ID variable
> my $station_id = "";
> #Initialize I/O variables
> my ($year,$month,$day,$doy,$date,$precip,$tmin,$tmax,$snowfall,
> $snowdepth,$tmean,$obstime,$datasource);

You don't need to declare these variables at file scope; you should probably
declare them inside the while loop, where they are used.

> #Part 1. Loop through the 11 header lines to identify the station id.
> #The 7th line contains the station ID, and has the format of
> #STATION: SOME_STATION, STATE   (Station ID: ######)
> 
> for(my $i=1;$i<=6;$i++) {

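Your comment says eleven header lines, but your loop only reads six.  It also
says the station ID is on the 7th line, while the code takes it from the 2nd.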

>   my $header = <>;
>   #Remove the newline character
>   chomp $header;
>   if ($i == 2) {
>     #Split the line into an 3-item array based on the 2 colons.
>     my @line = split(":", $header);
>     #Extract everything after the 2nd colon.
>     $station_id = $line[2];
>     #Remove leading white spaces.
>     $station_id =~ s/^\s+//;
>     #Remove ending bracket.
>     $station_id =~ s/\)//;
>   }
> }
> 
> #Connect to postgreql
> my $dbh = DBI->connect( "DBI:Pg:dbname=met_data;host=localhost", 
> "pguser", "pguser" )
>   or die "Couldn't connect to PostgreSQL: $DBI::errstr ($DBI::err)\n";
> 
> #Part 2. Loop through the records and prepare SQL statement.
> while (my $line=<>) {

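You are using the <> operator to read from the file(s), so this loop *will*
read all the lines from all the files listed on the command line.  There are
two problems.  First, the headers of the second and subsequent files are not
skipped, because the header loop above runs only once.  Second, the
"exit unless $year" below ends the whole program at the first file's footer,
so the remaining files are never reached.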
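
If it helps to see how $. behaves when <> moves from one file to the next,
here is a small self-contained sketch.  The helper names line_numbers and
temp_file_with and the sample lines are made up for the demonstration; only
<>, $., eof and "close ARGV" come from Perl itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read the named files through <> and record the
# value of $. next to every line.  With $reset true, ARGV is closed at
# the end of each file, which sets $. back to 0 for the next file.
sub line_numbers {
    my ( $reset, @files ) = @_;
    local @ARGV = @files;
    my @seen;
    while ( my $line = <> ) {
        chomp $line;
        push @seen, sprintf '%d %s', $., $line;
        close ARGV if $reset and eof;    # eof, not eof()
    }
    return @seen;
}

# Hypothetical helper: write the given lines to a temporary file.
sub temp_file_with {
    my @lines = @_;
    my ( $fh, $name ) = tempfile( UNLINK => 1 );
    print {$fh} "$_\n" for @lines;
    close $fh or die "Cannot close $name: $!";
    return $name;
}

my @files = ( temp_file_with( 'header A', 'data A' ),
              temp_file_with( 'header B', 'data B' ) );

print "without reset: $_\n" for line_numbers( 0, @files );
print "with reset:    $_\n" for line_numbers( 1, @files );
```

Without the reset the line numbers should keep counting across both files;
with it they start again at 1 for each file, which is what a header test
based on $. needs in order to work for every file.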

>   chomp $line;
>   #Split the line on white spaces.
>   ($year,$month,$day,$precip,$tmin,$tmax,$snowfall,$snowdepth,$tmean,
> $obstime,$datasource) = split(/\s+/, $line);
>   #Stop reading data at the end of the file, when $year is empty. This
>   #gets you out of the datafile before the program chokes on the  footer.
>   exit unless $year;
>   # Initialize and concatenate date as YYYMMDD.
>   $date = $year . $month . $day;
>   # Initialize and calculate day of the year (doy)
>   $doy = Day_of_Year($year, $month, $day);
>   #Switch T (trace) to 0.01 and M (missing) to -999
>   if ($precip eq "T") { $precip = 0.01; }
>   elsif ($precip eq "M") {$precip = -999; }
>   if ($tmin eq "M") { $tmin = -999; }
>   if ($tmax eq "M") { $tmax = -999; }
>   if ($snowfall eq "M") { $snowfall = -999; }
>   if ($snowdepth eq "M") { $snowdepth = -999 }
>   if ($tmean eq "M") { $tmean = -999 }
> 
> my $sth = $dbh->prepare("INSERT INTO weather (station_id, year,  month,
> day, doy, date, precip, tmin, tmax, snowfall, snowdepth,  tmean) VALUES
> (?,?,?,?,?,?,?,?,?,?,?,?)");

You shouldn't call $dbh->prepare() inside the while loop; you only need to
call it once, before the loop starts.

> $sth->execute($station_id, $year, $month, $day, $doy, $date, $precip, 
> $tmin, $tmax, $snowfall, $snowdepth, $tmean);
> #print $station_id, $year, $month, $day, $doy, $date, $precip, $tmin, 
> $tmax, $snowfall, $snowdepth, $tmean;
> }
> 
> #$sth->finish();
> 
> #Disconntect from database
> $dbh->disconnect();

This may work better for you:

#!/usr/bin/perl -w
use strict;
use Date::Calc qw(Day_of_Year);
use DBI;


my $dbh = DBI->connect( 'DBI:Pg:dbname=met_data;host=localhost', 'pguser',
'pguser' )
    or die "Couldn't connect to PostgreSQL: $DBI::errstr ($DBI::err)\n";

my $sth = $dbh->prepare( 'INSERT INTO weather (station_id, year,  month, day,
doy, date, precip, tmin, tmax, snowfall, snowdepth,  tmean) VALUES
(?,?,?,?,?,?,?,?,?,?,?,?)' );


my $station_id = '';

while ( <> ) {

    # Part 1. Loop through the 11 header lines to identify the station id.
    # The station ID has the format of:
    # STATION: SOME_STATION, STATE   (Station ID: ######)
    if ( 1 .. 11 ) {
        $station_id = $1 if /\(Station ID:\s*(\S+)\)/;
        next;
        }

    # At eof, close the input filehandle to reset $. for the next file.
    # Note that the last line of each file is skipped, not inserted.
    if ( eof ) {
        close ARGV;
        next;
        }

    # Part 2. Loop through the records and prepare SQL statement.
    my ( $year, $month, $day, $precip, $tmin, $tmax, $snowfall, $snowdepth,
$tmean, $obstime, $datasource ) = split;

    # Initialize and concatenate date as YYYYMMDD.
    my $date = $year . $month . $day;

    # Initialize and calculate day of the year (doy)
    my $doy = Day_of_Year( $year, $month, $day );

    # Switch T (trace) to 0.01 and M (missing) to -999
    $precip = 0.01 if $precip eq 'T';
    for ( $precip, $tmin, $tmax, $snowfall, $snowdepth, $tmean ) {
        $_ = -999 if $_ eq 'M';
        }

    $sth->execute( $station_id, $year, $month, $day, $doy, $date, $precip,
$tmin, $tmax, $snowfall, $snowdepth, $tmean );
    #print join( "\t", $station_id, $year, $month, $day, $doy, $date, $precip,
    #    $tmin, $tmax, $snowfall, $snowdepth, $tmean ), "\n";
    }

#$sth->finish();

# Disconnect from database
$dbh->disconnect();

__END__
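
The "if ( 1 .. 11 )" test in the script relies on the range operator in
scalar context: when its operands are constants, each one is compared
against $., so the condition is true from line 1 through line 11.  Here is
that idea in isolation, reading from an in-memory filehandle.  The helper
name split_header, the station ID value and the data line are invented for
the example and are not taken from your files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: separate the 11 header lines from the data lines
# and pull the station ID out of the header.
sub split_header {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my $station_id = '';
    my @data;
    while ( my $line = <$fh> ) {
        chomp $line;
        # In scalar context, constant operands of .. are compared to $.
        if ( 1 .. 11 ) {
            $station_id = $1 if $line =~ /\(Station ID:\s*(\S+)\)/;
            next;
        }
        push @data, $line;
    }
    close $fh;
    return ( $station_id, @data );
}

# Invented sample: 11 header lines (station on the 7th) and one record.
my @header = map { "header line $_" } 1 .. 11;
$header[6] = 'STATION: SOME_STATION, STATE   (Station ID: 123456)';
my $sample = join "\n", @header, '2006 12 22 0.10 -5 3 M 0 -1 7 NWS', '';

my ( $id, @records ) = split_header($sample);
print "station: $id\n";
print "record:  $_\n" for @records;
```

Because the test is tied to $., it only starts over for the next file if $.
is reset, which is why the script closes ARGV at eof.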



John
-- 
Perl isn't a toolbox, but a small machine shop where you can special-order
certain sorts of tools at low cost and in short order.       -- Larry Wall
