Re: run script on multiple files

Kirk Wythers Sat, 23 Dec 2006 10:32:29 -0800

Thanks for the reply, John. I have a couple of questions inline.


On Dec 22, 2006, at 10:53 PM, John W. Krahn wrote:


#! /usr/bin/perl -w
use strict;
use Date::Calc qw(Day_of_Year);
use DBI;

#MICIS climate data munger. Required input argument is the file to process.
#Use > to redirect output to new file.

#Set the item delimiter to tabs instead of the default commas and the line


The Output Field Separator ($,) has the default value of undef.


I guess I'm too new at this. I don't understand your point.
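
To illustrate the remark about $, above: the script's comment says the default separator is a comma, but by default print puts nothing at all between list items. A minimal sketch follows; the render helper is hypothetical and exists only to capture what print emits under a given $, and $\ setting.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: capture what print emits for a list under the
# given output field separator ($,) and output record separator ($\).
sub render {
    my ( $ofs, $ors, @fields ) = @_;
    my $out = '';
    open my $fh, '>', \$out or die "Cannot open in-memory handle: $!";
    {
        local $, = $ofs;    # printed between list items
        local $\ = $ors;    # printed after the whole list
        print {$fh} @fields;
    }
    close $fh;
    return $out;
}

# With both left undef (the default), the items simply run together.
print render( undef, undef, 'KMSP', 2006, 12 ), "\n";    # KMSP200612

# With the script's settings, items are tab-separated and newline-terminated.
print render( "\t", "\n", 'KMSP', 2006, 12 );
```

So setting $, to a tab is fine; only the wording "instead of the default commas" is off.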

#delimiter to a newline character
$, = "\t";
$\ = "\n";

#Instantiate the global station ID variable
my $station_id = "";
#Initialize I/O variables
my ($year,$month,$day,$doy,$date,$precip,$tmin,$tmax,$snowfall,
$snowdepth,$tmean,$obstime,$datasource);

You don't really need to declare these variables in file scope, you probably should declare them inside the while loop.


Understood

#Part 1. Loop through the 11 header lines to identify the station id.
#The 7th line contains the station ID, and has the format of
#STATION: SOME_STATION, STATE   (Station ID: ######)

for(my $i=1;$i<=6;$i++) {


Your comment says eleven lines but your code says six?

A mistake on my part, not updating the comments. The earlier file format had 11 lines.

  my $header = <>;
  #Remove the newline character
  chomp $header;
  if ($i == 2) {
    #Split the line into a 3-item array based on the 2 colons.
    my @line = split(":", $header);
    #Extract everything after the 2nd colon.
    $station_id = $line[2];
    #Remove leading white spaces.
    $station_id =~ s/^\s+//;
    #Remove ending bracket.
    $station_id =~ s/\)//;
  }
}

#Connect to PostgreSQL
my $dbh = DBI->connect( "DBI:Pg:dbname=met_data;host=localhost",
"pguser", "pguser" )
or die "Couldn't connect to PostgreSQL: $DBI::errstr ($DBI::err)\n";
#Part 2. Loop through the records and prepare SQL statement.
while (my $line=<>) {
You are using the <> operator to read from the file(s) so this *will* read all the lines from all the files listed on the command line. The only problem is that you will not distinguish the headers from the second and subsequent files listed on the command line.

That will not do. I need to start fresh on each file, just as if I ran the program as:


./program.pl file1
./program.pl file2
./program.pl file3
etc.
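
One way to get that "fresh start per file" behaviour is to loop over @ARGV and open each file explicitly instead of relying on <>. The sketch below is only an illustration under stated assumptions: read_station_file is a hypothetical helper, the header length of 6 is taken from the loop in the original script, and data rows are assumed to begin with the year. The database insert is left out so the example runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read one file from the top and return its station ID
# followed by the data lines. $header_lines is an assumption about the format.
sub read_station_file {
    my ( $path, $header_lines ) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $station_id = '';
    my @records;
    while ( my $line = <$fh> ) {
        chomp $line;
        # $. counts lines of this handle only, so it starts at 1 for every file.
        if ( $. <= $header_lines ) {
            $station_id = $1 if $line =~ /\(Station ID:\s*(\S+)\)/;
            next;
        }
        # Assumption: data rows start with a digit (the year); anything else
        # is treated as the start of the footer.
        last unless $line =~ /^\d/;
        push @records, $line;
    }
    close $fh;
    return ( $station_id, @records );
}

# Each file named on the command line is handled independently.
for my $path (@ARGV) {
    my ( $station_id, @records ) = read_station_file( $path, 6 );
    print "$path: station $station_id, ", scalar(@records), " records\n";
}
```

Invoked as ./program.pl file1 file2 file3, this treats every file as if it were the only one.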

  chomp $line;
  #Split the line on white spaces.
  ($year,$month,$day,$precip,$tmin,$tmax,$snowfall,$snowdepth,$tmean,
$obstime,$datasource) = split(/\s+/, $line);
  #Stop reading data at the end of the file, when $year is empty. This
  #gets you out of the datafile before the program chokes on the footer.
  exit unless $year;
  # Initialize and concatenate date as YYYYMMDD.
  $date = $year . $month . $day;
  # Initialize and calculate day of the year (doy)
  $doy = Day_of_Year($year, $month, $day);
  #Switch T (trace) to 0.01 and M (missing) to -999
  if ($precip eq "T") { $precip = 0.01; }
  elsif ($precip eq "M") {$precip = -999; }
  if ($tmin eq "M") { $tmin = -999; }
  if ($tmax eq "M") { $tmax = -999; }
  if ($snowfall eq "M") { $snowfall = -999; }
  if ($snowdepth eq "M") { $snowdepth = -999 }
  if ($tmean eq "M") { $tmean = -999 }
  my $sth = $dbh->prepare("INSERT INTO weather (station_id, year, month, day, doy, date, precip, tmin, tmax, snowfall, snowdepth, tmean) VALUES
(?,?,?,?,?,?,?,?,?,?,?,?)");
You shouldn't call $dbh->prepare() inside the while loop; you only need to call it once before the loop starts.


I follow

$sth->execute($station_id, $year, $month, $day, $doy, $date, $precip,
$tmin, $tmax, $snowfall, $snowdepth, $tmean);
#print $station_id, $year, $month, $day, $doy, $date, $precip, $tmin,
#  $tmax, $snowfall, $snowdepth, $tmean;
}

#$sth->finish();

#Disconnect from database
$dbh->disconnect();


This may work better for you:

#!/usr/bin/perl -w
use strict;
use Date::Calc qw(Day_of_Year);
use DBI;

my $dbh = DBI->connect( 'DBI:Pg:dbname=met_data;host=localhost', 'pguser', 'pguser' )
    or die "Couldn't connect to PostgreSQL: $DBI::errstr ($DBI::err)\n";

my $sth = $dbh->prepare( 'INSERT INTO weather (station_id, year, month, day,
    doy, date, precip, tmin, tmax, snowfall, snowdepth, tmean) VALUES
    (?,?,?,?,?,?,?,?,?,?,?,?)' );


my $station_id = '';

while ( <> ) {

    # Part 1. Loop through the 11 header lines to identify the station id.

    # The station ID has the format of:
    # STATION: SOME_STATION, STATE   (Station ID: ######)
    if ( 1 .. 11 ) {
        $station_id = $1 if /\(Station ID:\s*(\S+)\)/;

It seems that this is more flexible, i.e. not dependent upon a certain number of header lines. Can you translate the  if /\(Station ID:\s*(\S+)\)/;  part though?
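
As a rough translation of that line: the trailing "if /.../" is a statement modifier, so the assignment only happens when the current line ($_) matches the pattern, and $1 holds whatever the parentheses captured. The sketch below spells the same pattern out with the /x flag so each piece can carry a comment; station_id_from is a hypothetical wrapper and the sample line is made up to follow the format given in the script's comment.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper around the match so it can be tried on sample lines.
# Returns the captured ID, or undef when the line does not match.
sub station_id_from {
    my ($line) = @_;
    return $line =~ m{
        \(              # a literal opening parenthesis
        Station[ ]ID:   # the literal text "Station ID:"
        \s*             # optional whitespace
        (\S+)           # capture a run of non-whitespace characters into $1
        \)              # a literal closing parenthesis
    }x ? $1 : undef;
}

my $sample = 'STATION: SOME_STATION, STATE   (Station ID: 123456)';
print station_id_from($sample), "\n";    # 123456
```

Because the pattern looks for the text itself, it finds the ID on whichever header line it appears, rather than relying on a fixed line number.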

        next;
        }

    # At eof close the input filehandle to reset $.
    if ( eof ) {
        close ARGV;
        next;
        }

I think this is supposed to allow the script to jump to the next file. Right? However, this script also reads the first file into the database, then stops.
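
For what it is worth, <> moves on to the next file by itself; what the eof / close ARGV block adds is resetting the line counter $. so that the "1 .. 11" header test fires again for each file. The sketch below only demonstrates that resetting behaviour, using two throwaway temp files; line_numbers is a hypothetical helper. It does not by itself explain why the run stopped after the first file. One thing that might be worth checking (an assumption, since the data files are not shown) is whether a footer line reaches Day_of_Year, or whether a leftover exit from the earlier version is still in the loop.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read the named files through <> and return the
# line numbers ($.) seen, optionally closing ARGV at each end of file.
sub line_numbers {
    my ( $reset, @files ) = @_;
    local @ARGV = @files;
    my @seen;
    while (<>) {
        push @seen, $.;
        # eof without parentheses tests the end of the current file only.
        close ARGV if $reset && eof;
    }
    return @seen;
}

# Two throwaway three-line files stand in for the station files.
my @files;
for ( 1 .. 2 ) {
    my ( $fh, $name ) = tempfile( UNLINK => 1 );
    print {$fh} "a\nb\nc\n";
    close $fh;
    push @files, $name;
}

print join( ' ', line_numbers( 1, @files ) ), "\n";    # 1 2 3 1 2 3
print join( ' ', line_numbers( 0, @files ) ), "\n";    # 1 2 3 4 5 6
```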

    # Part 2. Loop through the records and prepare SQL statement.

    my ( $year, $month, $day, $precip, $tmin, $tmax, $snowfall, $snowdepth,
        $tmean, $obstime, $datasource ) = split;

    # Initialize and concatenate date as YYYYMMDD.
    my $date = $year . $month . $day;

    # Initialize and calculate day of the year (doy)
    my $doy = Day_of_Year( $year, $month, $day );

    # Switch T (trace) to 0.01 and M (missing) to -999
    $precip = 0.01 if $precip eq 'T';
    for ( $precip, $tmin, $tmax, $snowfall, $snowdepth, $tmean ) {
        $_ = -999 if $_ eq 'M';
        }


Much more efficient. Thank you.
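
The loop works because foreach aliases $_ to each item in the list, so assigning to $_ changes the underlying variable in place. A small sketch of the same idea on an array; clean_readings is a hypothetical helper, and treating element 0 as the precipitation value is an assumption made for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: apply the T and M substitutions to a list of
# readings and return the cleaned copies. Element 0 is treated as precip.
sub clean_readings {
    my @values = @_;
    $values[0] = 0.01 if $values[0] eq 'T';    # trace precipitation
    for (@values) {
        # $_ is an alias for the current element, so this edits @values.
        $_ = -999 if $_ eq 'M';
    }
    return @values;
}

print join( ' ', clean_readings( 'T', 'M', 12, 'M' ) ), "\n";    # 0.01 -999 12 -999
```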

    $sth->execute( $station_id, $year, $month, $day, $doy, $date, $precip,
        $tmin, $tmax, $snowfall, $snowdepth, $tmean );
    #print join( "\t", $station_id, $year, $month, $day, $doy, $date, $precip,
    #    $tmin, $tmax, $snowfall, $snowdepth, $tmean ), "\n";
    }

#$sth->finish();

# Disconnect from database
$dbh->disconnect();

__END__



John
--
Perl isn't a toolbox, but a small machine shop where you can special-order certain sorts of tools at low cost and in short order. -- Larry Wall
--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
<http://learn.perl.org/> <http://learn.perl.org/first-response>


