Hello all,
I am having an issue with my attempts to accurately filter some data from a CSV
file I am importing. I have attached both a sample of the CSV data and my
script.
The attached CSV file contains two rows and 27 columns of data. The first
column is the station ID "BLS", the second column is the sensor number "4", the
third column is the date, and the remaining 24 columns are hourly temperature
readings.
In my attached script, I read in row[3:] to extract just the temperatures, do a
sanity check to make sure there are 24 values, remove any missing or "m"
values, and then append the non-missing values into the "hour_list".
Strangely, the first seven rows appear to be empty after reading in the
CSV file, which is why I had to incorporate the if len(temps) == 24
statement.
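One way to check what csv.reader actually returns for each line is to print the number of fields per row. The following is only a minimal sketch, not my attached script: it is written for Python 3, the names sample_text and field_counts are hypothetical, and sample_text holds just two of the leading lines and one data line copied from the attached file.

```python
import csv
import io

# Hypothetical stand-in for the attached file: two of its leading lines
# followed by one data line that ends in a backslash.
sample_text = (
    "{\\rtf1\\ansi\\ansicpg1252\\cocoartf1038\\cocoasubrtf350\n"
    "\\deftab720\n"
    "BLS,4,19981102,34,32,33,32,34,32,33,32,34,38,40,41,44,47,"
    "43,42,39,36,35,35,36,36,35,33\\\n"
)

def field_counts(text):
    """Return the number of comma-separated fields csv.reader sees on each line."""
    return [len(row) for row in csv.reader(io.StringIO(text))]

counts = field_counts(sample_text)
print(counts)  # [1, 1, 27]
```

Rows with a single field leave row[3:] empty, which matches the empty temps lists I see before the first data row.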
But the real issue is that for days with no missing values, for example the
second row of data, the length of the hour_list should be 24. My script,
however, is returning 23. I think this is because the end-of-row values have a
trailing "\". This must cause those values to be treated as non-digits, so they
are dropped by my "isdig" filter line. I've tried several ways to remove this
trailing "\", but without success.
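A minimal sketch of the filtering step I am aiming for, assuming the only problem is a single trailing backslash on the last field of each row. It is written for Python 3, the helper name clean_temps is hypothetical and not part of the attached script, and the two data lines are copied from the attached sample.

```python
def clean_temps(temps):
    """Strip a trailing backslash from each value and keep only digit strings."""
    hour_list = []
    for val in temps:
        # Drop surrounding whitespace, then any trailing "\" characters.
        val = val.strip().rstrip("\\")
        # isdigit() rejects the missing marker "m" (and also negative numbers).
        if val.isdigit():
            hour_list.append(int(val))
    return hour_list

# The two data lines from the attached sample, including the trailing backslash.
line1 = ("BLS,4,19981101,37,m,36,34,36,35,34,34,35,36,38,39,43,42,"
         "42,42,38,36,34,32,33,33,35,34\\")
line2 = ("BLS,4,19981102,34,32,33,32,34,32,33,32,34,38,40,41,44,47,"
         "43,42,39,36,35,35,36,36,35,33\\")

for line in (line1, line2):
    temps = line.split(",")[3:]      # skip station ID, sensor ID, and date
    hours = clean_temps(temps)
    print(len(hours), max(hours))    # prints "23 43", then "24 47"
```

Converting each value to int lets max() compare numerically rather than as strings. Note that isdigit() would also discard negative temperatures such as "-5", so this sketch only suits data where every reading is a non-negative integer.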
Do you have any suggestions on how to fix this issue?
Many thanks in advance,
Neil Berg
# Purpose: read in a CSV file containing hourly temps. at each station,
# then append non-missing hourly data into a list and find the maximum
# value of that list
#---------------------------------------------------------------------
import csv
from numpy import *
f = csv.reader(open('csv_sample.csv','rb'))
for row in f:
    temps = row[3:] #extract hourly temps, neglect station ID, sensor ID, and date
    #print temps # you see here that the first seven rows are empty
    if len(temps) == 24: #only keep rows with 24 temps in them
        hour_list = [] #empty list of all integer hourly temps, i.e. exclude missing "m" values
        for val in temps:
            #print val #here you can see that the end-of-row values have a trailing "\"
            #--------------------------------------------------------------------
            # This is where I want to strip the trailing "\" before removing any
            # missing or "m" values
            #--------------------------------------------------------------------
            isdig = str.isdigit(val)
            if isdig is True:
                hour_list.append(val)
        print len(hour_list) #should be 24 for rows with no missing values, but it's 23 as is
{\rtf1\ansi\ansicpg1252\cocoartf1038\cocoasubrtf350
{\fonttbl\f0\fmodern\fcharset0 Courier;}
{\colortbl;\red255\green255\blue255;}
\margl1440\margr1440\vieww14540\viewh7680\viewkind0
\deftab720
\pard\pardeftab720\ql\qnatural
\f0\fs24 \cf0 BLS,4,19981101,37,m,36,34,36,35,34,34,35,36,38,39,43,42,42,42,38,36,34,32,33,33,35,34\
BLS,4,19981102,34,32,33,32,34,32,33,32,34,38,40,41,44,47,43,42,39,36,35,35,36,36,35,33\
}
--
http://mail.python.org/mailman/listinfo/python-list