It looks to me as though you should be using the table or the xtabs
function. You have apparently already decided not to use NA for
missing values, so those functions will give you counts for the
instances in which variable1 == "":
dft <- data.frame(
    var1 = sample(c("", "this", "that", "and"), 120, replace=TRUE),
    dt = sample(seq(as.Date("2006-01-01"), as.Date("2007-12-31"),
                    by="months"),
                120, replace=TRUE))
# the == "" entry is the first row
mo.tbl <- xtabs(~ var1 + dt, data=dft)
> mo.tbl[1,]
2006-01-01 2006-02-01 2006-03-01 2006-04-01 2006-05-01 2006-06-01
         2          1          1          2          2          3
2006-07-01 2006-08-01 2006-09-01 2006-10-01 2006-11-01 2006-12-01
         1          0          1          1          1          2
2007-01-01 2007-02-01 2007-03-01 2007-04-01 2007-05-01 2007-06-01
         1          2          2          2          2          0
2007-07-01 2007-08-01 2007-09-01 2007-10-01 2007-11-01 2007-12-01
         1          3          4          1          3          2
# take the dates from the table itself so x has one entry per count
x <- as.Date(colnames(mo.tbl))
plot(mo.tbl[1,] ~ x)
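If you want percentages rather than counts, and want all the variables done in one pass, something along these lines might work. It is only a sketch under assumptions: the data frame ds is simulated here, with a month column called dos and two variables var1 and var2 that use "" for missing (the names follow your message, the data are invented), and the output file name is hypothetical. The mean of the logical vector v == "" within each month is the proportion missing for that month.

```r
set.seed(1)  # only so that the simulated data are reproducible

# Simulated stand-in for the real data (an assumption, not your file):
# 'dos' holds the month, var1 and var2 use "" for missing values.
months <- seq(as.Date("2006-01-01"), as.Date("2007-03-01"), by = "months")
ds <- data.frame(
  dos  = sample(months, 300, replace = TRUE),
  var1 = sample(c("", "a", "b"), 300, replace = TRUE),
  var2 = sample(c("", "x", "y", "z"), 300, replace = TRUE),
  stringsAsFactors = FALSE
)

# every column except the month column is treated as a variable
vars <- setdiff(names(ds), "dos")

# One row per month, one column per variable: percent of entries == ""
pct.missing <- sapply(ds[vars],
                      function(v) tapply(v == "", ds$dos, mean) * 100)

# the row names are the months, so they can serve as the x axis
x <- as.Date(rownames(pct.missing))

# One page per variable, all in a single PDF (hypothetical file name)
out <- file.path(tempdir(), "missing_by_month.pdf")
pdf(out)
for (v in vars) {
  plot(x, pct.missing[, v], type = "b", main = v,
       xlab = "month", ylab = "% missing")
}
dev.off()
```

With 250 variables the loop simply produces 250 pages. If your dos column holds strings such as "May-06", you would need to convert it to dates (or to a factor with the levels in calendar order) before plotting against it.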
--
David Winsemius
On Mar 24, 2009, at 2:58 AM, Shreyasee wrote:
Hi,
I have a dataset with 250 variables in all, and for each variable the
data is entered over the months.
I need to calculate the percentage of missing values for each variable
over each month and then plot a graph of that.
I am running the following code to do this:
ds <- read.csv(file="filepath", header=TRUE)
attach(ds)
may <- length(variable1[variable1==""]) /
    length(dos[dos=="May-06"]) * 100
jun <- length(variable1[variable1==""]) /
    length(dos[dos=="June-06"]) * 100
.
.
.
var1 <- c(may, jun, ...........)
x <- seq(as.Date("2006-01-01"), as.Date("2007-03-31"), by="months")
plot(var1~x)
In the same way, I am calculating the percentage of missing values for
each variable for each month, storing the values in separate
variables, and then combining those variables in one object for
plotting the graph.
I need to know whether I can combine all the variables from that
dataset in one object and calculate the percentage of missing values
over the months together, instead of creating different variables for
each month and then combining them.
Also, after doing that, I need to plot the graph for each variable and
combine the graphs in a single PDF file.
I highly appreciate all your help.
Thanks,
Shreyasee
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
Heritage Laboratories
West Hartford, CT