Dear R help folks,

First, my apologies for sending several related questions to the list server. I 
am trying to learn how to manipulate data in R and am having difficulty 
getting my program to work. I greatly appreciate the help and support list 
members give!

I am trying to write a program that runs through a data frame sorted by ID and 
creates a new variable, first, that is 1 for the first row of each group of 
rows sharing the same ID and 0 for all other rows in that group.

For example, if my original data are
 olddata
   ID date
    1     1
    1     1
    1     2
    1     2
    1     3
    1     3
    1     4
    1     4
    1     5
    1     5
    2     5
    2     5
    2     5
    2     6
    2     6
    2     6
    3   10
    3   10

the new data will be
newdata
   ID date  first
    1     1       1
    1     1       0
    1     2       0
    1     2       0
    1     3       0
    1     3       0
    1     4       0
    1     4       0
    1     5       0
    1     5       0
    2     5       1
    2     5       0
    2     5       0
    2     6       0
    2     6       0
    2     6       0
    3   10       1
    3   10       0
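
A minimal sketch of one way such a column could be built, assuming the rows
are already grouped by ID as in the example above. Base R's duplicated()
returns FALSE the first time a value appears and TRUE for later repeats, so
negating it marks the first row of each ID. This is only one possible
approach, shown on the example data:

```r
# Rebuild the example data shown above.
ID <- c(rep(1, 10), rep(2, 6), rep(3, 2))
date <- c(rep(1, 2), rep(2, 2), rep(3, 2), rep(4, 2), rep(5, 2),
          rep(5, 3), rep(6, 3), rep(10, 2))
olddata <- data.frame(ID = ID, date = date)

# duplicated() is FALSE for the first occurrence of each ID and TRUE for
# every later occurrence; negate it and convert the logical to 1/0.
newdata <- olddata
newdata$first <- as.integer(!duplicated(newdata$ID))
print(newdata)
```

Because duplicated() looks at values rather than at adjacent rows, this flags
the first appearance of each ID even if the rows are not sorted.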

When I run the program below, I receive the following error:
Error in df[, "ID"] : incorrect number of dimensions

My code:
# Create data.frame
ID <- c(rep(1,10),rep(2,6),rep(3,2))
date <- c(rep(1,2),rep(2,2),rep(3,2),rep(4,2),rep(5,2),
          rep(5,3),rep(6,3),rep(10,2))
olddata <- data.frame(ID=ID,date=date)
class(olddata)
cat("This is the original data frame","\n")
print(olddata)
 
# This function is supposed to identify the first row 
# within each level of ID and, for the first row, set
# the variable first to 1, and for all rows other than
# the first row set first to 0.
mydoit <- function(df){
  value <- ifelse (first(df[,"ID"]),1,0)
  cat("value=",value,"\n")
  df[,"first"] <- value
}
newdata <- aggregate(olddata,list(olddata[,"ID"]),mydoit)
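
A hedged note on the error, with a sketch of a per-group alternative. When
aggregate() is given a data frame, it calls FUN on one column of one group at
a time, so FUN receives a plain vector rather than a data frame, and
two-dimensional indexing such as df[,"ID"] fails on it. The sketch below
checks this and then uses split(), lapply() and rbind() instead. The helper
name flag_first is hypothetical and not part of the code above, and the
sketch assumes the rows are already sorted by ID, since split() returns the
groups in ID order.

```r
# Rebuild the example data from the post.
ID <- c(rep(1, 10), rep(2, 6), rep(3, 2))
date <- c(rep(1, 2), rep(2, 2), rep(3, 2), rep(4, 2), rep(5, 2),
          rep(5, 3), rep(6, 3), rep(10, 2))
olddata <- data.frame(ID = ID, date = date)

# Check what aggregate() passes to FUN: each call gets one column of one
# group, which is a vector and not a data frame.
seen <- aggregate(olddata, list(olddata[, "ID"]), function(x) is.data.frame(x))
print(seen)

# Hypothetical helper: takes all rows of one ID and flags the first of them.
flag_first <- function(df) {
  df[, "first"] <- c(1, rep(0, nrow(df) - 1))
  df  # return the modified data frame explicitly
}

# split() yields one data frame per ID; do.call(rbind, ...) stacks them again.
pieces <- lapply(split(olddata, olddata[, "ID"]), flag_first)
newdata <- do.call(rbind, pieces)
rownames(newdata) <- NULL  # drop the "1.1", "1.2", ... names added by rbind
print(newdata)
```

Note that the helper ends with df on its own line: a function whose last
statement is an assignment returns only the assigned value, not the modified
data frame.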

Thank you,
John


John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;
Associate Director for Biostatistics and Informatics, Baltimore VA Medical 
Center Geriatrics Research, Education, and Clinical Center; 
PI Biostatistics and Informatics Core, University of Maryland School of 
Medicine Claude D. Pepper Older Americans Independence Center;
Senior Statistician University of Maryland Center for Vascular Research;

Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382


