Colleagues,

The by function in the R program below is not giving me the sums I expect to see, viz.,

382+170=552
4730+497=5227
5+6=11
199+25=224

###################################################
# full R program:
mydata <- data.frame(covid=c(0,0,0,0,1,1,1,1),
                     sex=(rep(c(1,1,0,0),2)),
                     status=rep(c(1,0),2),
                     values=c(382,4730,5,199,170,497,6,25))
mydata
by(mydata,list(mydata$sex,mydata$status),sum)
by(mydata,list(mydata$sex,mydata$status),print)
###################################################

More complete explanation of my question:

I have created a simple dataframe having three factors:

mydata <- data.frame(covid=c(0,0,0,0,1,1,1,1),
                     sex=(rep(c(1,1,0,0),2)),
                     status=rep(c(1,0),2),
                     values=c(382,4730,5,199,170,497,6,25))

> mydata
  covid sex status values
1     0   1      1    382
2     0   1      0   4730
3     0   0      1      5
4     0   0      0    199
5     1   1      1    170
6     1   1      0    497
7     1   0      1      6
8     1   0      0     25

When I use the by function with sum as an argument, I don’t get the sums that I would expect based either on the listing of the dataframe above, or on using by with print as an argument:

> by(mydata,list(mydata$sex,mydata$status),sum)
: 0
: 0
[1] 225
-------------------------------------------------------------------------------
: 1
: 0
[1] 5230
-------------------------------------------------------------------------------
: 0
: 1
[1] 14
-------------------------------------------------------------------------------
: 1
: 1
[1] 557

I expected to see the following sums:

382+170=552
4730+497=5227
5+6=11
199+25=224

which, as can be seen from the output above, I am not getting.

Using print as an argument to the by function, I get the values grouped as I would expect, but for some reason the values are printed twice:

> by(mydata,list(mydata$sex,mydata$status),print)
  covid sex status values
4     0   0      0    199
8     1   0      0     25
  covid sex status values
2     0   1      0   4730
6     1   1      0    497
  covid sex status values
3     0   0      1      5
7     1   0      1      6
  covid sex status values
1     0   1      1    382
5     1   1      1    170
: 0
: 0
  covid sex status values
4     0   0      0    199
8     1   0      0     25
-------------------------------------------------------------------------------
: 1
: 0
  covid sex status values
2     0   1      0   4730
6     1   1      0    497
-------------------------------------------------------------------------------
: 0
: 1
  covid sex status values
3     0   0      1      5
7     1   0      1      6
-------------------------------------------------------------------------------
: 1
: 1
  covid sex status values
1     0   1      1    382
5     1   1      1    170

What am I doing wrong, or what don’t I understand about the by function?

Thank you,
John

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine
Division of Gerontology and Geriatric Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
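
A possible explanation, offered as a sketch rather than a definitive answer: by() passes each subset to FUN as a complete data frame, so sum() adds every column in the subset (covid, sex, status and values), not only values. For sex=0, status=0 that gives 1+0+0+224=225, which matches the output in the post. The double printing is probably because print() prints its argument and also returns it invisibly; by() collects those return values, and the resulting object is then auto-printed at the console. The names grp00, sums_by and sums_agg below are illustrative and do not appear in the original program.

```r
# Same data frame as in the post.
mydata <- data.frame(covid  = c(0, 0, 0, 0, 1, 1, 1, 1),
                     sex    = rep(c(1, 1, 0, 0), 2),
                     status = rep(c(1, 0), 2),
                     values = c(382, 4730, 5, 199, 170, 497, 6, 25))

# One subset as by() would pass it to FUN: rows 4 and 8 (sex 0, status 0).
grp00 <- mydata[mydata$sex == 0 & mydata$status == 0, ]
sum(grp00)         # 225: adds covid, sex, status and values together
sum(grp00$values)  # 224: adds the values column only

# Restrict the summation to the values column inside by().
sums_by <- by(mydata,
              list(sex = mydata$sex, status = mydata$status),
              function(d) sum(d$values))
sums_by

# An alternative that returns one row per sex/status combination.
sums_agg <- aggregate(values ~ sex + status, data = mydata, FUN = sum)
sums_agg

# Wrapping the call in invisible() suppresses the auto-printed by object,
# so each subset is printed once (by print itself) instead of twice.
invisible(by(mydata, list(mydata$sex, mydata$status), print))
```

Summing only the values column gives 224, 5227, 11 and 552 for the four sex/status groups. Each by() result in the post exceeds the corresponding figure by the sum of that group's covid, sex and status columns.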