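by() passes **data frame** subsets to the function -- sum() is acting on these whole frames, adding up every column in them (covid, sex, status and values), not just values. That is why the first group comes out as 225 rather than 224. Try this instead: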
> by(mydata,list(mydata$sex,mydata$status),function(x)sum(x$values))
: 0
: 0
[1] 224
-----------------------------------------------------------
: 1
: 0
[1] 5227
-----------------------------------------------------------
: 0
: 1
[1] 11
-----------------------------------------------------------
: 1
: 1
[1] 552

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Thu, Jul 23, 2020 at 3:25 PM Sorkin, John <jsor...@som.umaryland.edu> wrote:

> Colleagues,
>
> The by function in the R program below is not giving me the sums
> I expect to see, viz.,
> 382+170=552
> 4730+170=4900
> 5+6=11
> 199+25=224
> ###################################################
> #full R program:
> mydata <- data.frame(covid=c(0,0,0,0,1,1,1,1),
>                      sex=(rep(c(1,1,0,0),2)),
>                      status=rep(c(1,0),2),
>                      values=c(382,4730,5,199,170,497,6,25))
> mydata
> by(mydata,list(mydata$sex,mydata$status),sum)
> by(mydata,list(mydata$sex,mydata$status),print)
> ###################################################
>
> More complete explanation of my question
>
> I have created a simple dataframe having three factors:
> mydata <- data.frame(covid=c(0,0,0,0,1,1,1,1),
>                      sex=(rep(c(1,1,0,0),2)),
>                      status=rep(c(1,0),2),
>                      values=c(382,4730,5,199,170,497,6,25))
>
> > mydata
>   covid sex status values
> 1     0   1      1    382
> 2     0   1      0   4730
> 3     0   0      1      5
> 4     0   0      0    199
> 5     1   1      1    170
> 6     1   1      0    497
> 7     1   0      1      6
> 8     1   0      0     25
>
> When I use the by function with a sum as an argument, I don’t
> get the sums that I would expect to
> receive based either on the listing of the dataframe above,
> or from using by with print as an argument:
>
> > by(mydata,list(mydata$sex,mydata$status),sum)
> : 0
> : 0
> [1] 225
> -------------------------------------------------------------------------------
>
> : 1
> : 0
> [1] 5230
> -------------------------------------------------------------------------------
>
> : 0
> : 1
> [1] 14
> -------------------------------------------------------------------------------
>
> : 1
> : 1
> [1] 557
>
> I expected to see the following sums:
> 382+170=552
> 4730+170=4900
> 5+6=11
> 199+25=224
> Which as can be seen by the output above, I am not getting.
>
> Using print as an argument to the by function, I get the values
> grouped as I would expect, but for some reason I get a double
> printing of the values!
>
> > by(mydata,list(mydata$sex,mydata$status),print)
>   covid sex status values
> 4     0   0      0    199
> 8     1   0      0     25
>   covid sex status values
> 2     0   1      0   4730
> 6     1   1      0    497
>   covid sex status values
> 3     0   0      1      5
> 7     1   0      1      6
>   covid sex status values
> 1     0   1      1    382
> 5     1   1      1    170
> : 0
> : 0
>   covid sex status values
> 4     0   0      0    199
> 8     1   0      0     25
> -------------------------------------------------------------------------------
>
> : 1
> : 0
>   covid sex status values
> 2     0   1      0   4730
> 6     1   1      0    497
> -------------------------------------------------------------------------------
>
> : 0
> : 1
>   covid sex status values
> 3     0   0      1      5
> 7     1   0      1      6
> -------------------------------------------------------------------------------
>
> : 1
> : 1
>   covid sex status values
> 1     0   1      1    382
> 5     1   1      1    170
>
> What am I doing wrong, or what don’t I understand
> About the by function?
>
> Thank you
> John
>
> John David Sorkin M.D., Ph.D.
> Professor of Medicine
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology and
> Geriatric Medicine
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
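
For what it's worth, the 225 versus 224 discrepancy in the question can be reproduced by hand. The sketch below pulls out one group (sex 0, status 0) roughly as by() would pass it to the function, then compares sum() over the whole frame with sum() over the values column alone. The names grp, totals and agg are made up for illustration, and the tapply() and aggregate() calls are just alternative base R routes to the same per-group totals, not something suggested in the thread.

```r
# Data frame copied from the original post.
mydata <- data.frame(covid  = c(0, 0, 0, 0, 1, 1, 1, 1),
                     sex    = rep(c(1, 1, 0, 0), 2),
                     status = rep(c(1, 0), 2),
                     values = c(382, 4730, 5, 199, 170, 497, 6, 25))

# One group, selected by hand: sex == 0 and status == 0 (rows 4 and 8).
# "grp" is a hypothetical name used only for this illustration.
grp <- mydata[mydata$sex == 0 & mydata$status == 0, ]

colSums(grp)     # per-column totals: covid 1, sex 0, status 0, values 224
sum(grp)         # 225: every column of the subset added together
sum(grp$values)  # 224: the values column only

# Alternatives that sum only the values column within each sex/status group.
totals <- tapply(mydata$values,
                 list(sex = mydata$sex, status = mydata$status),
                 sum)
agg <- aggregate(values ~ sex + status, data = mydata, FUN = sum)

print(totals)
print(agg)
```

The extra 1 in 225 is the covid column of row 8 leaking into the total; the same thing accounts for 5230, 14 and 557 in the other three groups.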