Hello,
These two give the same results:
aggregate(values ~ sex + status, mydata, sum)
#  sex status values
#1   0      0    224
#2   1      0   5227
#3   0      1     11
#4   1      1    552
by(mydata$values, list(mydata$sex, mydata$status), sum)
#: 0
#: 0
#[1] 224
#------------------------------------------------------------
#: 1
#: 0
#[1] 5227
#------------------------------------------------------------
#: 0
#: 1
#[1] 11
#------------------------------------------------------------
#: 1
#: 1
#[1] 552
So Duncan is right: the 2nd sum in your expected output is wrong. It adds
row 5's value (170) where row 6's value (497) belongs. The right sum is
mydata rows 2 and 6: 4730 + 497 == 5227
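As far as I can tell, the sums in the original output (225, 5230, 14, 557) are what you get when sum is applied to every column of each subset instead of to values alone, since by(mydata, ...) hands the whole sub-data-frame to the function. A minimal sketch, reusing mydata from the original post (the names sub, all_cols and values_only are only for illustration):

```r
mydata <- data.frame(covid=c(0,0,0,0,1,1,1,1),
                     sex=(rep(c(1,1,0,0),2)),
                     status=rep(c(1,0),2),
                     values=c(382,4730,5,199,170,497,6,25))

# The subset that by() builds for sex == 0, status == 0 (rows 4 and 8)
sub <- mydata[mydata$sex == 0 & mydata$status == 0, ]

# sum() on a data frame adds covid, sex, status and values together
all_cols <- sum(sub)

# sum() on the single column adds only the values
values_only <- sum(sub$values)

all_cols     # 225 = (0 + 1) + (0 + 0) + (0 + 0) + (199 + 25)
values_only  # 224 = 199 + 25
```

The same reasoning should account for the other three groups, which is why passing mydata$values to by gives the expected sums.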
Another option, returning a matrix,
tapply(mydata$values, list(mydata$sex, mydata$status), sum)
#     0   1
#0  224  11
#1 5227 552
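Because the tapply result is a matrix with sex in the rows and status in the columns, a single sum can be picked out by its labels. A small sketch, again with mydata from the original post (res and the list names sex and status are my own additions for readability):

```r
mydata <- data.frame(covid=c(0,0,0,0,1,1,1,1),
                     sex=(rep(c(1,1,0,0),2)),
                     status=rep(c(1,0),2),
                     values=c(382,4730,5,199,170,497,6,25))

# Naming the list elements labels the dimensions of the result
res <- tapply(mydata$values,
              list(sex = mydata$sex, status = mydata$status),
              sum)

# Index by the group labels, which are character strings
res["1", "0"]  # sex == 1, status == 0: 4730 + 497 = 5227
res["0", "1"]  # sex == 0, status == 1: 5 + 6 = 11
```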
Hope this helps,
Rui Barradas
At 23:15 on 23/07/2020, Sorkin, John wrote:
Colleagues,
The by function in the R program below is not giving me the sums
I expect to see, viz.,
382+170=552
4730+170=4900
5+6=11
199+25=224
###################################################
#full R program:
mydata <- data.frame(covid=c(0,0,0,0,1,1,1,1),
                     sex=(rep(c(1,1,0,0),2)),
                     status=rep(c(1,0),2),
                     values=c(382,4730,5,199,170,497,6,25))
mydata
by(mydata,list(mydata$sex,mydata$status),sum)
by(mydata,list(mydata$sex,mydata$status),print)
###################################################
More complete explanation of my question
I have created a simple dataframe having three factors:
mydata <- data.frame(covid=c(0,0,0,0,1,1,1,1),
                     sex=(rep(c(1,1,0,0),2)),
                     status=rep(c(1,0),2),
                     values=c(382,4730,5,199,170,497,6,25))
> mydata
  covid sex status values
1     0   1      1    382
2     0   1      0   4730
3     0   0      1      5
4     0   0      0    199
5     1   1      1    170
6     1   1      0    497
7     1   0      1      6
8     1   0      0     25
When I use the by function with sum as an argument, I don’t
get the sums that I would expect based either on the listing
of the dataframe above or on using by with print as an argument:
by(mydata,list(mydata$sex,mydata$status),sum)
: 0
: 0
[1] 225
-------------------------------------------------------------------------------
: 1
: 0
[1] 5230
-------------------------------------------------------------------------------
: 0
: 1
[1] 14
-------------------------------------------------------------------------------
: 1
: 1
[1] 557
I expected to see the following sums:
382+170=552
4730+170=4900
5+6=11
199+25=224
As can be seen from the output above, these are not the sums I am getting.
Using print as an argument to the by function, I get the values
grouped as I would expect, but for some reason I get a double
printing of the values!
by(mydata,list(mydata$sex,mydata$status),print)
  covid sex status values
4     0   0      0    199
8     1   0      0     25
  covid sex status values
2     0   1      0   4730
6     1   1      0    497
  covid sex status values
3     0   0      1      5
7     1   0      1      6
  covid sex status values
1     0   1      1    382
5     1   1      1    170
: 0
: 0
  covid sex status values
4     0   0      0    199
8     1   0      0     25
-------------------------------------------------------------------------------
: 1
: 0
  covid sex status values
2     0   1      0   4730
6     1   1      0    497
-------------------------------------------------------------------------------
: 0
: 1
  covid sex status values
3     0   0      1      5
7     1   0      1      6
-------------------------------------------------------------------------------
: 1
: 1
  covid sex status values
1     0   1      1    382
5     1   1      1    170
What am I doing wrong, or what don’t I understand
about the by function?
Thank you
John
John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.