Hi, I have a dataset in which each person received one of two treatments, and the response for each person is a proportion (the percentage of a certain number of markers that were positive). I also have the number of positive and negative markers for each person. What is the best way to analyze this kind of data?
I can think of analyzing this data using glm() with the attached dataset:

    test <- read.table('test.txt', sep = '\t', header = TRUE)
    fit <- glm(cbind(positive, total - positive) ~ treatment, data = test, family = binomial)
    summary(fit)
    anova(fit, test = 'Chisq')

First, is this still called logistic regression, or something else? I thought that with logistic regression the response variable is a binary factor.

Second, summary(fit) and anova(fit, test = 'Chisq') gave me different p-values. Why is that, and which one should I use?

Third, is there an equivalent model where I can use the variable "percentage" instead of "positive" and "total"?

Finally, what is the best way to analyze this kind of dataset, where the setup is almost the same as ANOVA except that the response variable is a proportion (or counts of successes and failures)?

Thanks,
John
"treatment" "total" "positive" "percentage" "1" "exposed" 11 4 0.363636363636364 "2" "exposed" 10 4 0.4 "3" "exposed" 9 4 0.444444444444444 "4" "exposed" 7 4 0.571428571428571 "5" "exposed" 7 4 0.571428571428571 "6" "exposed" 6 5 0.833333333333333 "8" "exposed" 12 7 0.583333333333333 "9" "exposed" 8 5 0.625 "10" "exposed" 13 12 0.923076923076923 "11" "exposed" 10 5 0.5 "12" "control" 10 1 0.1 "13" "control" 11 2 0.181818181818182 "14" "control" 8 0 0 "16" "control" 12 1 0.0833333333333333 "15" "control" 8 0 0 "17" "control" 10 1 0.1 "18" "control" 10 1 0.1 "19" "control" 8 1 0.125 "20" "control" 8 0 0 "21" "control" 9 1 0.111111111111111 "22" "control" 10 1 0.1