On May 27, 2015, at 2:49 PM, Kengo Inagaki wrote:

> Thank you very much for your rapid response. I sincerely appreciate your
> input.
> I am sorry for sending the previous email in HTML format.
>
> with(a, table(Sex, Therapy1) ) shows the following.
>
>         Therapy1
> Sex      no yes
>   female  6   7
>   male    7   5
>
> and with(a, table(Therapy1, Outcome) )
> elicit the following
>
>         Outcome
> Sex      Alive Death
>   female     4     9
>   male       9     3
>
>         Outcome
> Therapy1 Alive Death
>      no      4     9
>      yes     9     3

Then what about:

with(a, table(Sex, Therapy1, Outcome) )

--
David

>
> As there are no zero cells, it does not seem to be complete separation.
> I really appreciate comments.
>
> Kengo Inagaki
> Memphis, TN
>
>
> 2015-05-27 13:57 GMT-05:00 David Winsemius <dwinsem...@comcast.net>:
>>
>> On May 27, 2015, at 10:10 AM, Kengo Inagaki wrote:
>>
>>> I am currently working on a health care related project using R. I am
>>> learning R while working on data analysis.
>>>
>>> Below is the part of the data in which I am encountering a problem.
>>>
>>> Case# Sex Therapy1 Therapy2 Outcome
>>>
>>> 1 male no
>>> no Alive
>>>
>>
>> snipped mangled data sent in HTML
>>
>>>
>>> "Outcome" is the response variable and "Sex", "Therapy1", "Therapy2" are
>>> predictor variables.
>>>
>>> All of the predictors are significantly associated with the outcome by
>>> univariate analysis.
>>>
>>> Logistic regression runs fine with most of the predictors when "Sex" and
>>> "Therapy1" are not included at the same time (This is a part of table that
>>> I cut out from a larger table for ease of
>>> presentation and there are more predictors that I tested).
>>
>> Please examine the data before reaching for ridge regression:
>>
>> What does this show: ...
>>
>> with(a, table(Sex, Therapy1) )
>>
>> I predict you will see a zero cell entry. Then read about "complete
>> separation" and the so-called "Hauck-Donner effect".
>>
>> --
>> David.
>>>
>>> However, when "Sex" and "Therapy1" are included in logistic regression
>>> model at the same time, standard error inflates and p value gets close to 1.
>>>
>>> The formula used is,
>>>
>>>> Model<-glm(Outcome~Sex+Therapy1,data=a,family=binomial) #I assigned a
>>> vector "a" to represent above table.
>>>
>>> After doing some reading, I suspect this might be collinearity, as vif
>>> values (using "vif()" function in car package) were sky high (8,875,841 for
>>> both "Sex" and "Therapy1").
>>>
>>> Learning that ridge regression may be a solution, I attempted using
>>> logisticRidge {ridge} using the following formula, but I get the
>>> accompanying error message.
>>>
>>>> logisticRidge(a$Outcome~a$Sex+a$Therapy1)
>>>
>>> Error in ifelse(y, log(p), log(1 - p)) :
>>>
>>> invalid to change the storage mode of a factor
>>>
>>> At this point I do not have an idea how to solve this and would like to
>>> seek help.
>>>
>>> I really really appreciate your input!!!
>>>
>>
>>
>> David Winsemius
>> Alameda, CA, USA
>>

David Winsemius
Alameda, CA, USA
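
A note on the three-way table asked for in this thread. Assuming the three two-way tables quoted above were transcribed correctly, their margins appear to leave only one possible set of three-way counts, and that set does contain zero cells: no female without Therapy1 is Alive, and no male with Therapy1 is Death. The sketch below rebuilds a 25-row stand-in for `a` from those implied counts, because the original rows were never posted in readable form. The names `cells` and `n` are invented for illustration and are not from the original post.

```r
# Stand-in for the poster's data frame `a`, rebuilt from the cell counts
# implied by the three two-way tables quoted in this thread (an assumption:
# the real rows were never posted in readable form).
cells <- expand.grid(
  Sex      = c("female", "male"),
  Therapy1 = c("no", "yes"),
  Outcome  = c("Alive", "Death"),
  stringsAsFactors = TRUE
)

# expand.grid varies Sex fastest, then Therapy1, then Outcome, so the counts
# below are: Alive (F/no, M/no, F/yes, M/yes), then Death in the same order.
cells$n <- c(0, 4, 4, 5,
             6, 3, 3, 0)

# One row per patient; cells with a count of 0 contribute no rows.
a <- cells[rep(seq_len(nrow(cells)), times = cells$n),
           c("Sex", "Therapy1", "Outcome")]
rownames(a) <- NULL

# The two-way margins should reproduce the tables quoted in the thread.
print(with(a, table(Sex, Therapy1)))
print(with(a, table(Sex, Outcome)))
print(with(a, table(Therapy1, Outcome)))

# The three-way table that was asked for; ftable() prints it flat.
tab3 <- with(a, table(Sex, Therapy1, Outcome))
print(ftable(tab3))

# The model from the original post, fitted to the stand-in data.
Model <- glm(Outcome ~ Sex + Therapy1, data = a, family = binomial)
print(coef(summary(Model)))
```

With counts like these, an additive logit model in Sex and Therapy1 has no finite maximum-likelihood estimate: the likelihood keeps improving as the fitted probabilities for the two all-or-nothing cells move toward 0 and 1. That would be consistent with the inflated standard errors and p-values near 1 described in the original post, even though none of the two-way tables shows a zero cell.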