Thank you very much for your rapid response. I sincerely appreciate your input. I am sorry for sending the previous email in HTML format.
with(a, table(Sex, Therapy1)) shows the following.

        Therapy1
Sex      no yes
  female  6   7
  male    7   5

with(a, table(Sex, Outcome)) and with(a, table(Therapy1, Outcome)) give
the following.

        Outcome
Sex      Alive Death
  female     4     9
  male       9     3

        Outcome
Therapy1 Alive Death
  no         4     9
  yes        9     3

As there are no zero cells, it does not seem to be complete separation.
I would really appreciate any comments.

Kengo Inagaki
Memphis, TN

2015-05-27 13:57 GMT-05:00 David Winsemius <dwinsem...@comcast.net>:
>
> On May 27, 2015, at 10:10 AM, Kengo Inagaki wrote:
>
>> I am currently working on a health care related project using R. I am
>> learning R while working on data analysis.
>>
>> Below is the part of the data in which I am encountering a problem.
>>
>> Case# Sex  Therapy1 Therapy2 Outcome
>> 1     male no       no       Alive
>
> snipped mangled data sent in HTML
>
>> "Outcome" is the response variable and "Sex", "Therapy1", "Therapy2" are
>> predictor variables.
>>
>> All of the predictors are significantly associated with the outcome by
>> univariate analysis.
>>
>> Logistic regression runs fine with most of the predictors when "Sex" and
>> "Therapy1" are not included at the same time (this is a part of a table
>> that I cut out from a larger table for ease of presentation, and there
>> are more predictors that I tested).
>
> Please examine the data before reaching for ridge regression:
>
> What does this show: ...
>
> with(a, table(Sex, Therapy1) )
>
> I predict you will see a zero cell entry. Then read about "complete
> separation" and the so-called "Hauck-Donner effect".
>
> --
> David.
>>
>> However, when "Sex" and "Therapy1" are included in the logistic
>> regression model at the same time, the standard error inflates and the
>> p value gets close to 1.
>>
>> The formula used is,
>>
>>> Model<-glm(Outcome~Sex+Therapy1,data=a,family=binomial) #I assigned a vector "a" to represent above table.
>>
>> After doing some reading, I suspect this might be collinearity, as vif
>> values (using the "vif()" function in the car package) were sky high
>> (8,875,841 for both "Sex" and "Therapy1").
>>
>> Learning that ridge regression may be a solution, I attempted using
>> logisticRidge {ridge} with the following formula, but I get the
>> accompanying error message.
>>
>>> logisticRidge(a$Outcome~a$Sex+a$Therapy1)
>>
>> Error in ifelse(y, log(p), log(1 - p)) :
>>   invalid to change the storage mode of a factor
>>
>> At this point I do not have an idea how to solve this and would like to
>> seek help.
>>
>> I really really appreciate your input!!!
>>
>> [[alternative HTML version deleted]]
>
> David Winsemius
> Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
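
A possible follow-up to the three two-way tables in the reply above, sketched in base R. Two-way tables can be free of zero cells while the three-way table is not, so cross-tabulating Sex, Therapy1 and Outcome together may be worth a look. The margins reported above are consistent with only one set of three-way cell counts, so the sketch rebuilds a stand-in data frame from them. The name `a2` and the rebuilt rows are assumptions derived from those margins, not the original data; on the real data the same check would be with(a, table(Sex, Therapy1, Outcome)).

```r
# Stand-in data: the only three-way cell counts that reproduce all three
# two-way tables shown above (derived from the margins, not the real data).
cells <- data.frame(
  Sex      = c("female", "female", "female", "female",
               "male", "male", "male", "male"),
  Therapy1 = c("no", "no", "yes", "yes", "no", "no", "yes", "yes"),
  Outcome  = c("Alive", "Death", "Alive", "Death",
               "Alive", "Death", "Alive", "Death"),
  n        = c(0, 6, 4, 3, 4, 3, 5, 0)
)

# Expand the counts into one row per case (25 rows); `a2` is a hypothetical name.
a2 <- cells[rep(seq_len(nrow(cells)), cells$n), c("Sex", "Therapy1", "Outcome")]
a2[] <- lapply(a2, factor)

# The two-way tables from the message, reproduced from the stand-in data.
print(with(a2, table(Sex, Therapy1)))
print(with(a2, table(Sex, Outcome)))
print(with(a2, table(Therapy1, Outcome)))

# The additional check: all three variables at once.
three_way <- with(a2, table(Sex, Therapy1, Outcome))
print(ftable(three_way))
```

In this reconstruction the female/no group has no "Alive" cases and the male/yes group has no "Death" cases. If the real data show the same pattern, a model with both Sex and Therapy1 would have no finite maximum-likelihood estimates, which would be consistent with the inflated standard errors described in the original question.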