Hi all, I have a problem with binary response data in GLM fitting. The response "y" takes only the values 1 or 0, and if I use the logit link, the model works with the log of the odds, where the odds are p/(1-p). In my situation, if I think of "y" as "p", then the odds are sometimes 0 and sometimes "1/0", which is (or should be) undefined. So I wonder: how does R fit the GLM?
The FULL detail of this exercise is as follows:

----------------------------------------------------------------------------------------------------------
The data here are concerned with whether people default on a loan taken from a particular bank, for identical interest rates and for a fixed period. The information on each individual is their sex (male or female), their income (in pounds), whether the person is a home owner or not, their age (in years), and the amount of the loan (in pounds). The information recorded is whether the individual defaulted on the loan or not.

Study the data and try to understand the relation between a person's characteristics and defaulting. Specifically, what is your estimated probability that a female aged 42, who is not a home owner, has an income of 23,500, and took a loan of 12,000, defaults on the loan?

The table holding the data has headings as follows:

m/f:  male=1, female=0
age:  age in years
home: home=1 is a home owner, home=0 is not a home owner
inc:  income
loan: amount of loan
def:  default=1, non-default=0
----------------------------------------------------------------------------------------------------------

My R code:

    # Read the data and name the columns
    Q3 = read.table("tabl3.dat")
    colnames(Q3) = c("Sex", "Age", "Home", "Inc", "Loan", "Def")
    # Treat the 0/1 indicators as factors
    Q3$Sex = as.factor(Q3$Sex)
    Q3$Home = as.factor(Q3$Home)
    Q3$Def = as.factor(Q3$Def)
    # Logistic regression of default on all covariates
    Q3.mod = glm(Def ~ Sex + Age + Home + Inc + Loan, data = Q3, family = binomial(logit))

I don't really understand HOW R actually fits the model if there is a "1/0" that it has to calculate. It does give me some results, but they don't feel quite right to me.

Now, if I use the empirical logit, which has a 0.5 correction, log((y + 0.5)/(1 + 0.5 - y)), as the response and then regress it on the explanatory variables, I get an estimated probability of 0.49***** (after transforming the log odds back to p), whereas the previous model gives 0.

Am I wrong in the first place to think that the response is "y = default"? How should I approach this?

Thanks! The data are attached.
http://r.789695.n4.nabble.com/file/n3574478/tabl3.dat tabl3.dat

--
View this message in context: http://r.789695.n4.nabble.com/Binary-response-GLM-Question-tp3574478p3574478.html
Sent from the R help mailing list archive at Nabble.com.
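
On the question of how a binomial GLM can be fitted to 0/1 data at all, here is a minimal sketch under stated assumptions. It uses simulated stand-in data, not the values in tabl3.dat, and the names x, y, negloglik, fit_glm, fit_opt and p_new are made up for illustration. The point it tries to show is that the logit link is applied to the fitted probability p, not to the observed y, so log(y/(1-y)) is never evaluated: the Bernoulli log-likelihood only needs log(p) and log(1-p). Maximising that likelihood directly with optim() should give roughly the same coefficients as glm().

```r
# Simulated stand-in data (NOT the tabl3.dat values): 0/1 response, one covariate.
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))

# Bernoulli negative log-likelihood. The logit link acts on the fitted
# probability p = plogis(eta), so log(y / (1 - y)) is never computed.
negloglik <- function(beta) {
  eta <- beta[1] + beta[2] * x
  -sum(dbinom(y, size = 1, prob = plogis(eta), log = TRUE))
}

# Fit once with glm() and once by direct numerical maximisation.
fit_glm <- glm(y ~ x, family = binomial(link = "logit"))
fit_opt <- optim(c(0, 0), negloglik, method = "BFGS")

# Both routes should give nearly the same coefficients.
print(rbind(glm = coef(fit_glm), optim = fit_opt$par))

# Estimated probability at a new covariate value (hypothetical x = 0.3).
p_new <- predict(fit_glm, newdata = data.frame(x = 0.3), type = "response")
print(p_new)
```

For the model in the post, the analogous call would presumably be predict(Q3.mod, newdata = ..., type = "response"), with Sex and Home supplied as factors using the same levels as in Q3 and Age = 42, Inc = 23500, Loan = 12000. When the response is a factor, glm() treats the first level ("0" here) as failure, so the fitted probability refers to Def = 1.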