On Mon, 11 Apr 2011, ty ty wrote:
Hello, dear experts. I don't have much experience in building
regression models, so sorry if this is too simple and not a very
interesting question.
Currently I'm working on a model that has to predict the proportion of
the debt returned by the debtor in some period of time. So the
dependent variable can be any number between 0 and 1, with a very high
probability of 0 (if there is no payment); if there are some
payments, it is very likely to be 1 (all debt paid), although it can be
any number from 0 to 1.
Not having much knowledge in this area, I can't think of any
appropriate model and wasn't able to find much on the Internet. Can
anyone give me some ideas about possible models, any information
online, and some R functions and packages that can implement them?
Thank you in advance for any help.
Beta regression is one possibility to model proportions in the open unit
interval (0, 1). It is available in R in the package "betareg":
http://CRAN.R-project.org/package=betareg
http://www.jstatsoft.org/v34/i02/
If 0 and 1 can occur, some authors have suggested scaling the response so
that 0 and 1 are avoided. See the paper linked above for an example. If,
however, there are many 0s and/or 1s, one might want to take a hurdle or
inflation-type approach. One such approach is implemented in the "gamlss"
package:
http://CRAN.R-project.org/package=gamlss
http://www.jstatsoft.org/v23/i07/
http://www.gamlss.org/
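
To make the rescaling idea mentioned above concrete, here is a minimal sketch in base R. It assumes the transformation (y * (n - 1) + 0.5) / n, with n the sample size, which is the one discussed in the betareg paper linked above; the helper name squeeze01 and the toy data are made up for illustration only.

```r
# Squeeze a response in [0, 1] into the open interval (0, 1).
# Assumed transformation: (y * (n - 1) + 0.5) / n, n = sample size.
# 'squeeze01' is a hypothetical helper name, not part of any package.
squeeze01 <- function(y) {
  stopifnot(is.numeric(y), all(y >= 0 & y <= 1))
  n <- length(y)
  (y * (n - 1) + 0.5) / n
}

# Made-up toy response with exact 0s and a 1
y <- c(0, 0, 0.25, 0.8, 1)
y_s <- squeeze01(y)

# All rescaled values are now strictly between 0 and 1
range(y_s)
```

Note that this only moves the boundary values slightly inward. With a large share of exact 0s and 1s, as in the debt example, the rescaled response would still be piled up near the boundaries, which is why the hurdle or inflation approach is probably the better fit here.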
The hurdle approach can be implemented using separate building blocks.
First, a binary regression model that captures whether the dependent
variable is greater than 0 (i.e., crosses the hurdle): glm(I(y > 0) ~ ...,
family = binomial). Second, a beta regression for only the observations in
(0, 1) that crossed the hurdle: betareg(y ~ ..., subset = y > 0 & y < 1).
The y < 1 condition matters if exact 1s occur, because betareg() requires
all responses to lie strictly inside (0, 1). A recent technical report
introduces such a family of models along with many further techniques
(specialized residuals and regression diagnostics) that are not yet
available in R:
http://arxiv.org/abs/1103.2372
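
The two building blocks described above could be put together roughly as follows. This is only a sketch on simulated data: the variable names (x, paid, d), the simulation settings, and the way the two parts are combined into an expected proportion are my own assumptions for illustration. For simplicity the simulated response has exact 0s but no exact 1s; with many 1s, a further binary part for y == 1 would be needed. The beta regression part only runs if "betareg" is installed.

```r
set.seed(1)

# Simulated data (illustrative only): many exact 0s, rest in (0, 1)
n <- 500
x <- rnorm(n)
paid <- rbinom(n, 1, plogis(-0.5 + x))      # 1 = some payment was made
y <- ifelse(paid == 1, rbeta(n, 2, 2), 0)   # proportion of debt returned
d <- data.frame(y = y, x = x)

# Part 1: binary model for crossing the hurdle (y > 0)
m_bin <- glm(I(y > 0) ~ x, family = binomial, data = d)

# Part 2: beta regression on the observations strictly inside (0, 1)
if (requireNamespace("betareg", quietly = TRUE)) {
  m_beta <- betareg::betareg(y ~ x, data = d, subset = y > 0 & y < 1)

  # Combine both parts: E[y] = P(y > 0) * E[y | y > 0]
  # (valid here because the simulated data contain no exact 1s)
  p_pos  <- predict(m_bin,  newdata = d, type = "response")
  mu_pos <- predict(m_beta, newdata = d, type = "response")
  d$y_hat <- p_pos * mu_pos
}

summary(m_bin)
```

Fitting the parts separately keeps each model easy to inspect with the usual tools (summary(), residual plots), at the cost of having to combine the predictions by hand.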
Best,
Z
Ihor.
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.