Thanks for the very insightful thoughts. What I am trying to achieve with the weights is actually not new; it is something like https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances. I thought my inquiry was not too unusual, and that I could make use of some existing code. At the end of the day it is just an optimization problem, isn't it? Thanks
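Concretely, here is a minimal sketch of the optimization I have in mind, maximizing a weighted binomial log-likelihood directly with optim(). The data x, y and weights w below are only placeholders:

## Weighted logistic regression as a direct optimization problem.
set.seed(1)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(0.5 * x))
w <- runif(100)                 # per-observation weights, all < 1

negwll <- function(beta) {
  eta <- beta[1] + beta[2] * x             # linear predictor
  -sum(w * (y * eta - log1p(exp(eta))))    # minus the weighted log-likelihood
}
fit <- optim(c(0, 0), negwll, method = "BFGS")
fit$par                         # weighted estimates of (intercept, slope)

The point estimates should agree with glm(y ~ x, family = binomial(), weights = w), which warns about non-integer successes but solves the same weighted score equations (family = quasibinomial() gives the same estimates without the warning).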
On Sat, Aug 29, 2020 at 9:02 AM John Fox <j...@mcmaster.ca> wrote:

> Dear John,
>
> On 2020-08-29 1:30 a.m., John Smith wrote:
> > Thanks Prof. Fox.
> >
> > I am curious: what is the model estimated below?
>
> Nonsense, as Peter explained in a subsequent response to your prior
> posting.
>
> > I guess my inquiry is more complicated than I thought: with y being
> > 0/1, how does one fit a weighted logistic regression with weights < 1,
> > in the sense of weighted least squares? Thanks
>
> What sense would that make? WLS is meant to account for non-constant
> error variance in a linear model, but in a binomial GLM the variance is
> purely a function of the mean.
>
> If you had binomial (rather than binary 0/1) observations (i.e.,
> binomial trials exceeding 1), then you could account for
> overdispersion, e.g., by introducing a dispersion parameter via the
> quasibinomial family; but that isn't equivalent to variance weights in
> a LM, rather to the error-variance parameter in a LM.
>
> I guess the question is: what are you trying to achieve with the
> weights?
>
> Best,
>  John
>
> >> On Aug 28, 2020, at 10:51 PM, John Fox <j...@mcmaster.ca> wrote:
> >>
> >> Dear John,
> >>
> >> I think that you misunderstand the use of the weights argument to
> >> glm() for a binomial GLM. From ?glm: "For a binomial GLM prior
> >> weights are used to give the number of trials when the response is
> >> the proportion of successes." That is, in this case y should be the
> >> observed proportion of successes (i.e., between 0 and 1) and the
> >> weights are integers giving the number of trials for each binomial
> >> observation.
> >>
> >> I hope this helps,
> >> John
> >>
> >> John Fox, Professor Emeritus
> >> McMaster University
> >> Hamilton, Ontario, Canada
> >> web: https://socialsciences.mcmaster.ca/jfox/
> >>
> >>> On 2020-08-28 9:28 p.m., John Smith wrote:
> >>> If the weights are < 1, then we get different values! See the
> >>> example below. How should I interpret the logLik value then?
> >>>
> >>> set.seed(135)
> >>> y <- c(rep(0, 50), rep(1, 50))
> >>> x <- rnorm(100)
> >>> data <- data.frame(cbind(x, y))
> >>> weights <- c(rep(1, 50), rep(2, 50))
> >>> fit <- glm(y ~ x, data, family = binomial(), weights = weights/10)
> >>> res.dev <- residuals(fit, type = "deviance")
> >>> res2 <- -0.5 * res.dev^2
> >>> cat("log-likelihood value", logLik(fit), sum(res2), "\n")
> >>>
> >>>> On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard
> >>>> <pda...@gmail.com> wrote:
> >>>> If you don't worry too much about an additive constant, then half
> >>>> the negative squared deviance residuals should do. (Not quite sure
> >>>> how weights factor in. Looks like they are accounted for.)
> >>>>
> >>>> -pd
> >>>>
> >>>>> On 25 Aug 2020, at 17:33, John Smith <jsw...@gmail.com> wrote:
> >>>>>
> >>>>> Dear R-help,
> >>>>>
> >>>>> The function logLik can be used to obtain the maximum
> >>>>> log-likelihood value from a glm object. This is an aggregated
> >>>>> value, a summation of individual log-likelihood values. How do I
> >>>>> obtain the individual values? In the following example, I would
> >>>>> expect 9 numbers, since the response has length 9. I could write
> >>>>> a function to compute the values, but there are many family
> >>>>> members in glm, and I am trying not to reinvent wheels. Thanks!
> >>>>>
> >>>>> counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
> >>>>> outcome <- gl(3, 1, 9)
> >>>>> treatment <- gl(3, 3)
> >>>>> data.frame(treatment, outcome, counts)  # showing the data
> >>>>> glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
> >>>>> (ll <- logLik(glm.D93))
> >>>>
> >>>> --
> >>>> Peter Dalgaard, Professor,
> >>>> Center for Statistics, Copenhagen Business School
> >>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> >>>> Phone: (+45)38153501
> >>>> Office: A 4.23
> >>>> Email: pd....@cbs.dk  Priv: pda...@gmail.com
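PS: For the Poisson example that started the thread, Peter's suggestion can be checked directly. A minimal sketch, continuing the code above (the variable names are mine): dpois() gives the per-observation log-likelihood contributions, and they differ from half the negative squared deviance residuals only by the additive saturated-model terms.

## Per-observation log-likelihood contributions for the Poisson example.
ll.i <- dpois(counts, lambda = fitted(glm.D93), log = TRUE)
sum(ll.i)                                     # equals logLik(glm.D93)

## Peter's identity: -0.5 * squared deviance residuals recover the same
## values once the saturated-model contributions are added back.
res.dev <- residuals(glm.D93, type = "deviance")
ll.sat <- dpois(counts, lambda = counts, log = TRUE)  # saturated fit
all.equal(ll.i, -0.5 * res.dev^2 + ll.sat)            # TRUE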