Gerard
"Lam, Tzeng Yih"
<[EMAIL PROTECTED]
gonstate.edu> To
Sent by: "Prof Brian Ripley"
[EMAIL PROTECTED] <[EMAIL PROTECTED]>
project.org cc
r-help@r-project.org
Subject
11/12/2008 15:20 Re: [R] Validity of GLM using
Gaussian family with sqrt link
Dear Prof. Ripley,
Thank you for your quick response.
(A)
link-sqrt is a name and not accepted. link="sqrt" is a literal character
string, and is.
I am not entirely sure whether I understand that statement but this is what
I found out. If I specify family=gaussian(link=sqrt), the glm() fails to
run because it is not a default link (so, I understand this part).
Following Venables and Ripley (2002):
summary(glm(cnt~herbc+herbht,data=sotr,family=gaussian(link="sqrt"),start=c(0.1,-0.004,0.01)))
Call:
glm(formula = cnt ~ herbc + herbht, family = gaussian(link = "sqrt"),
data = sotr, start = c(0.1, -0.004, 0.01))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.462211 0.043475 10.632 < 2e-16 ***
herbc -0.003315 0.001661 -1.996 0.0461 *
herbht 0.010241 0.001291 7.935 4.86e-15 ***
AIC: 3235.0
summary(glm(cnt~herbc+herbht,data=sotr,family=quasi(link=power(0.5),variance=constant),start=c(0.1,-0.004,0.01)))
Call:
glm(formula = cnt ~ herbc + herbht, family = quasi(link = power(0.5)),
data = sotr, start = c(0.1, -0.004, 0.01))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.462211 0.043475 10.632 < 2e-16 ***
herbc -0.003315 0.001661 -1.996 0.0461 *
herbht 0.010241 0.001291 7.935 4.86e-15 ***
AIC: NA
Notice that the parameter estimates and corresponding standard errors are
identical. So, my interpretation is that family=gaussian(link="sqrt") is
identical as specify family=quasi(link=power(0.5)) in glm(). The exception
is that AIC (and thus maximized log-likelihood values) can be computed for
family=gaussian(link="sqrt").
The questions are:
(A.1) Is this interpretation correct?
(A.2) If (A.1) is true, does family=gaussian(link="sqrt") implies that I am
doing a Generalized Linear Model with normal distribution and the link
function is: sqrt(mu) = b0+b1(herbc)+b2(herbht)?
(B)
In less technical terms, in model 1 you compute the likelihood from
probabilities
and in model 2 from probability densities, and the latter depend on the
units of measurement.
Yes, you are correct and I understand it now. Although not as common these
days, some small mammal studies still use sqrt transformation of count as
response variable and carry out a linear model fitting with predictors (via
least squares). So, the exercise that I got into is to compare performances
of linear model with sqrt transformation of count and GLM with Poisson.
However, knowing that we can't compare logLik or AIC based on different
measures of responses. So, I thought that comparison under GLM framework
might be an approach closer to the intention.
Thank again for your quick respond and advices. I appreciate it very much.
Best regards,
TzengYih Lam
-----------------------------------
Ph.D. student
College of Forestry
Oregon State University
-----Original Message-----
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
Sent: Wed 12/10/2008 11:45 PM
To: Lam, Tzeng Yih
Cc: r-help@r-project.org
Subject: Re: [R] Validity of GLM using Gaussian family with sqrt link
a) There is a difference between link=sqrt and link="sqrt".
link: a specification for the model link function. This can be a
name/expression, a literal character string, a length-one
character vector or an object of class '"link-glm"' (such as
generated by 'make.link') provided it is not specified _via_
one of the standard names given next.
link-sqrt is a name and not accepted. link="sqrt" is a literal character
string, and is.
b) Your first model is a model for integer observations, the second for
continuous observations. As such, the log-likleihoods are computed with
respect to different reference measures and are not comparable. In less
technical terms, in model 1 you compute the likelihood from probabilities
and in model 2 from probability densities, and the latter depend on the
units of measurement.
On Wed, 10 Dec 2008, Lam, Tzeng Yih wrote:
Dear all,
I have the following dataset: each row corresponds to count of forest
floor small mammal captured in a plot and vegetation characteristics
measured at that plot
sotr
plot cnt herbc herbht
1 1A1 0 37.08 53.54
2 1A3 1 36.27 26.67
3 1A5 0 32.50 30.62
4 1A7 0 56.54 45.63
5 1B2 0 41.66 38.13
6 1B4 0 32.08 37.79
7 1B6 0 33.71 30.62
...
I am interested in comparing fit of different specification of
Generalized Linear Models (although there are some issues with using AIC
or BIC for comparison, but this is the question that I like to post
here). Here are two of the several models that I am interested in:
(1) Poission log-linear model
pois<-glm(cnt~herbc+herbht,family=poisson,data=sotr)
summary(pois)
Call:
glm(formula = cnt ~ herbc + herbht, family = poisson, data = sotr)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.341254 0.089969 -14.908 <2e-16 ***
herbc -0.007303 0.003469 -2.105 0.0353 *
herbht 0.024064 0.002659 9.051 <2e-16 ***
---
Null deviance: 1699.0 on 1180 degrees of freedom
Residual deviance: 1569.8 on 1178 degrees of freedom
AIC: 2311.4
(2) Gaussian with sqrt link model
gaus.sqrt<-glm(cnt~herbc+herbht,family=gaussian(link="sqrt"),data=sotr,start=c(0.1,-0.004,0.01))
summary(gaus.sqrt)
Call:
glm(formula = cnt ~ herbc + herbht, family = gaussian(link = "sqrt"),
data = sotr, start = c(0.1, -0.004, 0.01))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.462211 0.043475 10.632 < 2e-16 ***
herbc -0.003315 0.001661 -1.996 0.0461 *
herbht 0.010241 0.001291 7.935 4.86e-15 ***
---
Null deviance: 1144.6 on 1180 degrees of freedom
Residual deviance: 1062.9 on 1178 degrees of freedom
AIC: 3235.0
logLik(gaus.sqrt)
'log Lik.' -1613.524 (df=4)
From the glm() help file that I read, family=gaussian() accepts the
links "identity", "log" and "inverse". There is no mentioning of gaussian()
accepting "sqrt" link. Although "sqrt" link is available for
family=poisson()
A. Therefore, is the code in (2) actually computing Maximum Likelihood
Estimates (MLE) of the coefficients using Gaussian family with "sqrt"
link or is it computing MLE of something else?
B. If the code in (2) is computing the MLE with gaussian(link="sqrt"),
then will the maximized value of log-likelihood function using logLik()
be valid (other than the issue that the dispersion parameter is counted
as a parameter in aic() within glm())?
Thank you in advance and I appreciate it very much for any advices that
are offered.
Best regards,
TzengYih Lam
TzengYih Lam, PhD Student
College of Forestry
Oregon State University
[[alternative HTML version deleted]]
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
[[alternative HTML version deleted]]
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
**********************************************************************************
The information transmitted is intended only for the person or entity to which
it is addressed and may contain confidential and/or privileged material. Any
review, retransmission, dissemination or other use of, or taking of any action
in reliance upon, this information by persons or entities other than the
intended recipient is prohibited. If you received this in error, please contact
the sender and delete the material from any computer. It is the policy of the
Department of Justice, Equality and Law Reform and the Agencies and Offices
using its IT services to disallow the sending of offensive material.
Should you consider that the material contained in this message is offensive
you should contact the sender immediately and also mailminder[at]justice.ie.
Is le haghaidh an duine nó an eintitis ar a bhfuil sí dírithe, agus le haghaidh
an duine nó an eintitis sin amháin, a bheartaítear an fhaisnéis a tarchuireadh
agus féadfaidh sé go bhfuil ábhar faoi rún agus/nó faoi phribhléid inti.
Toirmisctear aon athbhreithniú, atarchur nó leathadh a dhéanamh ar an
bhfaisnéis seo, aon úsáid eile a bhaint aisti nó aon ghníomh a dhéanamh ar a
hiontaoibh, ag daoine nó ag eintitis seachas an faighteoir beartaithe. Má fuair
tú é seo trí dhearmad, téigh i dteagmháil leis an seoltóir, le do thoil, agus
scrios an t-ábhar as aon ríomhaire. Is é beartas na Roinne Dlí agus Cirt,
Comhionannais agus Athchóirithe Dlí, agus na nOifígí agus na nGníomhaireachtaí
a úsáideann seirbhísí TF na Roinne, seoladh ábhair cholúil a dhícheadú.
Más rud é go measann tú gur ábhar colúil atá san ábhar atá sa teachtaireacht
seo is ceart duit dul i dteagmháil leis an seoltóir láithreach agus le
mailminder[ag]justice.ie chomh maith.
***********************************************************************************