Re: [R] vglm(), t values and p values

Federico Calboli Wed, 04 Nov 2009 10:19:41 -0800

On 4 Nov 2009, at 18:11, Gavin Simpson wrote:

Is there a particular reason for choosing a VGLM here? My reading of
your post suggests the response is an univariate, ordered factor and

VGLMs are especially for multivariate responses. In which case, canyou

not use polr() in package MASS that comes with R or the lms() function
in the rms package (available from CRAN).


I have used polr() but:

1) seems to be even more reluctant to give me at least the t value formy marker: for the very same data I used to understand what was goingon vglm does give a t value for x (my marker), polr does not. As muchas my data is *dire* I still need to use it and get as many p-valuesas I can (ideally one for each marker). The goodness of the model isgoing to be assessed, perversely, on the whole sheabang of p-valuesand their aderence to a null distribution.

Additionally, while extracting the t value is a piece of cake withpolr(), the p-value I get a nowhere close to a null distribution.


I will try lms() and hope for the best.


I haven't really used either of these functions in earnest, but one or
both may provide the p-values you desire, out of the box.


I hope so!

Thanks,

F

HTH

G
My response variable is the severity of diseases, going from 0 to 5(the
severity is actually an ordered factor).
The independent variables are: 1 genetic marker, time of medicalobservation,age, sex. What I *need* is a p-value for the genetic marker.Because I have ~1.5
million markers I'd rather not faffing around too much.

My model is:
mod.vglm = vglm(disease.status ~ x + time + age + sex, family =
cumulative(par = T))
where x is my genetic marker, coded as 0/1/2, time is days ofmedical observation.
summary(mod.vglm) works:
Call:
vglm(formula = disease.status ~ x + time + age + sex, family =cumulative(par = T))
Pearson Residuals:
                   Min       1Q   Median       3Q     Max
logit(P[Y<=1]) -0.6642 -0.28704 -0.18329 -0.11681  3.8919
logit(P[Y<=2]) -2.5580 -0.48080 -0.23315  0.47388  2.5983
logit(P[Y<=3]) -2.1565 -0.56961  0.22089  0.44349 10.7964
logit(P[Y<=4]) -3.3175  0.13064  0.20117  0.43176 12.5233

Coefficients:
                    Value Std. Error  t value
(Intercept):1 -2.4460e+00 4.2791e-01  -5.7162
(Intercept):2 -7.1078e-01 4.1628e-01  -1.7074
(Intercept):3  3.7619e-01 4.1545e-01   0.9055
(Intercept):4  1.7467e+00 4.2092e-01   4.1496
x              4.1421e-01 1.9762e-01   2.0959
time          -3.6021e-04 3.0387e-05 -11.8540
age           -2.6115e-05 9.2504e-06  -2.8232
sexM           1.0188e-01 1.2491e-01   0.8156

Number of linear predictors:  4

Names of linear predictors:
logit(P[Y<=1]), logit(P[Y<=2]), logit(P[Y<=3]), logit(P[Y<=4])

Dispersion Parameter for cumulative family:   1

Residual Deviance: 2475.937 on 3460 degrees of freedom

Log-likelihood: -1237.969 on 3460 degrees of freedom

#######################

So here are my questions:
1) I need to get the t value for x, so I can use "1 - pt(tvalue,1)"to find somesort of probability value for x. That's not trivial. Additionally,I assume dffor x is 1, hence I plan to use "1 - pt(tvalue,1)", though I mightwell be
wrong. In any case getting the darned t value seems impossible
2) because of the difficulty of getting (1), it there a way ofgetting vglm() to
spit out a p-value for x please?
I do recon many people might scoff at my crass desire for a p-value, but I'mdealing with some dire phenotype in a whole genome analysis wherethe *only*thing that matters are p-values. I have to be quite unsophysticatedI'm afraid.
Best,

Federico
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] vglm(), t values and p values

Reply via email to