On 4 Nov 2009, at 18:11, Gavin Simpson wrote:
Is there a particular reason for choosing a VGLM here? My reading of
your post suggests the response is an univariate, ordered factor and
VGLMs are especially for multivariate responses. In which case, can
you
not use polr() in package MASS that comes with R or the lms() function
in the rms package (available from CRAN).
I have used polr() but:
1) seems to be even more reluctant to give me at least the t value for
my marker: for the very same data I used to understand what was going
on vglm does give a t value for x (my marker), polr does not. As much
as my data is *dire* I still need to use it and get as many p-values
as I can (ideally one for each marker). The goodness of the model is
going to be assessed, perversely, on the whole sheabang of p-values
and their aderence to a null distribution.
Additionally, while extracting the t value is a piece of cake with
polr(), the p-value I get a nowhere close to a null distribution.
I will try lms() and hope for the best.
I haven't really used either of these functions in earnest, but one or
both may provide the p-values you desire, out of the box.
I hope so!
Thanks,
F
HTH
G
My response variable is the severity of diseases, going from 0 to 5
(the
severity is actually an ordered factor).
The independent variables are: 1 genetic marker, time of medical
observation,
age, sex. What I *need* is a p-value for the genetic marker.
Because I have ~1.5
million markers I'd rather not faffing around too much.
My model is:
mod.vglm = vglm(disease.status ~ x + time + age + sex, family =
cumulative(par = T))
where x is my genetic marker, coded as 0/1/2, time is days of
medical observation.
summary(mod.vglm) works:
Call:
vglm(formula = disease.status ~ x + time + age + sex, family =
cumulative(par = T))
Pearson Residuals:
Min 1Q Median 3Q Max
logit(P[Y<=1]) -0.6642 -0.28704 -0.18329 -0.11681 3.8919
logit(P[Y<=2]) -2.5580 -0.48080 -0.23315 0.47388 2.5983
logit(P[Y<=3]) -2.1565 -0.56961 0.22089 0.44349 10.7964
logit(P[Y<=4]) -3.3175 0.13064 0.20117 0.43176 12.5233
Coefficients:
Value Std. Error t value
(Intercept):1 -2.4460e+00 4.2791e-01 -5.7162
(Intercept):2 -7.1078e-01 4.1628e-01 -1.7074
(Intercept):3 3.7619e-01 4.1545e-01 0.9055
(Intercept):4 1.7467e+00 4.2092e-01 4.1496
x 4.1421e-01 1.9762e-01 2.0959
time -3.6021e-04 3.0387e-05 -11.8540
age -2.6115e-05 9.2504e-06 -2.8232
sexM 1.0188e-01 1.2491e-01 0.8156
Number of linear predictors: 4
Names of linear predictors:
logit(P[Y<=1]), logit(P[Y<=2]), logit(P[Y<=3]), logit(P[Y<=4])
Dispersion Parameter for cumulative family: 1
Residual Deviance: 2475.937 on 3460 degrees of freedom
Log-likelihood: -1237.969 on 3460 degrees of freedom
#######################
So here are my questions:
1) I need to get the t value for x, so I can use "1 - pt(tvalue,1)"
to find some
sort of probability value for x. That's not trivial. Additionally,
I assume df
for x is 1, hence I plan to use "1 - pt(tvalue,1)", though I might
well be
wrong. In any case getting the darned t value seems impossible
2) because of the difficulty of getting (1), it there a way of
getting vglm() to
spit out a p-value for x please?
I do recon many people might scoff at my crass desire for a p-
value, but I'm
dealing with some dire phenotype in a whole genome analysis where
the *only*
thing that matters are p-values. I have to be quite unsophysticated
I'm afraid.
Best,
Federico
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson [t] +44 (0)20 7679 0522
ECRC, UCL Geography, [f] +44 (0)20 7679 0565
Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG
Tel +44 (0)20 75941602 Fax +44 (0)20 75943193
f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.