Re: [R] vglm(), t values and p values

Federico Calboli Wed, 04 Nov 2009 10:45:03 -0800

On 4 Nov 2009, at 18:40, Gavin Simpson wrote:

Additionally, while extracting the t value is a piece of cake with
polr(), the p-value I get a nowhere close to a null distribution.
Yes - I see that polr() also doesn't produce p-values in the outputfrom
summary. You can use it to get "a" p-value for x if you use a LRT; fit
the model using polr() with and without 'x' and then supply bothfitted
models to the anova() method for polr objects.

I did think of that solution, but it's the last thing on my cards...I'm trying to analyse 1.5 million markers, and fitting two models at atime for every marker starts to be a bit on the heavy side. If allelse fails that's the way to go.


I will try lms() and hope for the best.


Sorry, I meant the lrm() function in package rms


Ta,

F

I haven't really used either of these functions in earnest, butone or
both may provide the p-values you desire, out of the box.


I hope so!

Thanks,

F


HTH

G


My response variable is the severity of diseases, going from 0 to 5
(the
severity is actually an ordered factor).

The independent variables are: 1 genetic marker, time of medical
observation,
age, sex. What I *need* is a p-value for the genetic marker.
Because I have ~1.5
million markers I'd rather not faffing around too much.

My model is:

mod.vglm = vglm(disease.status ~ x + time + age + sex, family =

cumulative(par = T))

where x is my genetic marker, coded as 0/1/2, time is days of
medical observation.

summary(mod.vglm) works:


Call:
vglm(formula = disease.status ~ x + time + age + sex, family =
cumulative(par = T))

Pearson Residuals:
                  Min       1Q   Median       3Q     Max
logit(P[Y<=1]) -0.6642 -0.28704 -0.18329 -0.11681  3.8919
logit(P[Y<=2]) -2.5580 -0.48080 -0.23315  0.47388  2.5983
logit(P[Y<=3]) -2.1565 -0.56961  0.22089  0.44349 10.7964
logit(P[Y<=4]) -3.3175  0.13064  0.20117  0.43176 12.5233

Coefficients:
                   Value Std. Error  t value
(Intercept):1 -2.4460e+00 4.2791e-01  -5.7162
(Intercept):2 -7.1078e-01 4.1628e-01  -1.7074
(Intercept):3  3.7619e-01 4.1545e-01   0.9055
(Intercept):4  1.7467e+00 4.2092e-01   4.1496
x              4.1421e-01 1.9762e-01   2.0959
time          -3.6021e-04 3.0387e-05 -11.8540
age           -2.6115e-05 9.2504e-06  -2.8232
sexM           1.0188e-01 1.2491e-01   0.8156

Number of linear predictors:  4

Names of linear predictors:
logit(P[Y<=1]), logit(P[Y<=2]), logit(P[Y<=3]), logit(P[Y<=4])

Dispersion Parameter for cumulative family:   1

Residual Deviance: 2475.937 on 3460 degrees of freedom

Log-likelihood: -1237.969 on 3460 degrees of freedom

#######################

So here are my questions:

1) I need to get the t value for x, so I can use "1 - pt(tvalue,1)"
to find some
sort of probability value for x. That's not trivial. Additionally,
I assume df
for x is 1, hence I plan to use  "1 - pt(tvalue,1)", though I might
well be
wrong. In any case getting the darned t value seems impossible

2) because of the difficulty of getting (1), it there a way of
getting vglm() to
spit out a p-value for x please?

I do recon many people might scoff at my crass desire for a p-
value, but I'm
dealing with some dire phenotype in a whole genome analysis where
the *only*
thing that matters are p-values. I have to be quite unsophysticated
I'm afraid.

Best,

Federico

--

%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk

%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] vglm(), t values and p values

Reply via email to