Re: [R] "se.fit" option to the predict.nls() function

2015-01-05 Thread Rune Haubo
On 5 January 2015 at 21:08, Ben Bolker  wrote:
> Roger Coppock  cox.net> writes:
>
>>
>> When will "R" implement the "se.fit" option to the
>>  predict.nls() function?  Is there some schedule?
>>
>
>   I think this is unlikely to happen, ever (sorry).  The exact method
> for finding confidence intervals on nonlinear fits would be
> to compute likelihood profiles for each prediction, which would
> be rather tedious.

I understand profile likelihoods for parameters, but what do you mean
by a profile likelihood for yet unobserved observations, i.e.
predictions?

>
>   Another reasonable approach would be to use bootstrapping (see
> linked r-help thread below).
>
>   An approximate approach would be to use the delta method.
>
>   The nlstools package might be useful.

Alternatively, there is the propagate package: it provides a function,
predictNLS, that computes uncertainty measures for nls predictions
using first- and second-order Taylor approximations as well as
simulation methods.
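
A minimal sketch of that route; the simulated data and the exact
predictNLS() arguments are assumptions here, not taken from the post:

library(propagate)
set.seed(1)
dat <- data.frame(x = 1:20)
dat$y <- 2 * exp(0.1 * dat$x) + rnorm(20, sd = 0.3)
fit <- nls(y ~ b0 * exp(b1 * x), data = dat, start = list(b0 = 1, b1 = 0.1))
## Taylor- and simulation-based uncertainty for predictions at new x-values:
predictNLS(fit, newdata = data.frame(x = c(5, 10)))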

I think the appropriateness of a simple (first-order) Taylor/delta
method depends on the application. I can think of two important
aspects: (1) if the model function is close to linear, you might be
ok; (2) if you are interested in a prediction-type (rather than
confidence) interval and the residual spread dominates the
uncertainty, any inaccuracies in the model-function uncertainty (where
you apply the approximation) are swamped by the residual spread anyway.
In a recent application on shelf-life estimation that I worked on,
both of these aspects were applicable and a simple approximation was
fine.
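
For comparison, a hand-rolled first-order delta-method sketch, reusing
'fit' from the example above; the analytic gradient is specific to
that assumed exponential model:

x0 <- 5
b  <- coef(fit)
g  <- unname(c(exp(b["b1"] * x0), b["b0"] * x0 * exp(b["b1"] * x0)))  # df/db0, df/db1 at x0
se_fit <- sqrt(drop(t(g) %*% vcov(fit) %*% g))  # first-order SE of the fitted value
c(fit = unname(b["b0"] * exp(b["b1"] * x0)), se.fit = se_fit)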

Cheers,
Rune

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Regression Overdispersion?

2015-02-01 Thread Rune Haubo
A third, and often preferable, way is to add an observation-level random effect:

library(lme4)
data1$obs <- factor(seq_len(nrow(data1)))
model <- glmer(y ~ x1 + x2 + (1 | obs), family=poisson(link=log), data=data1)

See http://glmm.wikidot.com/faq and search for "individual-level
random effects".

Cheers,
Rune

On 1 February 2015 at 19:55, David Barron  wrote:
> There are two straightforward ways of modelling overdispersion:
>
> 1) Use glm as in your example but specify family=quasipoisson.
> 2) Use glm.nb in the MASS package, which fits a negative binomial model.
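
A minimal sketch of both of those options (the names y, x1, x2 and
data1 follow the original post and are assumptions here):

fit_qp <- glm(y ~ x1 + x2, family = quasipoisson(link = log), data = data1)
summary(fit_qp)$dispersion  # estimated dispersion; values much larger than 1 suggest overdispersion
library(MASS)
fit_nb <- glm.nb(y ~ x1 + x2, data = data1)
summary(fit_nb)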
>
>
>
> On 1 February 2015 at 16:26, JvanDyne  wrote:
>> I am trying to use Poisson regression to model count data with four
>> explanatory variables: ratio, ordinal, nominal and dichotomous – x1, x2, x3
>> and x4. After playing around with the input for a bit, I have formed – what
>> I believe is – a series of badly fitting models probably due to
>> overdispersion [1] - e.g. model=glm(y ~ x1 +
>> x2,family=poisson(link=log),data=data1) - and I was looking for some general
>> guidance/direction/help/approach to correcting this in R.
>>
>> [1] – I believe this as a. it’s, as I’m sure you’re aware, a possible reason
>> for poor model fits; b.the following:
>>
>> tapply(data1$y,data$x2,function(x)c(mean=mean(x),variance=var(x)))
>>
>> seems to suggest that, whilst variance does appear to be some function of
>> the mean, there is a consistently large difference between the two
>>
>>
>>
>>
>>
>> --
>> View this message in context: 
>> http://r.789695.n4.nabble.com/Regression-Overdispersion-tp4702611.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Checking the proportional odds assumption holds in an ordinal logistic regression using polr function

2014-11-26 Thread Rune Haubo
Dear Charlie,

I admit that I haven't read your email closely, but here is a way to
test for non-proportional odds using the ordinal package (warning:
self-promotion) and the wine data set from the same package. There is
more information in the package vignettes.

Hope this is something you can use.
Cheers,
Rune

> library(ordinal)
> ## Fit model:
> fm <- clm(rating ~ temp + contact, data=wine)
> summary(fm)
formula: rating ~ temp + contact
data:    wine

 link  threshold nobs logLik AIC    niter max.grad cond.H 
 logit flexible  72   -86.49 184.98 6(0)  4.64e-15 2.7e+01

Coefficients:
           Estimate Std. Error z value Pr(>|z|)    
tempwarm     2.5031     0.5287   4.735 2.19e-06 ***
contactyes   1.5278     0.4766   3.205  0.00135 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Threshold coefficients:
    Estimate Std. Error z value
1|2  -1.3444     0.5171  -2.600
2|3   1.2508     0.4379   2.857
3|4   3.4669     0.5978   5.800
4|5   5.0064     0.7309   6.850
> ## Model with non-proportional odds for contact:
> fm2 <- clm(rating ~ temp, nominal=~contact, data=wine)
> ## Likelihood ratio test of non-proportional odds:
> anova(fm, fm2)
Likelihood ratio tests of cumulative link models:

    formula:                nominal:  link: threshold:
fm  rating ~ temp + contact ~1        logit flexible  
fm2 rating ~ temp           ~contact  logit flexible  

    no.par    AIC  logLik LR.stat df Pr(>Chisq)
fm       6 184.98 -86.492                      
fm2      9 190.42 -86.209  0.5667  3      0.904
> ## Automatic tests of non-proportional odds for all variables:
> nominal_test(fm)
Tests of nominal effects

formula: rating ~ temp + contact
        Df  logLik    AIC    LRT Pr(>Chi)
<none>     -86.492 184.98                
temp     3 -84.904 187.81 3.1750   0.3654
contact  3 -86.209 190.42 0.5667   0.9040

On 25 November 2014 at 17:21, Charlotte Whitham
 wrote:
> Dear list,
>
> I have used the ‘polr’ function in the MASS package to run an ordinal 
> logistic regression for an ordinal categorical response variable with 15 
> continuous explanatory variables.
> I have used the code (shown below) to check that my model meets the 
> proportional odds assumption following advice provided at 
> (http://www.ats.ucla.edu/stat/r/dae/ologit.htm) – which has been extremely 
> helpful, thank you to the authors! However, I’m a little worried about the 
> output implying that not only are the coefficients across various cutpoints 
> similar, but they are exactly the same (see graphic below).
>
> Here is the code I used (and see attached for the output graphic)
>
> FGV1b<-data.frame(FG1_val_cat=factor(FGV1b[,"FG1_val_cat"]),scale(FGV1[,c("X","Y","Slope","Ele","Aspect","Prox_to_for_FG","Prox_to_for_mL","Prox_to_nat_border","Prox_to_village","Prox_to_roads","Prox_to_rivers","Prox_to_waterFG","Prox_to_watermL","Prox_to_core","Prox_to_NR","PCA1","PCA2","PCA3")]))
>
> b<-polr(FGV1b$FG1_val_cat ~ FGV1b$X + FGV1b$Y + FGV1b$Slope + FGV1b$Ele + 
> FGV1b$Aspect + FGV1b$Prox_to_for_FG + FGV1b$Prox_to_for_mL + 
> FGV1b$Prox_to_nat_border + FGV1b$Prox_to_village + FGV1b$Prox_to_roads + 
> FGV1b$Prox_to_rivers + FGV1b$Prox_to_waterFG + FGV1b$Prox_to_watermL + 
> FGV1b$Prox_to_core + FGV1b$Prox_to_NR, data = FGV1b, Hess=TRUE)
>
> #Checking the assumption. So the following code will estimate the values to 
> be graphed. First it shows us #the logit transformations of the probabilities 
> of being greater than or equal to each value of the target #variable
>
> FGV1b$FG1_val_cat<-as.numeric(FGV1b$FG1_val_cat)
>
> sf <- function(y) {
>
>   c('VC>=1' = qlogis(mean(FGV1b$FG1_val_cat >= 1)),
>
> 'VC>=2' = qlogis(mean(FGV1b$FG1_val_cat >= 2)),
>
> 'VC>=3' = qlogis(mean(FGV1b$FG1_val_cat >= 3)),
>
> 'VC>=4' = qlogis(mean(FGV1b$FG1_val_cat >= 4)),
>
> 'VC>=5' = qlogis(mean(FGV1b$FG1_val_cat >= 5)),
>
> 'VC>=6' = qlogis(mean(FGV1b$FG1_val_cat >= 6)),
>
> 'VC>=7' = qlogis(mean(FGV1b$FG1_val_cat >= 7)),
>
> 'VC>=8' = qlogis(mean(FGV1b$FG1_val_cat >= 8)))
>
> }
>
>   (t <- with(FGV1b, summary(as.numeric(FGV1b$FG1_val_cat) ~ FGV1b$X + FGV1b$Y 
> + FGV1b$Slope + FGV1b$Ele + FGV1b$Aspect + FGV1b$Prox_to_for_FG + 
> FGV1b$Prox_to_for_mL + FGV1b$Prox_to_nat_border + FGV1b$Prox_to_village + 
> FGV1b$Prox_to_roads + FGV1b$Prox_to_rivers + FGV1b$Prox_to_waterFG + 
> FGV1b$Prox_to_watermL + FGV1b$Prox_to_core + FGV1b$Prox_to_NR, fun=sf)))
>
>
>
> #The table displays the (linear) predicted values we would get if we 
> regressed our
>
> #dependent variable on our predictor variables one at a time, without the 
> parallel slopes
>
> #assumption. So now, we can run a series of binary logistic regressions with 
> varying cutpoints
>
> #on the dependent variable to check the equality of coefficients across 
> cutpoints
>
> par(mfrow=c(1,1))
>
> plot(t, which=1:8, pch=1:8, xlab='logit', main=' ', xlim=range(s[,7:8]))
>
>
>
> Apologies that I am no statistics expert and perhaps I am missing something 
> obvious here. However, I have spent a long time trying to figure out i

Re: [R] Checking the proportional odds assumption holds in an ordinal logistic regression using polr function

2014-11-26 Thread Rune Haubo
On 26 November 2014 at 17:55, Charlotte Whitham
 wrote:
> Dear Rune,
>
> Thank you for your prompt reply and it looks like the ordinal package could 
> be the answer I was looking for!
>
> If you don't mind, I'd also like to know please what to do if the tests show 
> the proportional odds assumption is NOT met. (Unfortunately I notice effects 
> from almost all variables that breach the proportional odds assumption in my 
> dataset)

That depends almost entirely on the purpose of the analysis and is not
a topic fit for email - consulting a local statistician is probably
sound advice... Yet: with enough data these tests can be sensitive
beyond practical significance; if the 'proportional' part of the
effect explains the majority of the deviance, perhaps the proportional
odds model provides a reasonably good description of the main
structures in the data anyway. On the other hand, if the magnitude
(not significance!) of the non-proportional effects is large, perhaps
a cumulative link model is not the right kind of model structure and
you should be looking at alternative approaches in your analysis.
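
One hedged way to gauge this, sketched here on the wine data from the
ordinal package rather than on Charlie's data (the 'share of deviance'
ratio below is only an illustration of the idea, not an established test):

library(ordinal)
fm0 <- clm(rating ~ 1, data = wine)                       # null model
fm1 <- clm(rating ~ contact, data = wine)                 # proportional effect of contact
fm2 <- clm(rating ~ 1, nominal = ~ contact, data = wine)  # non-proportional effect of contact
## fraction of the deviance reduction captured by the proportional part:
(as.numeric(logLik(fm1)) - as.numeric(logLik(fm0))) /
  (as.numeric(logLik(fm2)) - as.numeric(logLik(fm0)))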

Cheers,
Rune

>
> Would you recommend a multinomial logistic model? Or re-scaling of the data?
>
> Thank you for your time,
> Best wishes,
>
> Charlie
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] proportional odds logistic regression with non-negative constraint for several coefficients

2017-01-28 Thread Rune Haubo
Hi Zhao,

This is not a direct answer to your question, but a suggestion for a
different approach. The ordinal package was designed to cope with
issues like this (parameter constraints in ordinal regression models)
- try the following:

> library(ordinal)
> data(wine, package="ordinal")
> ## Fit model for reference:
> fm1 <- clm(rating ~ temp + contact, data=wine)
> # summary(fm1)
> coef(fm1)
       1|2        2|3        3|4        4|5   tempwarm contactyes 
 -1.344383   1.250809   3.466887   5.006404   2.503102   1.527798 
>
> ## Now fit the same model by manually optimizing a clm-environment:
> env <- clm(rating ~ temp + contact, data=wine, doFit=FALSE)
> # ls.str(env) ## view contents of env
> ## Define negative log-likelihood function:
> nll <- function(par, envir) {
+   envir$par <- par
+   envir$clm.nll(envir)
+ }
> ## optimize with nlminb:
> nlminb(start=env$par, objective=nll, envir=env)$par
[1] -1.344385  1.250812  3.466887  5.006404  2.503102  1.527798
>
> # Now optimize under parameter constraints:
> nlminb(start=env$par, objective=nll, envir=env,
+        upper = c(Inf, Inf, Inf, Inf, 2, 1),
+        lower = rep(-Inf, 6))$par
[1] -1.6124363  0.8060461  2.8052591  4.2401832  2.0000000  1.0000000
>

Cheers
Rune

On 26 January 2017 at 20:41, Liu, Zhao  wrote:
> Hi,
>
> I am  working on proportional odds logistic regression, and trying to figure 
> out how to specify the constraint for several predictors.  Those non-negative 
> constraints for some predictors are for practical purpose.
>
> I have seen some one posted passing box constraint with L-BFGS-B with 
> logistic regression.
>
> What I did not is to use polr() to solve the proportional odds, and modify 
> the source code for polr() by passing the lower bounds to the optim() and 
> change the method to L-BFGS-B.
>
> Then I realized that polr() generate a start value for all coefficients with 
> glm.fit, which can still start from negative.
>
> So my question is that does the start value having negative while the 
> optimization has a lower bound as 0.1. Does it matter?
>
> Or is there another way of implementation to solve proportional odds while 
> forcing some coefficients  as non-negative.
>
> Thanks so much!
>
> Zhao
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Likelihood ratio test between glm and glmer fits

2008-07-17 Thread Rune Haubo
+SINGLE+chronic+vigor_d+moderat_d,family=binomial(),
>>  data=mx.merge)
>> #GLMER fit
>>
>> ph.fit.3<-glmer(poorhealth~RURAL+SMOKE+DRINK+EMPLOYED+INSURANCE+highereduc+INDIG+AGE+male+divorced+SINGLE+chronic+vigor_d+moderat_d+(1|munid),family=binomial(),
>>  data=mx.merge)
>>
>> I cannot find a method in R that will do the LR test between a glm  and a
>> glmer fit, so I try to do it using the liklihoods from both  models
>>
>> #form the likelihood ratio test between the glm and glmer fits
>> x2<--2*(logLik(ph.fit.2)-logLik(ph.fit.3))
>>
>>>ML
>>
>> 79.60454
>> attr(,"nobs")
>>n
>> 45243
>> attr(,"nall")
>>n
>> 45243
>> attr(,"df")
>> [1] 14
>> attr(,"REML")
>> [1] FALSE
>> attr(,"class")
>> [1] "logLik"
>>
>> #Get the associated p-value
>> dchisq(x2,14)
>> ML
>>>
>>> 5.94849e-15
>>
>> Which looks like an improvement in model fit to me.  Am I seeing  this
>> correctly or are the two models even able to be compared? they  are both
>> estimated via maximum likelihood, so they should be, I think.
>> Any help would be appreciated.
>>
>> Corey
>>
>> Corey S. Sparks, Ph.D.
>>
>> Assistant Professor
>> Department of Demography and Organization Studies
>> University of Texas San Antonio
>> One UTSA Circle
>> San Antonio, TX 78249
>> email:[EMAIL PROTECTED]
>> web: https://rowdyspace.utsa.edu/users/ozd504/www/index.htm
>>
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
>
>
> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Rune Haubo Bojesen Christensen

Master Student, M.Sc. Eng.
Phone: (+45) 30 26 45 54
Mail: [EMAIL PROTECTED], [EMAIL PROTECTED]

DTU Informatics, Section for Statistics
Technical University of Denmark, Build.321, DK-2800 Kgs. Lyngby, Denmark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] New package: ordinal

2010-03-16 Thread Rune Haubo
This is to announce the new R-package ‘ordinal’ that implements
cumulative link (mixed) models for ordinal (ordered categorical) data
(http://www.cran.r-project.org/package=ordinal/).

The main features are:
-   scale (multiplicative) as well as location (additive) effects
-   nominal effects for a subset of the predictors (denoted partial
proportional odds when the link is the logistic)
-   structured thresholds, e.g. assuming symmetry or equidistant thresholds
-   random effects via the Laplace approximation and adaptive
Gauss-Hermite quadrature in the location-part of the model.
-   a range of standard link functions
-   flexible link functions where an extra link-function parameter
bridges the log-log, probit and cloglog links (log-gamma), and the
cloglog and logistic links (Aranda-Ordaz)
-   a suite of optimizers including an efficient Newton scheme.
-   works for binomial observations (a special case of ordinal data).
-   a suite of methods including anova, addterm, dropterm, profile,
confint, plot.profile, predict, in addition to the standard print and
summary methods.
-   an important special case is the proportional odds model (with
random effects).
-   a range of examples illustrates how to use the functions.
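
A brief sketch of a couple of the features above (a scale effect and
structured thresholds) using the package's wine data; the particular
model is chosen here for illustration only:

library(ordinal)
## location effect of temp, scale effect of contact, symmetric thresholds:
fm <- clm(rating ~ temp, scale = ~ contact, threshold = "symmetric", data = wine)
summary(fm)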

Future additions will include:
-   more general random effect structures: multiple (crossed and nested)
and vector-valued random effects.
-   profile methods for variance parameters in mixed effect models.
-   helpful package vignettes.
-   implementation of core functions in C.

Comments, critique, suggestions, wishes and contributions are always
highly appreciated.

Kind regards
Rune

-- 
Rune Haubo Bojesen Christensen

PhD student, M.Sc. Eng.
Phone: (+45) 45 25 33 63
Mail: rhbc at imm.dtu.dk

DTU Informatics, Section for Statistics
Technical University of Denmark, Build. 305, Room 122,
DK-2800 Kgs. Lyngby, Denmark

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Chi-square values in GLM model comparison

2013-09-11 Thread Rune Haubo
There is no argument 'test' to anova.clm, hence the error message.

The likelihood ratio statistic (or, alternatively, G^2 statistic or
Deviance statistic) has an asymptotic chi-square distribution, so it
is the size of that statistic your reviewers are asking for. It is
printed in the anova output under the name 'LR.stat' (1252 on 8 df in
your case it seems).
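
A hedged sketch of extracting that number programmatically, assuming
the table returned by anova() keeps the printed column names (check
names(tab) if not):

tab <- anova(m1, m2)   # m1, m2 as defined in the message quoted below
tab$LR.stat            # the chi-square (likelihood ratio) statistic
tab$`Pr(>Chisq)`       # and its p-value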

Cheers,
Rune

On 11 September 2013 18:17, Torvon  wrote:
> José,
>
> I get the following error message:
>
>> m1<-clm(sym_bin ~ phq_index, data=data2)
>> m2<-clm(sym_bin ~ 1, data=data2)
>> anova(m1,m2,test="Chisq")
>
>> Error in anova.clm(m1, m2, test = "Chisq") :
>>  only 'clm' and 'clmm' objects are allowed
>
> My dependent variable is binary, so I don't know what the problem could be.
> See below the model summaries. Thank you! Eiko
>
>> summary(m1)
> formula: sym_bin ~ phq_index
> data:data2
>
>  link  threshold nobs  logLik   AIC niter max.grad cond.H
>  logit flexible  12348 -4846.49 9710.97 7(0)  2.53e-08 1.4e+02
>
> Coefficients:
>            Estimate Std. Error z value Pr(>|z|)    
> phq_index2 -0.29705    0.11954  -2.485    0.013 *  
> phq_index3  0.63382    0.10262   6.176 6.56e-10 ***
> phq_index4  1.53022    0.09664  15.834  < 2e-16 ***
> phq_index5  0.90720    0.09996   9.075  < 2e-16 ***
> phq_index6 -0.03855    0.11337  -0.340    0.734    
> phq_index7 -0.06488    0.11394  -0.569    0.569    
> phq_index8 -1.15618    0.15156  -7.628 2.38e-14 ***
> phq_index9 -2.50064    0.25670  -9.741  < 2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Threshold coefficients:
>     Estimate Std. Error z value
> 0|1  1.87770    0.07959   23.59
>
>
>> summary(m2)
> formula: sym_bin ~ 1
> data:data2
>
>  link  threshold nobs  logLik   AIC  niter max.grad
>  logit flexible  12348 -5472.48 10946.96 5(0)  1.01e-11
>
> Threshold coefficients:
>   0|1
> 1.642
>
>
>
>
>
>
>
> On 11 September 2013 18:03, Jose Iparraguirre <
> jose.iparragui...@ageuk.org.uk> wrote:
>
>> Hi Eiko,
>>
>> How about this?
>>
>> > anova (m1, m2, test="Chisq")
>>
>> See: ?anova.glm
>>
>> Regards,
>> José
>>
>>
>> Prof. José Iparraguirre
>> Chief Economist
>> Age UK
>>
>>
>>
>> -Original Message-
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
>> On Behalf Of Torvon
>> Sent: 11 September 2013 16:48
>> To: r-help@r-project.org
>> Subject: [R] Chi-square values in GLM model comparison
>>
>> Hello --
>> I am comparing two
>> GLMs (binomial dependent variable)
>> , the results are the following:
>> > m1<-glm(symptoms ~ phq_index, data=data2) m2<-glm(symptoms ~ 1,
>> > data=data2)
>>
>> Trying to compare these models using
>> > anova (m1, m2)
>> I do not obtain chi-square values or a chi-square difference test;
>> instead, I get loglikelihood ratios:
>>
>> > Likelihood ratio tests of cumulative link models:
>> >    formula:            link: threshold:
>> > m2 sym_bin ~ 1         logit flexible
>> > m1 sym_bin ~ phq_index logit flexible
>> >    no.par   AIC  logLik LR.stat df Pr(>Chisq)
>> > m2      1 10947 -5472.5
>> > m1      9  9711 -4846.5    1252  8  < 2.2e-16 ***
>>
>> Since reviewers would like me to report chi-square values: how to I obtain
>> them when comparing GLMs? I'm looking for an output similar to the output
>> of the GLMER function in LME4, e.g.:
>>
>> > anova(m3,m4)
>> ...
>> >   Df   AIC   BIC  logLik Chisq Chi Df Pr(>Chisq)
>> > m3 13 11288 11393 -5630.9
>> > m4 21 11212 11382 -5584.9 92.02  8  < 2.2e-16 ***
>>
>> Thank you!
>>  Eiko
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>

Re: [R] Test of Parallel Regression Assumption in R

2013-03-11 Thread Rune Haubo
Dear Heather,

You can make this test using the ordinal package. Here the function
clm fits cumulative link models where the ordinal logistic regression
model is a special case (using the logit link).

Let me illustrate how to test the parallel regression assumption for a
particular variable using clm in the ordinal package. I am using the
wine dataset from the same package: I fit a model with two explanatory
variables, temp and contact, and test the parallel regression
assumption for the contact variable with a likelihood ratio test:

> library(ordinal)
Loading required package: MASS
Loading required package: ucminf
Loading required package: Matrix
Loading required package: lattice
> head(wine)
  response rating temp contact bottle judge
1       36      2 cold      no      1     1
2       48      3 cold      no      2     1
3       47      3 cold     yes      3     1
4       67      4 cold     yes      4     1
5       77      4 warm      no      5     1
6       60      4 warm      no      6     1
> fm1 <- clm(rating ~ temp + contact, data=wine)
> fm2 <- clm(rating ~ temp, nominal=~ contact, data=wine)
> anova(fm1, fm2)
Likelihood ratio tests of cumulative link models:

    formula:                nominal:  link: threshold:
fm1 rating ~ temp + contact ~1        logit flexible  
fm2 rating ~ temp           ~contact  logit flexible  

    no.par    AIC  logLik LR.stat df Pr(>Chisq)
fm1      6 184.98 -86.492                      
fm2      9 190.42 -86.209  0.5667  3      0.904

The idea is to fit the model under the null hypothesis (parallel
effects - fm1) and under the alternative hypothesis (non-parallel
effects for contact - fm2) and to compare these models with anova(),
which performs the LR test. From the high p-value we see that the null
cannot be rejected, so there is no evidence of non-parallel slopes in
this case. For additional information, I suggest that you take a look
at the following package vignette
(http://cran.r-project.org/web/packages/ordinal/vignettes/clm_tutorial.pdf)
where these kinds of tests are described more thoroughly starting on
page 6.

I think you can also make similar tests with the VGAM package, but I
am not as well versed in that package.

Hope this helps,
Rune

Rune Haubo Bojesen Christensen
Postdoc
DTU Compute - Section for Statistics
---
Technical University of Denmark
Department of Applied Mathematics and Computer Science
Richard Petersens Plads
Building 324, Room 220
2800 Lyngby
Direct +45 45253363
Mobile +45 30264554
http://www.imm.dtu.dk


On 11 March 2013 22:52, Nicole Ford  wrote:
> here's some code as an example  hope it helps!
>
> mod<-polr(vote~age+demsat+eusup+lrself+male+retnat+union+urban, data=dat)
> summary(mod)
>
>
> mod<-polr(vote~age+demsat+eusup+lrself+male+retnat+union+urban, data=dat)
> levs<-levels(dat$vote)
> tmpdat<-list()
> for(i in 1:(nlevels(dat$vote)-1)){
> tmpdat[[i]] <- dat
> tmpdat[[i]]$z <- as.numeric(as.numeric(tmpdat[[1]]$vote) <= levs[i])
> }
> form<-as.formula("z~age+demsat+eusup+lrself+male+retnat+union+urban")
> mods<-lapply(tmpdat, function(x)glm(form, data=x, family=binomial))
> probs<-sapply(mods, predict, type="response")
> p.logits<-cbind(probs[,2], t(apply(probs, 1, diff)), 1-probs[,ncol(probs)])
> p.ologit<-predict(mod, type='probs')
> n<-nrow(p.logits)
> bin.ll <- p.logits[cbind(1:n, dat$vote)]
> ologit.ll <- p.ologit[cbind(1:n, dat$vote)]
> binom.test(sum(bin.ll > ologit.ll), n)
>
>
> dat$vote.fac<-factor(dat$vote, levels=1:6)
> mod<-polr(dat$vote.fac~age+demsat+eusup+lrself+male+retnat+union+urban, 
> data=dat)
>
> source("http://www.quantoid.net/cat_pre.R ")
> catpre(mod)
>
> install.packages("rms")
> library(rms)
> olprobs<-predict(mod, type='probs')
> pred.cat<-apply(olprobs, 1, which.max)
> table(pred.cat, dat$vote)
>
> round(prop.table(table(pred.cat, dat$vote), 2), 3)
> On Mar 11, 2013, at 5:02 PM, Heather Kettrey wrote:
>
>> Hi,
>>
>> I am running an analysis with an ordinal outcome and I need to run a test
>> of the parallel regression assumption to determine if ordinal logistic
>> regression is appropriate. I cannot find a function to conduct such a test.
>>> From searching various message boards I have seen a few useRs ask this same
>> question without a definitive answer - and I came across a thread that
>> indicated there is no such function available in any R packages. I hope
>> this is incorrect.
>>
>> Does anyone know how to test the parallel regression assumption in R?
>>
>> Thanks for your help!
>>
>>
>> --
>> Heather Hensman Kettrey
>> PhD Candidate
>> Department of Sociology
>> Vanderbilt Univ

Re: [R] Optimisation and NaN Errors using clm() and clmm()

2013-04-15 Thread Rune Haubo
On 15 April 2013 13:18, Thomas  wrote:
>
> Dear List,
>
> I am using both the clm() and clmm() functions from the R package 'ordinal'.
>
> I am fitting an ordinal dependent variable with 5 categories to 9 continuous 
> predictors, all of which have been normalised (mean subtracted then divided 
> by standard deviation), using a probit link function. From this global model 
> I am generating a confidence set of 200 models using clm() and the 'glmulti' 
> R package. This produces these errors:
>
> /> model.2.10 <- glmulti(as.factor(dependent) ~ 
> predictor_1*predictor_2*predictor_3*predictor_4*predictor_5*predictor_6*predictor_7*predictor_8*predictor_9,
>  data = database, fitfunc = clm, link = "probit", method = "g", crit = aicc, 
> confsetsize = 200, marginality = TRUE)
> ...
> After 670 generations:
> Best model: 
> as.factor(dependent)~1+predictor_1+predictor_2+predictor_3+predictor_4+predictor_5+predictor_6+predictor_8+predictor_9+predictor_4:predictor_3+predictor_6:predictor_2+predictor_8:predictor_5+predictor_9:predictor_1+predictor_9:predictor_4+predictor_9:predictor_5+predictor_9:predictor_6
> Crit= 183.716706496392
> Mean crit= 202.022138576506
> Improvements in best and average IC have been below the specified goals.
> Algorithm is declared to have converged.
> Completed.
> There were 24 warnings (use warnings() to see them)
> > warnings()
> Warning messages:
> 1: optimization failed: step factor reduced below minimum
> 2: optimization failed: step factor reduced below minimum
> 3: optimization failed: step factor reduced below minimum/
> etc.
>
>
> I am then re-fitting each of the 200 models with the clmm() function, with 2 
> random factors (family nested within order). I get this error in a few of the 
> re-fitted models:
>
> /> model.2.glmm.2 <- clmm(as.factor(dependent) ~ 1 + predictor_1 + 
> predictor_2 + predictor_3 + predictor_6 + predictor_7 + predictor_8 + 
> predictor_9 + predictor_6:predictor_2 + predictor_7:predictor_2 + 
> predictor_7:predictor_3 + predictor_8:predictor_2 + predictor_9:predictor_1 + 
> predictor_9:predictor_2 + predictor_9:predictor_3 + predictor_9:predictor_6 + 
> predictor_9:predictor_7 + predictor_9:predictor_8+ (1|order/family), link = 
> "probit", data = database)
> > summary(model.2.glmm.2)
> >
> Cumulative Link Mixed Model fitted with the Laplace approximation
>
> formula: as.factor(dependent) ~ 1 + predictor_1 + predictor_2 + predictor_3 + 
> predictor_6 + predictor_7 + predictor_8 + predictor_9 + 
> predictor_6:predictor_2 + predictor_7:predictor_2 +
> predictor_7:predictor_3 + predictor_8:predictor_2 + predictor_9:predictor_1 + 
> predictor_9:predictor_2 +
> predictor_9:predictor_3 + predictor_9:predictor_6 + predictor_9:predictor_7 + 
> predictor_9:predictor_8 + (1 | order/family)
> data: database
>
> link threshold nobs logLik AIC niter max.grad cond.H
> probit flexible 103 -65.56 173.13 58(3225) 8.13e-06 4.3e+03
>
> Random effects:
> Var Std.Dev
> family:order 7.493e-11 8.656e-06
> order 1.917e-12 1.385e-06
> Number of groups: family:order 12, order 4
>
> Coefficients:
> Estimate Std. Error z value Pr(>|z|)
> predictor_1 0.40802 0.78685 0.519 0.6041
> predictor_2 0.02431 0.26570 0.092 0.9271
> predictor_3 -0.84486 0.32056 -2.636 0.0084 **
> predictor_6 0.65392 0.34348 1.904 0.0569 .
> predictor_7 0.71730 0.29596 2.424 0.0154 *
> predictor_8 -1.37692 0.75660 -1.820 0.0688 .
> predictor_9 0.15642 0.28969 0.540 0.5892
> predictor_2:predictor_6 -0.46880 0.18829 -2.490 0.0128 *
> predictor_2:predictor_7 4.97365 0.82692 6.015 1.80e-09 ***
> predictor_3:predictor_7 -1.13192 0.46639 -2.427 0.0152 *
> predictor_2:predictor_8 -5.52913 0.88476 -6.249 4.12e-10 ***
> predictor_1:predictor_9 4.28519 NA NA NA
> predictor_2:predictor_9 -0.26558 0.10541 -2.520 0.0117 *
> predictor_3:predictor_9 -1.49790 NA NA NA
> predictor_6:predictor_9 -1.31538 NA NA NA
> predictor_7:predictor_9 -4.41998 NA NA NA
> predictor_8:predictor_9 3.99709 NA NA NA
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Threshold coefficients:
> Estimate Std. Error z value
> 0|1 -0.2236 0.3072 -0.728
> 1|2 1.4229 0.3634 3.915
> (211 observations deleted due to missingness)
> Warning message:
> In sqrt(diag(vc)[1:npar]) : NaNs produced/
>

This warning is due to a (near) singular variance-covariance matrix of
the model parameters, which in turn is due to the fact that the model
converged to a boundary solution: both random effects variance
parameters are zero. If you exclude the random terms and refit the
model with clm, the variance-covariance matrix will probably be well
defined and standard errors can be computed.

Another thing is that you are fitting 17 regression parameters and 2
random effect terms (which in the end do not count) to only 103
observations. I would be worried about overfitting or perhaps even
non-fitting. I think I would also be concerned about the 211
observations that are incomplete, and I would be careful with
automatic model selection/averaging e

Re: [R] Optimisation and NaN Errors using clm() and clmm()

2013-04-20 Thread Rune Haubo
On 18 April 2013 18:38, Thomas Foxley  wrote:
> Rune,
>
> Thank you very much for your response.
>
> I don't actually have the models that failed to converge from the first
> (glmulti) part as they were not saved with the confidence set. glmulti
> generates thousands of models so it seems reasonable that a few of these may
> not converge.
>
> The clmm() model I provided was just an example - not all models have 17
> parameters. There were only one or two that produced errors (the example I
> gave being one of them), perhaps overparameterisation is the root of the
> problem.
>
> Regarding incomplete data - there are only 103 (of 314) records where I have
> data for every predictor. The number of observations included will obviously
> vary for different models, models with fewer predictors will include more
> observations. glmulti acts as a wrapper for another function, meaning (in
> this case) na's are treated as they would be in clm(). Is there a way around
> this (apart from filling in the missing data)? I believe its possible to
> limit model complexity in the glmulti call - which may or may not increase
> the number of observations - how would this affect interpretation of the
> results?

Since the likelihood (and hence also AIC-like criteria) depends on the
number of observations, I would make sure that only models with the
same number of observations are compared using model selection
criteria. This means that I would make a data.frame with complete
observations, either by simply deleting all rows with one or more
missing predictors or by imputing some data points. If one or a couple
of variables are responsible for most of the missing observations, you
could disregard these variables before deleting rows with NAs.
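
A minimal sketch of the complete-case route (the data frame and
variable names follow the poster's description and are assumptions here):

colSums(is.na(database))   # spot the variables responsible for most of the missingness
## keep only rows complete on the variables actually used in the candidate models:
vars <- c("dependent", "predictor_1", "predictor_2", "predictor_3")
database_cc <- database[complete.cases(database[, vars]), ]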

As I said, I am no expert in model averaging or glmulti usage, so
there might be better approaches or other opinions on this.

Cheers,
Rune

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multilevel binary and ordered regression models

2013-06-06 Thread Rune Haubo
On 6 June 2013 00:13, Xu Jun  wrote:
> Dear r-helpers,
>
> I have two questions on multilevel binary and ordered regression models,
> respectively:
>
> 1. Is there any r function (like lmer or glmer) to run multilevel ordered
> regression models?

Yes, package ordinal will fit such models.
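
For example, a minimal sketch with the package's own wine data, where
judge stands in for whatever level-2 grouping unit the poster has:

library(ordinal)
fmm <- clmm(rating ~ temp + contact + (1 | judge), data = wine)
summary(fmm)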

Cheers,
Rune

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multilevel binary and ordered regression models

2013-06-13 Thread Rune Haubo
Could you share the results of sessionInfo() and str(alllev)?

Also please share the exact in- and output with relevant error
messages; for example 'cntnew:male' does not make much sense without
context.

Unfortunately I don't understand your model specification and am lost
in the interpretation of gammas, w's, x's, mu's, beta's and their
indices, so it is hard for me to say whether you are specifying the
models you seek correctly in clmm. Perhaps the package vignettes can be
an inspiration here? Ahh, I see you have discovered those.

And yes, the current implementation only allows random-effects terms
of the form (1 | factor), so unless x1 is a factor,
(1|group:x1) won't work. I currently have a working implementation that
allows all kinds of random-effects terms of the form
(formula | factor) like glmer allows for, but I
lack the time to wrap it up and test it before I can release it on
CRAN.

Others have successfully fitted models to large data sets (nobs > 1e6)
with several random-effects terms, so I don't expect that is a
problem. If there are a lot of parameters in your models, then, yes,
they can take a while to fit. Equally important is the
identifiability of the estimation problem; the optimum of an
ill-defined model can be very hard to locate.

Cheers,
Rune

On 9 June 2013 16:44, Xu Jun  wrote:
> Rune,
>
> Thanks a lot for pointing me to your ordinal package. It is wonderful,
> and I tried a random intercept model and it worked well except that
> probably there is something wrong with my data (size is big), I got
> some warning messages indicating that "In sqrt(diag(vc)[1:npar]) :
> NaNs produced." Therefore, some of the coefficients do not have
> standard errors. I also have a follow-up question: if I want to
> estimate a slope as outcome model, how am I supposed to specify the
> model. I tried following the few tutorials you posted on CRAN, but was
> not able to figure it out:
>
>
> Here is my model:
>
> level 1: y*_ij = beta_0j + beta_1j*x1_ij + beta_2j*x2_ij +
> beta_3j*x3_ij + epsilon_ij
>
> where y* is a latent continuous variable and y is an observed binary
> dependent variable.
> y  is an observed ordinal variable, say ranging from 1-4, and has been
> coded as a factor variable.
>
> x's are predictors at level 1
> beta's are regression coefficients at level 1
> epsilon's are error terms in level 1 equations
>
> level 2 Eq1: beta_0j  = gamma_00 + gamma_01*w1_j + gamma_02*w2_j + mu_0j
> Level 2 Eq2: beta_1j  = gamma_10 + gamma_11*w1_j + gamma_02*w2_j + mu_1j
>
> My guess would be:
> # try 1
> clmm(y~ x1 + x2 + x3 + w1 + w2 + w1:x1 + w2:x2 + (1 + x1 | group), #
> like the syntax in lmer or glmer
>  data=alllev, link="logit",
>  na.action=na.omit,
>  Hess=T)
>
> # or try 2
>
> clmm(y~ x1 + x2 + x3 + w1 + w2 + w1:x1 + w2:x2 + (1  | group) + (1 | 
> group:x1),
>  data=alllev, link="logit",
>  na.action=na.omit,
>  Hess=T)
>
> but none worked. After I issued the first try, I got the following message:
>
> Error: Matrices must have same number of columns in rbind2(..1, r)
> In addition: Warning messages:
> 1: In cntnew:male :
>   numerical expression has 10 elements: only the first used
>
> and after the second try, simply it says that I got to following the
> (1|factor) format. I would appreciate that if you could point me to
> the right direction. Also, I know I am dealing with a relatively large
> data set, but is there any way to speed up the estimation a bit.
> Thanks a lot!
>
> Jun
>
> On Fri, Jun 7, 2013 at 1:04 AM, Rune Haubo  wrote:
>> On 6 June 2013 00:13, Xu Jun  wrote:
>>> Dear r-helpers,
>>>
>>> I have two questions on multilevel binary and ordered regression models,
>>> respectively:
>>>
>>> 1. Is there any r function (like lmer or glmer) to run multilevel ordered
>>> regression models?
>>
>> Yes, package ordinal will fit such models.
>>
>> Cheers,
>> Rune

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] p values of lmer

2013-06-19 Thread Rune Haubo
Try

library(lmerTest)
fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
summary(fm1)

Linear mixed model fit by REML
Formula: Reaction ~ Days + (Days | Subject)
   Data: sleepstudy
   AIC  BIC logLik deviance REMLdev
  1756 1775 -871.8     1752    1744
Random effects:
 Groups   Name        Variance Std.Dev. Corr 
 Subject  (Intercept) 612.092  24.7405       
          Days         35.072   5.9221  0.066
 Residual             654.941  25.5918       
Number of obs: 180, groups: Subject, 18

Fixed effects:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  251.405      6.825   36.84  < 2e-16 ***
Days          10.467      1.546    6.77 3.27e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
 (Intr)
Days -0.138

Cheers,
Rune

On 19 June 2013 11:27, meng  wrote:
> Hi all:
> I met a question about lmer.
>
> fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
> summary(fm1)
>
> ...
>
> Fixed effects:
>             Estimate Std. Error t value
> (Intercept)  251.405      6.825   36.84
> Days          10.467      1.546    6.77
>
> ...
>
> My question:
> Why p values of (Intercept) and Days are not given?
>
>
> Many thanks!
>
> Best.
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Find correlation in Clmm?

2012-09-11 Thread Rune Haubo
On 11/09/2012 at 16.36, "Anera Salucci"  wrote:
>
> Hi all,
>
> I am trying to fit a random effect  model to categorical response
variable using package "ordinal" /"clmm".
>
> How can I find the correlation between random effects (random intercept
and random slope)

You cannot, as such models are not implemented. The model you are fitting
is assuming there is no such correlation.

Cheers,
Rune
>
> Thanks in advance
> Ana
> [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error clmm(){ordinal}

2014-06-02 Thread Rune Haubo
It's telling you that one or more of the grouping factors for the
random-effect terms has fewer than three levels. From what you write,
this seems to apply to Location: you may want to treat it as a
fixed effect instead.
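
A hedged sketch of that change, keeping the poster's variable names
(so untested against the actual data):

Modele <- clmm(RRI ~ Line + RRI_DHW1 + RRI_DHW2 + Location +   # Location now a fixed effect
                 (1 | Year) + (1 | Not) + (1 | Bloc) + (1 | Plante),
               data = Tab.INRCh, na.action = na.omit)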

Hope this helps,
Rune

On 2 June 2014 14:00, adesgroux  wrote:
> Dear all,
>
> I am trying to run the function clmm() on a data table composed as following
> :
> 
>
> I have 187 pea lines assessed on 4 years * 2 locations * 3 blocs * 15
> plantes for disease resistance. Disease resistance is assessed with a 0-to-5
> scale which made an ordinal variable.
> Here is the script a wrote :
>
>>Tab.INRCh=read.csv("INRChamp201013.csv",sep=";")
>>Tab.INRCh$Year=as.factor(Tab.INRCh$Year)
>>Tab.INRCh$RRI<-factor(Tab.INRCh$RRI,levels=c("5","4","3","2","1","0"),ordered=TRUE)
>>Tab.INRCh$RRI_DHW1<-factor(Tab.INRCh$RRI_DHW1,levels=c("5","4","3","2","1","0"),ordered=TRUE)
>>Tab.INRCh$RRI_DHW2<-factor(Tab.INRCh$RRI_DHW2,levels=c("5","4","3","2","1","0"),ordered=TRUE)
>
>>Modele=clmm(RRI~Line+RRI_DHW1+RRI_DHW2+(1|Year)+(1|Location)+(1|Not)+(1|Bloc)+(1|Plante),data=Tab.INRCh,na.action=na.omit)
>
> I want to create this modele to run function Anova() to know if disease
> resistance is different among lines.
>
> When I run the script, I have an error :
> Error: all(sapply(gfl, nlevels) > 2) is not TRUE
> I can't find anything about it on the internet...
> Can someone help me on this?
>
> Thanks
> Best regards
> Aurore
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Error-clmm-ordinal-tp4691592.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error clmm(){ordinal}

2014-06-04 Thread Rune Haubo
Aurore,

I don't know if car::Anova is able (or should be able) to produce anova
tables for clmm objects; I usually use drop1() (and sometimes add1())
to test terms in CLMMs:

> library(ordinal)
> fm1 <- clmm(rating ~ temp + contact + (1|judge), data=wine)
> drop1(fm1, test="Chi")
Single term deletions

Model:
rating ~ temp + contact + (1 | judge)
        Df    AIC    LRT  Pr(>Chi)    
<none>     177.13                     
temp     1 209.59 34.464 4.343e-09 ***
contact  1 189.48 14.347  0.000152 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Here you get the more accurate likelihood ratio tests instead of the
faster and more convenient Wald tests that (if I recall correctly)
car::Anova can provide.

HTH,
Rune


On 4 June 2014 09:18, adesgroux  wrote:
> Dear Rune,
>
> Thanks a lot for your comment!
> I tried with "Location" as a fixed-effect and clmm() ran! It take quiet a
> long time but I have approximatively 67000 observations in my datatable so I
> guess it's quiet normal!
>
> I want to run function Anova{car} on Modele but it gave my another error :
>>Anova(Modele)
> /Error in eval(expr, envir, enclos) : object 'RRI' not found/
>
> any idea?
> Thanks again
>
> Aurore
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Error-clmm-ordinal-tp4691592p4691693.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Ordered probit using clm2

2012-11-06 Thread Rune Haubo
Hi Alice,

A factor is a fairly basic R concept that you can read about in
http://cran.r-project.org/doc/manuals/R-intro.pdf on page 16. Now to
fit the CLM, you need to turn your response variable into a factor
with something like

datareg$Newpercentagecash <- factor(datareg$Newpercentagecash, ordered=TRUE)

after loading your data, but before fitting the model. I recommend
that you take a look at the variable to see that it has the levels
that you expect and that they are ordered appropriately.
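
For example (a sketch using the poster's object names):

levels(datareg$Newpercentagecash)   # expect "0" "1" "2", in that order
table(datareg$Newpercentagecash)    # and a sensible number of deals in each category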

And by the way, I recommend that you use clm() rather than clm2() if possible.

Hope this helps,
Rune

On 6 November 2012 23:00, Alice LAMBERTON  wrote:
> Hi,
>
> I am new in R. I would like to do a ordered probit regression using clm2 (in 
> the ordinal package).
> My dependent variable y is the way of payment in M&A: y=0 if the deal is 
> financed by stock only, y=1 if the deal is financed by a mix of cash and 
> stock and y=2 if it is by cash only.
> My independent variables are CollateralB, Cashavailable and Leverage.
>
> This is the code I wrote:
>
>> library(ordinal)
>> datareg<-read.xls("C:/regression.xls")
>> myprobit<-clm2(Newpercentagecash ~ CollateralB + CashavailableB + LEVERAGEB, 
>> data = datareg, link = "probit")
> Error in clm2(Newpercentagecash ~ CollateralB + CashavailableB + LEVERAGEB,  :
>   response needs to be a factor
>
> I do not understand this error message. My "y" only equals 0, 1 or 2. I do 
> not understand what a "factor" is.
> Could you help me please?
>
> Thank you in advance,
>
> Best regards,
>
> Alice
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Rune H B Christensen, PhD
DTU Informatics, Section for Statistics
Technical University of Denmark, Build. 305, Room 122,
DK-2800 Kgs. Lyngby, Denmark
Phone: (+45) 45 25 33 63
Mobile: (+45) 30 26 45 54

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multilevel analysis for ordinal responses

2014-03-01 Thread Rune Haubo
Yes; see clm and clmm2 (mixed effects) in the ordinal package for
fitting proportional odds models. See section 3 of
http://cran.r-project.org/web/packages/ordinal/vignettes/clm_tutorial.pdf
to see how to test the proportional odds assumption with clm - it is
equivalent for clmm2 models. For an introduction to clmm2, see
http://cran.r-project.org/web/packages/ordinal/vignettes/clmm2_tutorial.pdf.
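
A hedged sketch of that test in the mixed-model case, using the wine
data from the package; it assumes clmm2() accepts a 'nominal' argument
in the same way clm2() does:

library(ordinal)
fm1 <- clmm2(rating ~ temp + contact, random = judge, data = wine, Hess = TRUE)
fm2 <- clmm2(rating ~ temp, nominal = ~ contact, random = judge, data = wine, Hess = TRUE)
anova(fm1, fm2)   # LR test of proportional odds for contact, with a random judge effect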

Hope this helps,
Rune

On 1 March 2014 02:56, shkingdom  wrote:
> Dear all,
>
> I need to fit a multielvel model for an ordinal response. Does R have a
> command for conducting a multilevel ordinal logistic regression when the
> model violates the parallel regression or proportional odds assumption?
> Additionally, are there any tests to check the parallel regression
> assumption for the multilevel ordered model? Providing an syntax would be
> really helpful.
>
> Thanks in advance,
>
> Wander
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Multilevel-analysis-for-ordinal-responses-tp4686057.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problems with clmm2 (ordinal data fit)

2014-03-16 Thread Rune Haubo
Dear Caroline,

Yes, it seems you have complete separation for the 'Timepoint'
variable. This means that the likelihood is unbounded for that
parameter and the optimizer just terminates when it gets far enough
out on an asymptote and improvements are below a threshold. This is
also the reason the variance-covariance matrix of the parameters is
singular, so standard errors and summary p-values are not
available. This doesn't mean that you cannot perform the likelihood
ratio test, though, so the anova() comparisons should not be
misleading.

I would recommend, though, that (1) if you really want to fit the
mixed-effects models, you use the newer implementation in clmm() instead
of clmm2(), and (2) that you fit your models without the
random-effects term, since its variance is clearly zero (and
hence use clm()) - keeping it only adds to the instability of the
optimization and to the singularity of the variance-covariance matrix
of the parameters.
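
A hedged sketch of recommendation (2), reusing the poster's variable
names (so untested against the actual data):

fit2 <- clm(FV ~ GruppeVerbelendung * Timepoint, data = data, na.action = na.omit)
fit1 <- clm(FV ~ GruppeVerbelendung, data = data, na.action = na.omit)
anova(fit1, fit2)   # the likelihood ratio test remains valid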

Hope this helps,
Rune

On 16 March 2014 19:45, Caroline Lustenberger
 wrote:
> Dear all
>
> I have ordinal data (from a questionnaire with 4 levels) for 2 groups (30 
> subjects each) and 2 timepoints. So I used a cumulative link mixed model to 
> fit the data (nr = subject number).
>
> mod_FV<-clmm2(FV~GruppeVerbelendung+Timepoint,random=nr,data=data,Hess=TRUE,nAGQ=10,na.action=na.omit)
> mod_FV1<-clmm2(FV~GruppeVerbelendung,random=nr,data=data,Hess=TRUE,nAGQ=10,na.action=na.omit)
>
>
> For some questions (e.g. FV) I had the following output with NANs instead of 
> p-values
>
> Call:
> clmm2(location = FV ~ GruppeVerbelendung * Timepoint, random = nr,
> data = data, na.action = na.omit, Hess = TRUE, nAGQ = 10)
>
> Random effects:
> Var  Std.Dev
> nr 5.881958e-07 0.0007669392
>
> Location coefficients:
>                                Estimate Std. Error z value Pr(>|z|)
> GruppeVerbelendung1             -0.0178        NaN     NaN       NA
> Timepoint1                      18.6316        NaN     NaN       NA
> GruppeVerbelendung1:Timepoint1   0.3505        NaN     NaN       NA
>
> No scale coefficients
>
> Threshold coefficients:
>     Estimate Std. Error z value
> 1|2  20.7741        NaN     NaN
> 2|3  20.9486        NaN     NaN
>
> log-likelihood: -24.01783
> AIC: 60.03566
> Condition number of Hessian: 5479162.40
> (4 observations deleted due to missingness)
> Warning message:
> In sqrt(diag(vc)) : NaNs produced
>
> However when comparing the models (mod_FV, mod_FV1) I obtain a usable result 
> (p-val) even though the models seem not to be a good fit (high AIC and 
> condition number of Hessian).
>
>  anova(mod_FV,mod_FV1)
> Likelihood ratio tests of cumulative link models
>
> Response: FV
>   Model                               Resid. df -2logLik   Test Df LR stat.      Pr(Chi)
> 1 GruppeVerbelendung |  |                   112 58.44384
> 2 GruppeVerbelendung + Timepoint |  |       111 27.78259 1 vs 2  1 30.66125 3.072406e-08
>
> What could be the problem? Timepoint 1 was always scored with level 1 (60 of 
> 60 subjects had  scored 1) and during timepoint 2 subjects scored level 1-3. 
> Might it be a problem that I only have 1 type of scoring level for timepoint 
> 1?
> Can I use the results obtained by Likelihod ratio test for clm even if the 
> model was not a good fit? What could I do instead?
>
>
> Thank you so much for your help and all the best
> Caroline
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Longitudinal categorical response data

2011-03-26 Thread Rune Haubo
lmer is not designed for ordered categorical data such as yours. You could
take a look at the ordinal package, which is designed for this type of data,
including mixed models (function clmm), which is probably what you want to use.

Best,
Rune

On 24/03/2011 at 21.03, "Rasanga Ruwanthi"  wrote:
>
> Dear List,
>
> I have some longitudinal data, each patient was followed at times 0, 12,
16, 24 weeks and measure severity of a illness (0-worse, 1-same, 2-better).
So, longitudinal response is categorical.  I was wondering whether lmer in R
can fit a model for this type of data. If so, how we code? Or any
other function in R that can fit this type of longitudinal data? Any
suggestion would be greatly appreciated.

>
> Thanks
> Ruwanthi
>
>
>
>
>[[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Ordinal logistic regression p-values

2011-08-29 Thread Rune Haubo
There is also clm() (for cumulative link models) from package ordinal,
which has much the same interface as polr() but does give you
p-values for the regression parameters. A simple example from
example(clm):

library(ordinal)
data(wine)
fm1 <- clm(rating ~ contact + temp, data=wine)
(summ <- summary(fm1))

giving the output:

formula: rating ~ contact + temp
data:wine

 link  threshold nobs logLik AIC    niter max.grad cond.H 
 logit flexible  72   -86.49 184.98 6(0)  4.01e-12 2.7e+01

Coefficients:
           Estimate Std. Error z value Pr(>|z|)    
contactyes   1.5278     0.4766   3.205  0.00135 ** 
tempwarm     2.5031     0.5287   4.735 2.19e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Threshold coefficients:
    Estimate Std. Error z value
1|2  -1.3444     0.5171  -2.600
2|3   1.2508     0.4379   2.857
3|4   3.4669     0.5978   5.800
4|5   5.0064     0.7309   6.850

The p-values are extracted with:
> coef(summ)[5:6, 4]
  contactyes     tempwarm
1.348440e-03 2.194605e-06

Cheers,
Rune

On 30 August 2011 03:30, Frank Harrell  wrote:
> Use the rms package to replace Design.  Run anova(fit object from lrm) which
> produces a matrix from which you can extract P-values.  This also handles
> the case of multiple degrees of freedom per predictor.
> Frank
>
> Debs Majumdar wrote:
>>
>> Hi,
>>
>>    Are there any packages which prints out p-values for OLR's (like
>> `ologit' from Stata)? I want to run a bunch of OLRs and print the p-value
>> for the first coefficient from each of them.
>>
>>
>>   I checked polr() under MASS and it doesn't.
>>
>>  There's a lrm() function under Design which does print out p-values but I
>> couldn't extract p-values from the output.
>>
>>
>>   Thanks,
>>
>>   Debs
>>
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
> -
> Frank Harrell
> Department of Biostatistics, Vanderbilt University
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Ordinal-logistic-regression-p-values-tp3777674p368.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Rune Haubo Bojesen Christensen

Ph.D. Student, M.Sc. Eng.
Phone: (+45) 45 25 33 63
Mobile: (+45) 30 26 45 54

DTU Informatics, Section for Statistics
Technical University of Denmark, Build. 305, Room 122,
DK-2800 Kgs. Lyngby, Denmark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] significance level (p) for t-value in package zelig

2012-06-25 Thread Rune Haubo
According to standard likelihood theory these are actually not
t-values, but z-values, i.e., they asymptotically follow a standard
normal distribution under the null hypothesis. This means that you
could use pnorm instead of pt to get the p-values, but an easier
solution is probably to use the clm-function (for Cumulative Link
Models) from the ordinal package - here you get the p-values
automatically.
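
For example, a z-value of 2.1 (a made-up number, just for illustration) gives
a two-sided p-value of

z <- 2.1
2 * pnorm(abs(z), lower.tail = FALSE)   # approx. 0.036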

Cheers,
Rune

On 23 June 2012 07:02, Bert Gunter  wrote:
> This advice is almost certainly false!
>
> A "t-statistic" can be calculated, but the distribution will not
> necessarily be student's t nor will the "df" be those of the rse.  See, for
> example, rlm() in MASS, where values of the t-statistic are given without p
> values. If Brian Ripley says that p values cannot be straightforwardly
> calculated by pt(), then believe it!
>
> -- Bert
>
> On Fri, Jun 22, 2012 at 9:30 PM, Özgür Asar  wrote:
>
>> Michael,
>>
>> Try
>>
>> ?pt
>>
>> Best
>> Ozgur
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/significance-level-p-for-t-value-in-package-zelig-tp4634252p4634271.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Rune Haubo Bojesen Christensen

PhD Student, M.Sc. Eng.
Phone: (+45) 45 25 33 63
Mobile: (+45) 30 26 45 54

DTU Informatics, Section for Statistics
Technical University of Denmark, Build. 305, Room 122,
DK-2800 Kgs. Lyngby, Denmark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] significance level (p) for t-value in package zelig

2012-06-26 Thread Rune Haubo
My point was just that the situation in a cumulative link model is not
much different from a binomial glm - the binomial glm is even a
special case of the clm with only two response categories. And just
like summary(glm(, family=binomial)) reports z-values and computes
p-values by using the normal distribution as reference, one can do the
same in a cumulative link model by applying the same asymptotic
arguments.

In both models the variance is determined implicitly by the mean, so a
t-distribution is never involved.
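
A small sketch of that special case, using the wine data from the ordinal
package: collapsing the five rating categories into two gives a clm that
matches the binomial glm (same slope estimate and z value; the clm threshold
is minus the glm intercept):

library(ordinal)
data(wine)
wine$high <- ordered(as.numeric(wine$rating) >= 4)
summary(glm(high ~ temp, data = wine, family = binomial))
summary(clm(high ~ temp, data = wine))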

Cheers,
Rune

On 25 June 2012 11:05, Prof Brian Ripley  wrote:
> On 25/06/2012 09:32, Rune Haubo wrote:
>>
>> According to standard likelihood theory these are actually not
>> t-values, but z-values, i.e., they asymptotically follow a standard
>> normal distribution under the null hypothesis. This means that you
>
>
> Whose 'standard'?
>
> It is conventional to call a value of t-like statistic (i.e. a ratio of the
> form value/standard error) a 't-value'.  And that is nothing to do with
> 'likelihood theory' (t statistics predate the term 'likelihood'!).
>
> The separate issue is whether a t statistic is even approximately
> t-distributed (and if so, on what df?), and another is if it is
> asymptotically normal.  For the latter you have to say what you mean by
> 'asymptotic': we have lost a lot of the context, but as this does not appear
> to be IID univariate observations:
>
> - 'standard likelihood theory' is unlikely to apply.
>
> - standard asymptotics may well not be a good approximation (in regression
> modelling, people tend to fit more complex models to large datasets, which
> is often why a large dataset was collected).
>
> - even for IID observations the derivation of the t distribution assumes
> normality.
>
> The difference between a t distribution and a normal distribution is
> practically insignificant unless the df is small.   And if the df is small,
> one can rarely rely on the CLT for approximate normality 
>
>
>> could use pnorm instead of pt to get the p-values, but an easier
>> solution is probably to use the clm-function (for Cumulative Link
>> Models) from the ordinal package - here you get the p-values
>> automatically.
>>
>> Cheers,
>> Rune
>>
>> On 23 June 2012 07:02, Bert Gunter  wrote:
>>>
>>> This advice is almost certainly false!
>>>
>>> A "t-statistic" can be calculated, but the distribution will not
>>> necessarily be student's t nor will the "df" be those of the rse.  See,
>>> for
>>> example, rlm() in MASS, where values of the t-statistic are given without
>>> p
>>> values. If Brian Ripley says that p values cannot be straightforwardly
>>> calculated by pt(), then believe it!
>>>
>>> -- Bert
>>>
>>> On Fri, Jun 22, 2012 at 9:30 PM, Özgür Asar  wrote:
>>>
>>>> Michael,
>>>>
>>>> Try
>>>>
>>>> ?pt
>>>>
>>>> Best
>>>> Ozgur
>>>>
>>>> --
>>>> View this message in context:
>>>>
>>>> http://r.789695.n4.nabble.com/significance-level-p-for-t-value-in-package-zelig-tp4634252p4634271.html
>>>> Sent from the R help mailing list archive at Nabble.com.
>>>>
>>>> __
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>>
>>> --
>>>
>>> Bert Gunter
>>> Genentech Nonclinical Biostatistics
>>>
>>> Internal Contact Info:
>>> Phone: 467-7374
>>> Website:
>>>
>>> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>>>
>>>        [[alternative HTML version deleted]]
>>>
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>
>
> --
> Brian D. Ripley,                  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.s

Re: [R] Package 'MASS' (polr): Error in svd(X) : infinite or missing values in 'x'

2012-07-10 Thread Rune Haubo
Hi Jeremy,

I think Jessica is right that you could probably make polr converge
and produce a Hessian if the data were better scaled, but there might
also be other issues preventing you from getting the Hessian/vcov. It
could be insightful if you showed us the result of

str(Jdata)

Also, another implementation of ordinal regression models might avoid
the problem. You could try the ordinal package (of which I am the
author) -- the following should reproduce the MASS::polr results:

install.packages("ordinal")
library(ordinal)
Global <- clm(JVeg5 ~ Elevation + Lat_Y_pos + Coast_dist +
              Stream_dist, data = Jdata)
summary(Global)

Another option would be Frank Harrell's lrm function in the rms package.
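
A sketch of that route with the same model (assuming Jdata from your post is
loaded and JVeg5 is coded as a factor):

library(rms)
fit.lrm <- lrm(JVeg5 ~ Elevation + Lat_Y_pos + Coast_dist + Stream_dist, data = Jdata)
fit.lrm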

HTH,
Rune

On 9 July 2012 11:55, Jeremy Little  wrote:
> Hello,
>
> I am trying to run an ordinal logistic regression (polr) using the package
> 'MASS'.
>
> I have successfully run other regression classes (glm, multinom) without
> much problem, but with the 'polr' class I get the following error:
> " Error in svd(X) : infinite or missing values in 'x' "
> which appears when I run the "summary" command.
>
> The data file is large (585000 rows) and has no NA, - or blank values.
>
> My script (in brief) is as follows, with results:
>
> 
>> library(MASS)
>>
>> ## ADD DATA
>> Jdata<- read.delim("/Analysis/20120709 JLittle data file.txt", header=T)
>>
>> attach(Jdata)
>> names(Jdata)
>  [1] "POINTID" "Lat_Y_pos"   "JVeg5"   "Subregion"   "Rock_U_Nam"
> "Rock_Name"   "Elevation"   "Slope"   "Aspect"  "Hillshade"
> "Stream_dist" "Coast_dist"  "Coast_SE"
> [14] "Coast_E" "Wind_310""TPI" "Landform"
>>
>> Global <- polr(JVeg5 ~  Elevation +  Lat_Y_pos + Coast_dist + Stream_dist,
>> data=Jdata)
>>
>> summary(Global)
> Error in svd(X) : infinite or missing values in 'x'
>>
> ##Try with omit NA command
>> Global <- polr(JVeg5 ~  Elevation +  Lat_Y_pos + Coast_dist + Stream_dist,
>> data=Jdata, na.action = na.omit, Hess = TRUE)
>>
>> summary(Global)
> Error in svd(X) : infinite or missing values in 'x'
> 
>
> Does this imply an 'infinite value' and what would this mean?
>
> If anyone has any idea how to address this error, I would very much
> appreciate your response.
>
> Thank you in advance.
>
> Jeremy
>
> Date File Attachment (200 rows):
> http://r.789695.n4.nabble.com/file/n4635829/20120709_JLittle_data_file.txt
> 20120709_JLittle_data_file.txt
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Package-MASS-polr-Error-in-svd-X-infinite-or-missing-values-in-x-tp4635829.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Rune Haubo Bojesen Christensen

Ph.D. Student, M.Sc. Eng.
Phone: (+45) 45 25 33 63
Mobile: (+45) 30 26 45 54

DTU Informatics, Section for Statistics
Technical University of Denmark, Build. 305, Room 122,
DK-2800 Kgs. Lyngby, Denmark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Testing the proportional odds assumption of an ordinal generalized estimating equations (GEE) regression model

2011-06-30 Thread Rune Haubo
If you are willing to leave the GEE world and use a mixed effects
version of the proportional odds model, then you can test the
proportional odds assumption with the clmm function in package
ordinal. You simply include those variables for which you want to test
the assumption in the nominal formula in clmm. See the manual entry
for clmm for more info.
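
As a runnable sketch of the idea with the fixed-effects clm() and the wine
data from the package (for the mixed-model case clmm is used in the same way,
as described above):

library(ordinal)
data(wine)
fm1 <- clm(rating ~ temp + contact, data = wine)            # proportional odds for both
fm2 <- clm(rating ~ temp, nominal = ~ contact, data = wine) # relaxed for contact
anova(fm1, fm2)  # likelihood ratio test of proportional odds for contact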

Best,
Rune

On 28 June 2011 20:48, Andreas Karlsson  wrote:
>
> Dear list members,
>
> I am estimating an ordinal generalized estimating equations (GEE) regression 
> model on repeated measurements data.
>
> Is there any way (preferably in R) to test the proportional odds assumption 
> of this model?
>
> Thanks in advance.
>
> Andreas
>
>
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Rune Haubo Bojesen Christensen

Ph.D. Student, M.Sc. Eng.
Phone: (+45) 45 25 33 63
Mobile: (+45) 30 26 45 54

DTU Informatics, Section for Statistics
Technical University of Denmark, Build. 305, Room 122,
DK-2800 Kgs. Lyngby, Denmark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] difference between MASS::polr() and Design::lrm()

2008-06-30 Thread Rune Haubo
Dear Vito

No, you are not wrong, but you should center the score prior to model estimation:

summary(fm1 <- polr(factor(grade) ~ I(score - mean(score))))

which gives the same standard errors as lrm does. Now the intercepts
refer to the centered score rather than some potentially unrealistic
score of 0.

You can always check if standard errors are appropriate by plotting
the profile likelihood and the normal approximation (based on the
standard errors):

x <- seq(-.043 - .04, -.043 + .04, len = 20)
dev <- double(20)
for(i in 1:20)
    dev[i] <- polr(factor(grade) ~ 1 + offset(x[i] * (score -
        median(score))))$deviance

yy <- spline(x, dev, n = 100, method = "natural")
plot(yy[[1]], exp(-(yy[[2]] - min(yy[[2]]))/2), type = "l",
     xlab = "Score", ylab = "Normalized profile likelihood")
## Add Normal approximation:
lines(yy[[1]], exp(-(yy[[1]] - coef(fm1))^2 /
      (2 * vcov(fm1)[1, 1])), lty = 2,
      col = "red")

## Add confidence limits:
level <- c(0.95, 0.99)
lim <- sapply(level, function(x)
  exp(-qchisq(x, df=1)/2) )
abline(h = lim, col = "grey")

## Add confidence interval points:
points(confint(fm1), rep(lim[1], 2), pch = 3)

So these standard errors are the most appropriate you can get, but
because the profile likelihood is not symmetric, you could consider
reporting the asymmetric profile likelihood confidence interval
instead.

Hope this helps
Rune

2008/6/30 vito muggeo <[EMAIL PROTECTED]>:
> Dear all,
> It appears that MASS::polr() and Design::lrm() return the same point
> estimates but different st.errs when fitting proportional odds models,
>
> grade<-c(4,4,2,4,3,2,3,1,3,3,2,2,3,3,2,4,2,4,5,2,1,4,1,2,5,3,4,2,2,1)
> score<-c(525,533,545,582,581,576,572,609,559,543,576,525,574,582,574,471,595,
>  557,557,584,599,517,649,584,463,591,488,563,553,549)
>
> library(MASS)
> library(Design)
>
> summary(polr(factor(grade)~score))
> lrm(factor(grade)~score)
>
> It seems to me that results from lrm() are right..
>
> Am I wrong?
> Thanks,
> vito
>
> --
> 
> Vito M.R. Muggeo
> Dip.to Sc Statist e Matem `Vianelli'
> Università di Palermo
> viale delle Scienze, edificio 13
> 90128 Palermo - ITALY
> tel: 091 6626240
> fax: 091 485726/485612
> http://dssm.unipa.it/vmuggeo
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] difference between MASS::polr() and Design::lrm()

2008-06-30 Thread Rune Haubo
Hi Vito
(question to the authors of MASS below)

2008/6/30 vito muggeo <[EMAIL PROTECTED]>:
> Dear Haubo,
> many thanks for your reply.
> Yes you are right, by scaling the "score", I get the same results.
>
> However it sounds strange to me. I understand that the SE and/or t-ratio of
> the intercepts depend on the scale of the explanatory variable, but I do not
> understand why the t-ratio of the slope also depends on the scale. In
> classical regression models (LM, GLM, ...) the scale of the covariate does
> not affect the t ratio! As far as I know, classical references (e.g.
> Agresti, 2002) do not discuss such a point. Could you suggest an additional
> reference? Namely, why should I center the explanatory variables?

The wrong standard errors are a computational issue. Centering is a
statistical and (sometimes) a computational issue.

The t-value is just the ratio of the estimate to its standard error,
and when the program does not give you the correct standard error,
naturally the t-value is also wrong. It is often a good idea to center
covariates for inferential purposes and a very good reference is
Venables' "Exegeses on Linear Models":
http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf.

I find it natural to expect polr to give the correct standard error
without centering the covariate. I do not know why it does not, but
perhaps it has something to do with the scaling of the parameters
passed to the BFGS optimizer in optim. Optimization of the intercepts
seems to be performed on something like

summary(fm1 <- polr(factor(grade)~score))
cumsum(c(fm1$zeta[1], exp(fm1$zeta[-1])))

which seems to distort the whole variance-covariance matrix of the
parameters and therefore provides wrong standard errors for the
score-parameter as well. Optimization is much better scaled if you do

summary(fm2 <- polr(factor(grade) ~ 1 + I(score - median(score))))
cumsum(c(fm2$zeta[1], exp(fm2$zeta[-1])))

This is merely a guess, and perhaps the authors of MASS, could provide
some clarification?

One other thing that puzzles me is the fact that the parameter
estimate/standard error ratio is reported as a t-value rather than a
z-value. Likelihood theory (as I know it) seems to suggest a z-value
rather than a t-value. A z-value is also what is reported by the special
case, the ordinary logistic regression model:

summary(glm(grade > 3 ~ score, family = binomial))

as well as by lrm in package Design.

>
> Of course profile-likelihood based CIs may be very useful to this aim... but this is
> another story...

I agree, but the topic is closely related to standard errors ;-)

Best
Rune

>
> many thanks again,
> vito
>
> Rune Haubo ha scritto:
>>
>> Dear Vito
>>
>> No, you are not wrong, but you should center score prior to model
>> estimation:
>>
>> summary(fm1 <- polr(factor(grade) ~ I(score - mean(score))))
>>
>> which gives the same standard errors as do lrm. Now the intercepts
>> refer the median score rather than some potential unrealistic score of
>> 0.
>>
>> You can always check if standard errors are appropriate by plotting
>> the profile likelihood and the normal approximation (based on the
>> standard errors):
>>
>> x <- seq(-.043 - .04, -.043 + .04, len = 20)
>> dev <- double(20)
>> for(i in 1:20)
>> dev[i] <- polr(factor(grade) ~ 1 + offset(x[i] * (score -
>> median(score))))$deviance
>>
>> yy <- spline(x, dev, n = 100, method = "natural")
>> plot(yy[[1]], exp(-(yy[[2]] - min(yy[[2]]))/2), type = "l",
>>xlab = "Score", ylab = "Normalized profile likelihood")
>> ## Add Normal approximation:
>> lines(yy[[1]], exp(-(yy[[1]] - coef(fm1))^2 /
>> (2 * vcov(fm1)[1, 1])), lty = 2,
>>  col = "red")
>>
>> ## Add confidence limits:
>> level <- c(0.95, 0.99)
>> lim <- sapply(level, function(x)
>>  exp(-qchisq(x, df=1)/2) )
>> abline(h = lim, col = "grey")
>>
>> ## Add confidence interval points:
>> points(confint(fm1), rep(lim[1], 2), pch = 3)
>>
>> So these standard errors are the most appropriate you can get, but
>> because the profile likelihood is not symmetric, you could consider
>> reporting the asymmetric profile likelihood confidence interval
>> instead.
>>
>> Hope this helps
>> Rune
>>
>> 2008/6/30 vito muggeo <[EMAIL PROTECTED]>:
>>>
>>> Dear all,
>>> It appears that MASS::polr() and Design::lrm() return the same point
>>> estimates but different st.errs when fitting proportional odds models,
>>>
>>> grade<-c(4,4,2,4,3,2,3,1,3,3,2,2,3,3,2,4,2,4,5,2,1,4,1,2,5,3,4,2,2,1)
>&

Re: [R] ANOVA between linear models.

2008-05-16 Thread Rune Haubo
Hi Richard

You are trying to compare two models that are not nested. This means
that all the usual asymptotics of the test statistics break down, hence
the (second) test you are attempting is not meaningful. Usually one
decides on the form of the response on other grounds, such as residual
analysis or a Box-Cox analysis. You could take a look at boxcox() in
MASS (book as well as package).
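
A minimal sketch of the Box-Cox route, using the simulated s and x from your
code below:

library(MASS)
boxcox(lm(s ~ x))   # profile likelihood for the transformation parameter lambda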

That said, it *is* possible to compare completely different models
using the likelihood, or the AIC when the numbers of parameters
differ, although formal tests are seldom (if ever) available. In this
case you need to be sure to include the additive constants of the log
likelihood - see ?extractAIC.

In case the log-transform makes sense for your problem, you might want
to contemplate a gamma-GLM with a log link.
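
A sketch of that alternative, again with the s and x from your code below:

model.gamma <- glm(s ~ x, family = Gamma(link = "log"))
summary(model.gamma)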

Hope this helps

/Rune

2008/5/15 Richard Martin <[EMAIL PROTECTED]>:
> Hi All,
>
> I'm accustomed to performing an ANOVA to aid in choosing between
> linear models (for example y~x or y~x+x^2), however with different
> models I can't seem to do it. I'm trying to fit an exponential model
> of the form ye^(bt).
>
> Below is a code snippet that highlights what I'm trying to do
>
> s = rnorm(100, sd = 1, mean=10)
> s = s + seq(0.1,10,0.1)
> x = 1:length(s)
> model.lin = lm(s ~ x)
> model.poly = lm(s ~x  + I(x^2))
> model.exp = lm(log(s) ~ x)
>
> anova(model.lin, model.poly)
> #gives the correct outcome
>
> anova(model.lin, model.exp)
> #doesn't.
>
> This fails because of the transformation of the response variable. Can
> anyone give any advice as to how I should proceed - is there a
> different test to use in this instance, or some way of reversing the
> transform after the model has been fitted?
>
> Any help greatly appreciated!!
>
> Richard
>
> --
> Contendere, Explorare, Invenire, et non Cedere
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mixed model Nested ANOVA

2008-02-22 Thread Rune Haubo
Hi Stephen

On 22/02/2008, Stephen Cole <[EMAIL PROTECTED]> wrote:
> hello R help
>
>  I am trying to analyze a data set that has been collected from a
>  hierarchical sampling design.  The model should be a mixed model nested
>  ANOVA.  The purpose of my study is to analyze the variability at each
>  spatial scale in my design (random factors, variance components), and say
>  something about the variability between regions (fixed factor, contrast of
>  means).  The data is as follows;
>
>  region (fixed)
>  Location (random)
>  Site(random)
>
>  site nested in location nested in region.
>
>  I would like to run this as an ANOVA and then compute variance components.
>
>  My question is: when I use the aov command   mod1 <- aov(density ~
>  region/location/site)
>  is there any way to tell R which factors are random and which are fixed?

This depends on whether your design is balanced or not. If your data
are balanced, you may use Error() in aov() to specify error strata,
but then you may as well work out the variance parameters from the
sums of squares manually. A more general procedure would be to use
either lme or lmer, from which you can get anova tables of the fixed
effects with anova().
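
A minimal sketch of the balanced-data route, using your variable names (the
data frame 'mydata' is hypothetical):

mod1 <- aov(density ~ region + Error(location/site), data = mydata)
summary(mod1)   # separate error strata for location and for site within location
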
>
>  I know I can specify fixed and random factors using lme or lmer, but these
>  methods do not allow me to extract an anova table (or do they?)
>  I know that the data can be analyzed using a nested ANOVA because I have
>  based my design on several papers in the marine biological literature
>  (MEPS).  Thank you for any advice in advance.

If your data are nested, lme usually works rather smoothly. Here you
might want to use something like

anova(mymodel)

to get the anova table for the fixed effects. The variance components
are provided with

summary(mymodel)
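
where mymodel could, for instance, be fitted along these lines (your variable
names; the data frame 'mydata' is hypothetical):

library(nlme)
mymodel <- lme(density ~ region, random = ~ 1 | location/site, data = mydata)
anova(mymodel)     # F-test for the fixed region effect
VarCorr(mymodel)   # variance components for location and site within location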

Best
Rune
>
>  Stephen Cole
>
> [[alternative HTML version deleted]]
>
>  __
>  R-help@r-project.org mailing list
>  https://stat.ethz.ch/mailman/listinfo/r-help
>  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>  and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.