Re: [R] defining a formula method for a weighted lm()

Michael Friendly Thu, 06 Jan 2011 06:34:38 -0800

No one replied to this, so I'll try again, with a simple example. Icalculate a set of log odds ratios, and turn them into a data frame asfollows:


> library(vcdExtra)
> (lor.CM <- loddsratio(CoalMiners))
log odds ratios for Wheeze and Breathlessness by Age


   25-29    30-34    35-39    40-44    45-49    50-54    55-59    60-64
3.695261 3.398339 3.140658 3.014687 2.782049 2.926395 2.440571 2.637954
>
> (lor.CM.df <- as.data.frame(lor.CM))
  Wheeze Breathlessness   Age      LOR        ASE
1  W:NoW          B:NoB 25-29 3.695261 0.16471778
2  W:NoW          B:NoB 30-34 3.398339 0.07733658
3  W:NoW          B:NoB 35-39 3.140658 0.03341311
4  W:NoW          B:NoB 40-44 3.014687 0.02866111
5  W:NoW          B:NoB 45-49 2.782049 0.01875164
6  W:NoW          B:NoB 50-54 2.926395 0.01585918
7  W:NoW          B:NoB 55-59 2.440571 0.01452057
8  W:NoW          B:NoB 60-64 2.637954 0.02159903

Now I want to fit a linear model by WLS, LOR ~ Age, which can do directly as

> lm(LOR ~ as.numeric(Age), weights=1/ASE, data=lor.CM.df)

Call:
lm(formula = LOR ~ as.numeric(Age), data = lor.CM.df, weights = 1/ASE)

Coefficients:
    (Intercept)  as.numeric(Age)
         3.5850          -0.1376

But, I want to do the fitting in my own function, the simplest version is

my.lm <- function(formula, data, subset, weights) {
        lm(formula, data, subset, weights)
}

But there is obviously some magic about formula objects and evaluationenvironments, because I don't understand why this doesn't work.


> my.lm(LOR ~ as.numeric(Age), weights=1/ASE, data=lor.CM.df)

Error in model.frame.default(formula = formula, data = data, subset =subset, :

  invalid type (closure) for variable '(weights)'
>

A second question: Age is a factor, and as.numeric(Age) gives me 1:8.

What simple expression on lor.CM.df$Age would give me either the lower
limits (here: seq(25, 60, by = 5)) or midpoints of these Age intervals
(here: seq(27, 62, by = 5))?

best,
-Michael


On 12/16/2010 9:43 AM, Michael Friendly wrote:

In the vcdExtra package on R-Forge, I have functions and generic methods
for calculating log odds ratios
for R x C x strata tables. I'd like to define methods for fitting
weighted lm()s to the resulting loddsratio objects,
but I'm having problems figuring out how to do this generally.

# install.packages("vcdExtra", repos="http://R-Forge.R-Project.org";)
library(vcdExtra)

 > fung.lor <- loddsratio(Fungicide)
 > fung.lor
log odds ratios for group and outcome by sex, strain

strain
sex 1 2
M -1.596015 -0.8266786
F -1.386294 -0.6317782
 >
 > fung.lor.df <- as.data.frame(fung.lor)
 > fung.lor.df
group outcome sex strain LOR ASE
1 Control:Treated Tumor:NoTumor M 1 -1.5960149 0.7394909
2 Control:Treated Tumor:NoTumor F 1 -1.3862944 0.9574271
3 Control:Treated Tumor:NoTumor M 2 -0.8266786 0.6587325
4 Control:Treated Tumor:NoTumor F 2 -0.6317782 1.1905545
 >

Now, I want to test whether the odds ratios differ by sex or strain, so
I do a weighted lm()

 > fung.mod <- lm(LOR ~ sex + strain, data=fung.lor.df, weights=1/ASE^2)
 > anova(fung.mod)
Analysis of Variance Table

Response: LOR
Df Sum Sq Mean Sq F value Pr(>F)
sex 1 0.00744 0.00744 112.3 0.05990 .
strain 1 0.84732 0.84732 12788.1 0.00563 **
Residuals 1 0.00007 0.00007
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 >

I tried to write a generic formula method to do this, but I keep running
into errors:

lor <- function(x, ...)
UseMethod("lor")

lor.formula <- function(formula, data, subset, weights,
model = TRUE, x = FALSE, y = FALSE,
contrasts = NULL, ...) {

data <- as.data.frame(data)
if (missing(weights)) {
if (! "ASE" %in% names(data)) stop("data does not contain an ASE column")
data$weights <- 1/data$ASE^2
}
lm(formula, data, subset, weights=weights,
model = model, x = x, y = y,
contrasts = contrasts, ...)
}

 > lor(LOR ~ strain+sex, fung.lor)
Error in xj[i] : invalid subscript type 'closure'
 > lor(LOR ~ strain+sex, fung.lor.df)
Error in xj[i] : invalid subscript type 'closure'
 >
 > traceback()
8: `[.data.frame`(list(LOR = c(-1.59601489210196, -1.38629436111989,
-0.826678573184468, -0.631778234183653), strain = c(1L, 1L, 2L,
2L), sex = c(1L, 2L, 1L, 2L), `(weights)` = c(1.82866556836903,
1.09090909090909, 2.30452674897119, 0.705507123112907)), function (x,
...)
UseMethod("subset"), , FALSE)
7: model.frame.default(formula = formula, data = data, subset = subset,
weights = weights, drop.unused.levels = TRUE)
6: model.frame(formula = formula, data = data, subset = subset,
weights = weights, drop.unused.levels = TRUE)
5: eval(expr, envir, enclos)
4: eval(mf, parent.frame())
3: lm(formula, data, subset, weights = weights, model = model, x = x,
y = y, contrasts = contrasts, ...)
2: lor.formula(LOR ~ strain + sex, fung.lor.df)
1: lor(LOR ~ strain + sex, fung.lor.df)
 >

How can I make this work?



--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] defining a formula method for a weighted lm()

Reply via email to