Re: [R] defining a formula method for a weighted lm()

Michael Friendly Fri, 07 Jan 2011 06:41:13 -0800

Thanks, Martin

Now I understand 'standard non-standard evaluation' magic, and the code in
http://developer.r-project.org/model-fitting-functions.txt
explains how this works.

Still, I can't help but think of this as evil-magic, for which someanti-magic would be extremely useful, so that

a simple function like

my.lm <- function(formula, data, subset, weights, ...) {
    lm(formula, data, subset, weights, ...)
}

would work as expected. Oh dear, I think I need some syntactic sugar inmy coffee!


-Michael


On 1/6/2011 2:16 PM, Martin Maechler wrote:

Michael Friendly<frien...@yorku.ca>
     on Thu, 06 Jan 2011 09:33:25 -0500 writes:

     >  No one replied to this, so I'll try again, with a simple example.  I
     >  calculate a set of log odds ratios, and turn them into a data frame as
     >  follows:

     >>  library(vcdExtra)
     >>  (lor.CM<- loddsratio(CoalMiners))
     >  log odds ratios for Wheeze and Breathlessness by Age

     >  25-29    30-34    35-39    40-44    45-49    50-54    55-59    60-64
     >  3.695261 3.398339 3.140658 3.014687 2.782049 2.926395 2.440571 2.637954
     >>
     >>  (lor.CM.df<- as.data.frame(lor.CM))
     >  Wheeze Breathlessness   Age      LOR        ASE
     >  1  W:NoW          B:NoB 25-29 3.695261 0.16471778
     >  2  W:NoW          B:NoB 30-34 3.398339 0.07733658
     >  3  W:NoW          B:NoB 35-39 3.140658 0.03341311
     >  4  W:NoW          B:NoB 40-44 3.014687 0.02866111
     >  5  W:NoW          B:NoB 45-49 2.782049 0.01875164
     >  6  W:NoW          B:NoB 50-54 2.926395 0.01585918
     >  7  W:NoW          B:NoB 55-59 2.440571 0.01452057
     >  8  W:NoW          B:NoB 60-64 2.637954 0.02159903

     >  Now I want to fit a linear model by WLS, LOR ~ Age, which can do 
directly as

     >>  lm(LOR ~ as.numeric(Age), weights=1/ASE, data=lor.CM.df)

     >  Call:
     >  lm(formula = LOR ~ as.numeric(Age), data = lor.CM.df, weights = 1/ASE)

     >  Coefficients:
     >  (Intercept)  as.numeric(Age)
     >  3.5850          -0.1376

     >  But, I want to do the fitting in my own function, the simplest version 
is

     >  my.lm<- function(formula, data, subset, weights) {
     >  lm(formula, data, subset, weights)
     >  }

     >  But there is obviously some magic about formula objects and evaluation
     >  environments, because I don't understand why this doesn't work.


     >>  my.lm(LOR ~ as.numeric(Age), weights=1/ASE, data=lor.CM.df)
     >  Error in model.frame.default(formula = formula, data = data, subset =
     >  subset,  :
     >  invalid type (closure) for variable '(weights)'
     >>

Yes, the "magic" has been called "standard non-standard evaluation"
for a while (since August 2002, to be precise),
and the http://developer.r-project.org/ web page has had two
very relevant links since then, namely those mentioned in the
following two lines there:
----------------------------
# Description of the nonstandard evaluation rules in R 1.5.1 and some 
suggestions. (updated). Also an R function and docn for making model frames 
from multiple formulas.

# Notes on model-fitting functions in R, and especially on how to enable all 
the safety features.
----------------------------

For what you want, I think (but haven't tried) the second link, which is
        http://developer.r-project.org/model-fitting-functions.txt
is still very relevant.

Many many people (package authors) had to use something like
that or just directly taken the lm function as an example..
{{ but then probably failed the more subtle points on how to
    program residuals() , predict() , etc functions which you can
    also learn from model-fitting-functions.txt}}


     >  A second question: Age is a factor, and as.numeric(Age) gives me 1:8.

     >  What simple expression on lor.CM.df$Age would give me either the lower
     >  limits (here: seq(25, 60, by = 5)) or midpoints of these Age intervals
     >  (here: seq(27, 62, by = 5))?

With

   data(CoalMiners, package = "vcd")

here are some variations :

  >  (Astr<- dimnames(CoalMiners)[[3]])
  [1] "25-29" "30-34" "35-39" "40-44" "45-49" "50-54" "55-59" "60-64"
  >  sapply(lapply(strsplit(Astr, "-"), as.numeric), `[[`, 1)
  [1] 25 30 35 40 45 50 55 60
  >  sapply(lapply(strsplit(Astr, "-"), as.numeric), `[[`, 2)
  [1] 29 34 39 44 49 54 59 64
  >  sapply(lapply(strsplit(Astr, "-"), as.numeric), mean)
  [1] 27 32 37 42 47 52 57 62

Or use the 2-row matrix and apply(*, 1)  to that :

  >  sapply(strsplit(Astr, "-"), as.numeric)
       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
  [1,]   25   30   35   40   45   50   55   60
  [2,]   29   34   39   44   49   54   59   64

Regards,
Martin Maechler, ETH Zurich



--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] defining a formula method for a weighted lm()

Reply via email to