On Aug 19, 2013, at 9:45 AM, ivo welch wrote:

> dear R experts---I was programming a fama-macbeth panel regression (a
> fama-macbeth regression is essentially T cross-sectional regressions, with
> statistics then obtained from the time-series of coefficients), partly
> because I wanted faster speed than plm, partly because I wanted some
> additional features.
>
> my function starts as
>
> fama.macbeth <- function( formula, din ) {
>   names <- terms( formula )
>   ## omitted : I want an immediate check that the formula refers to
>   ## existing variables in the data frame with English error messages
>

Look at the structure of a terms result from a formula argument with str():

fama.macbeth <- function( formula, din ) {
  fnames <- terms( formula ) ; str(fnames)
}

> fama.macbeth( x ~ y, data.frame(x=rnorm(10), y=rnorm(10) ) )
Classes 'terms', 'formula' length 3 x ~ y
  ..- attr(*, "variables")= language list(x, y)
  ..- attr(*, "factors")= int [1:2, 1] 0 1
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:2] "x" "y"
  .. .. ..$ : chr "y"
  ..- attr(*, "term.labels")= chr "y"
  ..- attr(*, "order")= int 1
  ..- attr(*, "intercept")= int 1
  ..- attr(*, "response")= int 1
  ..- attr(*, ".Environment")=<environment: R_GlobalEnv>

Then extract the dimnames from the "factors" attribute to compare to the names in the data object:

fama.macbeth <- function( formula, din ) {
  fnames <- terms( formula )
  dnames <- names( din )
  dimnames(attr(fnames, "factors"))[[1]] %in% dnames
}

fama.macbeth( x ~ y, data.frame(x=rnorm(10), y=rnorm(10) ) )
#[1] TRUE TRUE

I couldn't tell if this was the main thrust of your question. It seems to meander a bit.

-- 
David.

>   monthly.regressions <- by( din, as.factor(din$month), function(dd)
>       coef(lm(model.frame( formula, data=dd ))) )
>   as.m <- do.call("rbind", monthly.regressions)
>   colMeans(as.m)  ## or something like this.
> }
>
> say my data frame mydata has columns named month, r, laggedx and ... .  I
> can call this function
>
>   fama.macbeth( r ~ laggedx, din=mydata )
>
> but it fails

What fails?

> if I want to compute my x variables.  for example,
>
>   myx <- d[,"laggedx"]
>   fama.macbeth( r ~ myx)
>
> I also wish that the computed myx still remembered that it was really
> laggedx.  it's almost as if I should not create a vector myx but a data
> frame myx to avoid losing the column name.

I wouldn't say "almost"... rather that is exactly what you should do. R regression methods almost always work better when formulas are interpreted in the environment of the data argument.

> I wonder why such vectors don't
> keep a name attribute of some sort.
>
> there is probably an "R way" of doing this.  is there?
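
For what it is worth, here is a minimal sketch of one way to combine the two points above: check the formula against the data frame up front, and keep any computed regressor as a named column of that data frame. It is an illustration under assumptions, not a tested replacement for your function. The names fama.macbeth2, period and absent, the error wording, and the simulated mydata are all made up for the example. It uses all.vars() rather than the dimnames of the "factors" attribute, because the latter holds term labels, so a term such as log(x) would be compared as the string "log(x)" and reported as not found. A formula using "." is not handled.

```r
# Sketch only: fama.macbeth2, its arguments and its error text are hypothetical.
fama.macbeth2 <- function(formula, din, period = "month") {
  # all.vars() returns the bare variable names, even inside calls like log(x)
  needed <- c(all.vars(formula), period)
  absent <- setdiff(needed, names(din))
  if (length(absent) > 0)
    stop("not found in the data frame: ", paste(absent, collapse = ", "))

  # one cross-sectional regression per period, each giving a coefficient vector
  per.period <- lapply(split(din, din[[period]]),
                       function(dd) coef(lm(formula, data = dd)))
  as.m <- do.call("rbind", per.period)

  # time-series mean of the coefficients, with standard errors
  # computed from their variation across periods
  est <- colMeans(as.m)
  se  <- apply(as.m, 2, sd) / sqrt(nrow(as.m))
  cbind(estimate = est, se = se, t = est / se)
}

# simulated data, for illustration only
set.seed(1)
mydata <- data.frame(month = rep(1:12, each = 20),
                     r = rnorm(240), laggedx = rnorm(240))

# keep a computed regressor as a named column rather than a loose vector
mydata$myx <- mydata$laggedx^2

res <- fama.macbeth2(r ~ laggedx + myx, din = mydata)
res
```

Because myx lives in mydata, it keeps its name in the coefficient table and is split by month along with everything else. A call such as fama.macbeth2(r ~ nosuch, din = mydata) stops immediately with a message naming the missing variable.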
>
> /iaw
>
> ----
> Ivo Welch (ivo.we...@gmail.com)
>
>       [[alternative HTML version deleted]]

Still posting HTML?

>
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

And do explain what the goal is.

-- 
David Winsemius
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help