When I want to manipulate expressions, including formulae, I first think of things like bquote() and substitute(). E.g.,
  > for(nm in lapply(c("x","z"), as.name)) {
       fmla <- formula( bquote( y ~ .(nm) ))
       print(fama.macbeth(fmla, din=d))
    }
  (Intercept)           x
  -0.02384804  0.18151577
  (Intercept)           z
   0.05562026  0.03174173

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf
> Of ivo welch
> Sent: Monday, August 19, 2013 1:05 PM
> To: David Winsemius; r-help
> Subject: Re: [R] model syntax processed --- probably common
>
> thank you.  but uggh...sorry for my html post.  and sorry again for
> having been obscure in my attempt to be brief.  here is a working
> program.
>
> fama.macbeth <- function( formula, din ) {
>   fnames <- terms( formula )
>   dnames <- names( din )
>   stopifnot( all(dimnames(attr(fnames, "factors"))[[1]] %in% dnames) )
>
>   monthly.regressions <- by( din, as.factor(din$month), function(dd)
>       coef(lm(model.frame( formula, data=dd ))))
>   as.m <- do.call("rbind", monthly.regressions)
>   colMeans(as.m)
> }
>
> ## a test data set
> d <- data.frame( month=rep(1:5,10), y= rnorm(50), x= rnorm(50), z=rnorm(50) )
>
> ## this works beautifully, exactly how I want it.  the names are
> ## there, the formula works.
> print(fama.macbeth( y ~ x , din=d ))
>
> ## now I want something like the following statement to work, too
> for (nm in c("x")) print(fama.macbeth( y ~ nm, din=d ))
> or
> for (nm in c("x")) print(fama.macbeth( y ~ d[[nm]], din=d ))
> or whatever.
>
> the output in both cases should be the same, preferably even knowing
> that the name of the variable is really "x" and not nm.  is there a
> standard common way to do this?
>
> regards,
>
> /iaw
>
> ----
> Ivo Welch (ivo.we...@gmail.com)
> http://www.ivo-welch.info/
> J. Fred Weston Professor of Finance
> Anderson School at UCLA, C519
> Director, UCLA Anderson Fink Center for Finance and Investments
> Free Finance Textbook, http://book.ivo-welch.info/
> Editor, Critical Finance Review, http://www.critical-finance-review.org/
>
>
>
> On Mon, Aug 19, 2013 at 12:48 PM, David Winsemius
> <dwinsem...@comcast.net> wrote:
> >
> > On Aug 19, 2013, at 9:45 AM, ivo welch wrote:
> >
> >> dear R experts---I was programming a fama-macbeth panel regression (a
> >> fama-macbeth regression is essentially T cross-sectional regressions, with
> >> statistics then obtained from the time-series of coefficients), partly
> >> because I wanted faster speed than plm, partly because I wanted some
> >> additional features.
> >>
> >> my function starts as
> >>
> >> fama.macbeth <- function( formula, din ) {
> >>   names <- terms( formula )
> >>   ## omitted : I want an immediate check that the formula refers to
> >>   ## existing variables in the data frame with English error messages
> >>
> >
> > Look at the structure of a terms result from a formula argument with str():
> >
> > fama.macbeth <- function( formula, din ) {
> >     fnames <- terms( formula ) ; str(fnames)
> > }
> >
> > > fama.macbeth( x ~ y, data.frame(x=rnorm(10), y=rnorm(10) ) )
> > Classes 'terms', 'formula' length 3 x ~ y
> >   ..- attr(*, "variables")= language list(x, y)
> >   ..- attr(*, "factors")= int [1:2, 1] 0 1
> >   .. ..- attr(*, "dimnames")=List of 2
> >   .. .. ..$ : chr [1:2] "x" "y"
> >   .. .. ..$ : chr "y"
> >   ..- attr(*, "term.labels")= chr "y"
> >   ..- attr(*, "order")= int 1
> >   ..- attr(*, "intercept")= int 1
> >   ..- attr(*, "response")= int 1
> >   ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
> >
> > Then extract the dimnames from the "factors" attribute to compare to the
> > names in the data-object:
> >
> > > fama.macbeth <- function( formula, din ) {
> >     fnames <- terms( formula ) ; dnames <- names( din)
> >     dimnames(attr(fnames, "factors"))[[1]] %in% dnames
> >   }
> > #[1] TRUE TRUE
> >
> >
> > I couldn't tell if this was the main thrust of your question. It seems to
> > meander a bit.
> >
> > --
> > David.
> >
> >>   monthly.regressions <- by( din, as.factor(din$month), function(dd)
> >>       coef(lm(model.frame( formula, data=dd ))))
> >>   as.m <- do.call("rbind", monthly.regressions)
> >>   colMeans(as.m)  ## or something like this.
> >> }
> >> say my data frame mydata has columns named month, r, laggedx and ... .  I
> >> can call this function
> >>
> >> fama.macbeth( r ~ laggedx, din=mydata )
> >>
> >> but it fails
> >
> > What fails?
> >
> >
> >> if I want to compute my x variables.  for example,
> >>
> >> myx <- d[,"laggedx"]
> >> fama.macbeth( r ~ myx)
> >>
> >> I also wish that the computed myx still remembered that it was really
> >> laggedx.  it's almost as if I should not create a vector myx but a data
> >> frame myx to avoid losing the column name.
> >
> > I wouldn't say "almost"... rather that is exactly what you should do. R
> > regression methods almost always work better when formulas are interpreted
> > in the environment of the data argument.
> >
> >> I wonder why such vectors don't
> >> keep a name attribute of some sort.
> >>
> >> there is probably an "R way" of doing this.  is there?
> >>
> >> /iaw
> >>
> >> ----
> >> Ivo Welch (ivo.we...@gmail.com)
> >>
> >>       [[alternative HTML version deleted]]
> >
> > Still posting HTML?
> >
> >>
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> > And do explain what the goal is.
> >
> > --
> >
> > David Winsemius
> > Alameda, CA, USA
> >
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
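
A note on the substitute() alternative mentioned at the top of this message, which the bquote() loop does not show. The following is only a minimal sketch, not code from the thread: it assumes a data frame d shaped like the test data above (regenerated here with an arbitrary seed) and a fixed response named y. It also uses stats::reformulate(), which was not mentioned in the thread but builds a formula directly from character strings.

```r
## Sketch only: three ways to build the formula  y ~ <column>  from a string.
## The data frame d mirrors the test data in the thread; the seed is arbitrary.
set.seed(1)
d <- data.frame(month = rep(1:5, 10), y = rnorm(50), x = rnorm(50), z = rnorm(50))

for (nm in c("x", "z")) {
  ## bquote(): .( ) splices the symbol into the quoted expression
  f1 <- formula(bquote(y ~ .(as.name(nm))))
  ## substitute(): the placeholder v is replaced by the symbol
  f2 <- formula(substitute(y ~ v, list(v = as.name(nm))))
  ## reformulate(): works from the character string itself
  f3 <- reformulate(nm, response = "y")

  ## All three deparse to the same formula, so the coefficient names
  ## carry the real column name rather than "nm".
  stopifnot(identical(deparse(f1), deparse(f2)),
            identical(deparse(f1), deparse(f3)))
  print(names(coef(lm(f1, data = d))))
}
```

Any of f1, f2 or f3 could be passed as the formula argument of fama.macbeth() in place of fmla in the loop at the top.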