> On 06/06/2010 10:49 PM, Mark Seeto wrote:
>> Hello,
>>
>> I have a couple of questions about the ols function in Frank Harrell's
>> rms package.
>>
>> Is there any way to specify variables by their column number in the
>> data frame rather than by the variable name?
>>
>> For example,
>>
>> library(rms)
>> x1 <- rnorm(100, 0, 1)
>> x2 <- rnorm(100, 0, 1)
>> x3 <- rnorm(100, 0, 1)
>> y <- x2 + x3 + rnorm(100, 0, 5)
>> d <- data.frame(x1, x2, x3, y)
>> rm(x1, x2, x3, y)
>> lm(y ~ d[,2] + d[,3], data = d)   # This works
>> ols(y ~ d[,2] + d[,3], data = d)  # Gives error
>> Error in if (!length(fname) || !any(fname == zname)) { :
>>   missing value where TRUE/FALSE needed
>>
>> However, this works:
>> ols(y ~ x2 + d[,3], data = d)
>>
>> The reason I want to do this is to program variable selection for
>> bootstrap model validation.
>>
>> A related question: does ols allow "y ~ ." notation?
>>
>> lm(y ~ ., data = d[, 2:4])   # This works
>> ols(y ~ ., data = d[, 2:4])  # Gives error
>> Error in terms.formula(formula) : '.' in formula and no 'data' argument
>>
>> Thanks for any help you can give.
>>
>> Regards,
>> Mark
>
> Hi Mark,
>
> It appears that you answered the questions yourself. rms wants real
> variables or transformations of them. It makes certain assumptions
> about the names of terms. The y ~ . should work, though; sometime I'll
> have a look at that.
>
> But these are small questions compared to what you really want. Why do
> you need variable selection, i.e., what is wrong with having
> insignificant variables in a model? If you indeed need variable
> selection, see if backwards stepdown works for you. It is built into
> the rms bootstrap validation and calibration functions.
>
> Frank
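A minimal sketch of the two points above, assuming the toy data frame d
from Mark's example: reformulate() builds a formula from column names so
that ols() sees real variable names, and validate(..., bw = TRUE) runs
the built-in bootstrap backwards stepdown Frank refers to. The B and
rule values are illustrative choices, not from the thread.

library(rms)
set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
d$y <- with(d, x2 + x3 + rnorm(100, 0, 5))

# Build the formula from column names rather than indexing into d,
# since ols() needs actual variable names in its formula
xvars <- names(d)[1:3]                    # "x1" "x2" "x3"
f <- reformulate(xvars, response = "y")   # y ~ x1 + x2 + x3

# x = TRUE, y = TRUE store the design matrix for later resampling
fit <- ols(f, data = d, x = TRUE, y = TRUE)

# Bootstrap validation with backwards stepdown repeated inside each
# bootstrap sample; reports optimism-corrected indexes such as R^2
validate(fit, method = "boot", B = 200, bw = TRUE, rule = "aic")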
Thank you for your reply, Frank. I would have reached the conclusion
that rms only accepts real variables had this not worked:

ols(y ~ x2 + d[,3], data = d)

The reason I want to program variable selection is so that I can use
the bootstrap to check the performance of a model-selection method. My
co-workers and I have used a variable selection method which combines
forward selection, backward elimination, and best subsets (the forward
and backward methods were run using different software). I want to do
bootstrap validation to (1) check the over-optimism in R^2, and
(2) justify using a different approach if R^2 turns out to be very
over-optimistic. The different approach would probably be data
reduction using variable clustering, as you describe in your book.

Regards,
Mark

--
Mark Seeto
Statistician
National Acoustic Laboratories <http://www.nal.gov.au/>
A Division of Australian Hearing
126 Greville Street
Chatswood NSW 2067 Australia
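A minimal sketch of the variable-clustering data reduction Mark
mentions, using varclus() from the Hmisc package (on which rms
depends); the similarity measure here is an illustrative choice, not
one prescribed in the thread.

library(Hmisc)
set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
d$y <- with(d, x2 + x3 + rnorm(100, 0, 5))

# Cluster the candidate predictors by squared Spearman correlation;
# near-redundant variables can then be combined into a single summary
# score before any model is fit, instead of selecting among them
v <- varclus(~ x1 + x2 + x3, data = d, similarity = "spearman")
plot(v)  # dendrogram of the variable clusters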