On 2010-07-08 13:14, Addi Wei wrote:
Hopefully simple question: What is the best way to name, and treat factor columns for data that has lots of columns?

This is my column list:

  id pID50 D.1 D.2 D.3 D.4 D.5 , etc. all the way to D.185

I was under the impression from several R examples in pls that if you name your columns like above, you should be able to simply call all the D factors with "D", instead of going in and putting a plus sign between each column.

  miceD <- plsr(pID50 ~ D, ncomp = 10, data = micetitletest)
  Error in model.frame.default(formula = pID50 ~ D, data = micetitletest) :
    invalid type (closure) for variable 'D'

VS.

  miceD <- plsr(pID50 ~ D.1 + D.2 + D.3 + D.4 etc. to D.185 , ncomp = 10, data = micetitletest)

What am I missing above that's causing that error message in bold? Is there a better strategy for naming my columns in order to make R use easier?
From the help page for plsr():

  "The formula argument should be a symbolic formula of the form response ~ terms, where response is the name of the response vector or matrix (for multi-response models) and terms is the name of one or more predictor _matrices_ (emphasis added), usually separated by +, e.g., water ~ FTIR or y ~ X + Z."

Note the word _matrices_; you may not have set up your data correctly. Compare the 'yarn' dataset

  str(yarn)

with your data

  str(micetitletest)

And, as David says, don't use D for the name of your predictor matrix (although it will probably work).

-Peter Ehlers
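
For illustration, one way the data might be restructured so that the predictors form a single matrix column, as in 'yarn'. This is only a sketch: the small simulated data frame stands in for micetitletest (5 descriptor columns instead of 185), and the names dcols, desc and mice2 are made up for the example. The matrix is called desc rather than D because D is also the name of a function in the stats package, which is why the original call complained about a "closure".

```r
# Hypothetical stand-in for micetitletest: 20 rows, 5 descriptor columns
# (the real data has D.1 ... D.185).
set.seed(1)
micetitletest <- data.frame(
  id    = 1:20,
  pID50 = rnorm(20),
  matrix(rnorm(20 * 5), nrow = 20,
         dimnames = list(NULL, paste0("D.", 1:5)))
)

# Collect every column whose name starts with "D." into one numeric matrix.
dcols <- grep("^D\\.", names(micetitletest), value = TRUE)
desc  <- as.matrix(micetitletest[, dcols])

# Store the matrix as a single data-frame column; I() keeps data.frame()
# from splitting it back into separate columns.
mice2 <- data.frame(id    = micetitletest$id,
                    pID50 = micetitletest$pID50,
                    desc  = I(desc))

str(mice2)  # compare the layout with str(yarn)

# Fit only if the pls package is installed; ncomp = 3 is an arbitrary
# choice for this toy data, not a recommendation.
if (requireNamespace("pls", quietly = TRUE)) {
  fit <- pls::plsr(pID50 ~ desc, ncomp = 3, data = mice2)
  print(summary(fit))
}
```

With the predictors held in one matrix column, the formula needs only the single term desc, so there is no need to type out D.1 + D.2 + ... by hand.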