On Fri, 2010-08-20 at 06:31 +0800, elaine kuo wrote: <snip /> > > > Oh, actually, I suppose you could automate this, so it will > return all > models with single variable: > > dd <- dredge(lm1) > parms <- !is.na(dd[, -c(1, (ncol(dd) - c(0:7)))]) > want <- which(rowSums(parms) == 1) > dd[want, ] > > Having said all this, I don't think this is a good way to do > model > selection. > > => > > > Thanks for the time-efficient formula. > Actually it is over my current ability to automate it, but it did save > a lot of time already.
You don't need to automate it, I did it for you. The four lines above show you only the lines from dd (your dredge result) that contain only one covariate. That was what you asked for, wasn't it? > Moreover, please kindly explain the command beginning with parms. > data (Cement) was checked but could not ensure why the row and the > column were set to be 1 and (ncol(dd) - c(0:7)). OK; for this, you need to first get that dredge returns a data frame with some additional attributes. (It actually inherits from class "data.frame" so whilst it isn't exactly a data frame it works like one.) So we can subset it as we would any other R data frame. If you look at the object returned from dredge, you see the first column refers to the intercept, the next N columns are for your covariates in the full model you fitted, with the remaining columns representing the 'rank' statistic (AIC, BIC etc) plus some other measures. > head(dd) Global model: lm(formula = y ~ ., data = Cement) --- Model selection table (Int) X X1 X2 X3 X4 k R.sq Adj.R.sq RSS AIC AICc 11 52.58 1.468 0.6623 4 0.9787 0.9744 57.90 64.31 69.31 24 71.65 1.452 0.4161 -0.2365 5 0.9823 0.9764 47.97 63.87 72.44 23 48.19 1.696 0.6569 0.25 5 0.9823 0.9764 48.11 63.90 72.48 13 103.10 1.440 -0.6140 4 0.9725 0.9670 74.76 67.63 72.63 25 111.70 1.052 -0.41 -0.6428 5 0.9813 0.9750 50.84 64.62 73.19 17 52.53 0.1551 1.467 0.6410 5 0.9798 0.9731 54.86 65.61 74.18 delta weight 11 0.000 0.544 24 3.125 0.114 23 3.163 0.112 13 3.322 0.103 25 3.879 0.078 17 4.869 0.048 So we select out only the bits we want: dd[, -c(1, (ncol(dd) - c(0:7)))] This bit is selecting columns by dropping the first and the last eight columns. ncol(dd) gives the number of columns, ncol(dd) - 0 is the last column, ncol(dd) - 1 is the next to the last column and so on. There are eight of these columns we don't want. If you just run > c(1, (ncol(dd) - c(0:7))) [1] 1 14 13 12 11 10 9 8 7 you'll see that we have the column integers we want to drop, and then we negate it to get the subsetting to drop rather than retain these columns: -c(1, (ncol(dd) - c(0:7))) Having dropped the superfluous columns we aren't interested in, we are left with an 'array' with one column per covariate (predictor variable). If a given covariate is not in the model, it gets an NA in its column for that model. The is.na() function returns an object of the same dimension as dd[, -c(1, (ncol(dd) - c(0:7)))] but which indicates if variable is NA or not: > is.na(dd[, -c(1, (ncol(dd) - c(0:7)))]) X X1 X2 X3 X4 11 TRUE FALSE FALSE TRUE TRUE 24 TRUE FALSE FALSE TRUE FALSE 23 TRUE FALSE FALSE FALSE TRUE 13 TRUE FALSE TRUE TRUE FALSE 25 TRUE FALSE TRUE FALSE FALSE 17 FALSE FALSE FALSE TRUE TRUE 19 FALSE FALSE TRUE TRUE FALSE .... The ! negates this turning TRUE into FALSE and vice versa. In computing and R, TRUE == 1, FALSE == 0, so when we do rowSums on this, we get information regarding how many covariates are in each model: > rowSums(parms) 11 24 23 13 25 17 19 26 31 27 28 29 30 16 22 32 14 20 6 4 7 15 10 9 8 3 2 3 3 2 3 3 3 3 4 4 4 4 4 2 3 5 2 3 1 1 2 2 2 2 2 1 12 18 21 5 2 1 2 3 3 1 1 0 You asked for only those models containing a single covariate, so they are the models with rowSum == 1. The which() function tells us which models (rows of dd) these are and we save this indicator in 'want' > which(rowSums(parms) == 1) 6 4 3 5 2 19 20 26 30 31 The final step is to select out rows 19, 20, 26, 30 and 31 (want) from dd as these are the models you said you were interested in. > dd[want, ] Global model: lm(formula = y ~ ., data = Cement) --- Model selection table (Int) X X1 X2 X3 X4 k R.sq Adj.R.sq RSS AIC 6 117.60 -0.7382 3 0.6745 0.6450 883.9 97.74 4 57.42 0.7891 3 0.6663 0.6359 906.3 98.07 3 81.48 1.869 3 0.5339 0.4916 1266.0 102.40 5 110.20 -1.256 3 0.2859 0.2210 1939.0 108.00 2 82.31 1.874 3 0.2353 0.1657 2077.0 108.80 AICc delta weight 6 100.4 31.10 0.511 4 100.7 31.42 0.434 3 105.1 35.77 0.050 5 110.6 41.31 0.003 2 111.5 42.20 0.002 HTH G > Elaine -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.