I am running permutation regressions in package lmPerm using lmp(). I am getting what I find to be confusing results and I would like help understanding what is going on. To illustrate my problem, I created a simple example and am running lmp() such that the results of the lmp() models should be identical to that of lm(). I'm referring to the notes section of the lmp() documentation where it says that the "function will behave identically to lm() if the following parameters are set: perm="", seqs=TRUE, center=FALSE."
Here is an example wherein I am unable to match my lmp() results to my lm() results. library(lmPerm) library(lattice) x1 <- c(rnorm(60, 150, 50),rnorm(60, 150, 50),rnorm(60, 150, 50)) y1 <- c(30-0.1*x1[1:60], rep(10, 60), 0.1*x1[121:180]) factor.levels1 <- c(rep("DOWN", 60), rep("FLAT", 60), rep("UP", 60)) xyplot(y1 ~ x1, groups = factor.levels1, auto.key = TRUE) lmp.model.1 <- lmp(y1 ~ x1*factor.levels1 - 1, perm = "", seqs = TRUE, center = FALSE) summary(lmp.model.1) lm.model.1 <- lm(y1 ~ x1*factor.levels1 - 1) summary(lm.model.1) Here are the results: > summary(lmp.model.1) Call: lmp(formula = y1 ~ x1 * factor.levels1 - 1, perm = "", seqs = TRUE, center = FALSE) Residuals: Min 1Q Median 3Q Max -1.509e-13 -1.700e-16 4.277e-17 9.558e-16 1.621e-14 Coefficients: Estimate Std. Error t value Pr(>|t|) factor.levels1DOWN 3.000e+01 7.359e-15 4.077e+15 <2e-16 *** factor.levels1FLAT 1.000e+01 4.952e-15 2.019e+15 <2e-16 *** factor.levels1UP -5.809e-16 5.095e-15 -1.140e-01 0.9094 x1 4.096e-17 2.137e-17 1.917e+00 0.0569 . x1:factor.levels11 -1.000e-01 3.391e-17 -2.949e+15 <2e-16 *** x1:factor.levels12 -4.500e-17 2.792e-17 -1.612e+00 0.1089 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 1.226e-14 on 174 degrees of freedom Multiple R-Squared: 1, Adjusted R-squared: 1 F-statistic: 3.721e+31 on 6 and 174 DF, p-value: < 2.2e-16 > summary(lm.model.1) Call: lm(formula = y1 ~ x1 * factor.levels1 - 1) Residuals: Min 1Q Median 3Q Max -3.141e-14 -3.190e-15 -9.880e-16 8.920e-16 1.905e-13 Coefficients: Estimate Std. Error t value Pr(>|t|) x1 -1.000e-01 5.638e-17 -1.774e+15 <2e-16 *** factor.levels1DOWN 3.000e+01 9.099e-15 3.297e+15 <2e-16 *** factor.levels1FLAT 1.000e+01 6.123e-15 1.633e+15 <2e-16 *** factor.levels1UP -3.931e-15 6.300e-15 -6.240e-01 0.533 x1:factor.levels1FLAT 1.000e-01 6.826e-17 1.465e+15 <2e-16 *** x1:factor.levels1UP 2.000e-01 6.931e-17 2.886e+15 <2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 1.515e-14 on 174 degrees of freedom Multiple R-squared: 1, Adjusted R-squared: 1 I thought that the results of summary(lmp.model.1) would be the same as summary(lm.model.1). However, I am concerned that I am interpreting the results incorrectly because I can't get the results to match. Specifically, I simulated data with a slope for UP of 0.1, the slope for FLAT of 0, and the slope for DOWN of -0.1. I can recover these values in lm.model.1, but not lmp.model.1. In the output for the lmp.model.1, I am estimating the slope for DOWN to be approximately -0.1 (4.096e-17-1.000e-01) and the slope of the FLAT to be approximately 0 (4.096e-17-4.500e-17); however, the slope of UP (what I think is equal to the reference level x1) is 4.096e-17. Am I interpreting the x1 term incorrectly? Why are the lmp() results not identical to the lm() results? I ran a similar example using a modification of the above data wherein factor level A is equal to FLAT, factor level B is equal to DOWN, and factor level C is equal to UP. Again, I was unable to match the results from lm() and lmp(). x2 <- c(rnorm(60, 150, 50), rnorm(60, 150, 50),rnorm(60, 150, 50)) y2 <- c(rep(10, 60), 30-0.1*x2[61:120], 0.1*x2[121:180]) factor.levels2 <- c(rep("A", 60), rep("B", 60), rep("C", 60)) xyplot(y2 ~ x2, groups = factor.levels2, auto.key = TRUE) lmp.model.2 <- lmp(y2 ~ x2*factor.levels2 - 1, perm = "", seqs = TRUE, center = FALSE) summary(lmp.model.2) lm.model.2 <- lm(y2 ~ x2*factor.levels2 - 1) summary(lm.model.2) Here are the results: > summary(lmp.model.2) Call: lmp(formula = y2 ~ x2 * factor.levels2 - 1, perm = "", seqs = TRUE, center = FALSE) Residuals: Min 1Q Median 3Q Max -1.284e-13 -6.772e-16 1.439e-16 1.581e-15 4.323e-14 Coefficients: Estimate Std. Error t value Pr(>|t|) factor.levels2A 1.000e+01 5.545e-15 1.803e+15 < 2e-16 *** factor.levels2B 3.000e+01 4.707e-15 6.373e+15 < 2e-16 *** factor.levels2C 1.556e-15 4.994e-15 3.120e-01 0.755688 x2 6.840e-17 1.860e-17 3.677e+00 0.000314 *** x2:factor.levels21 1.030e-16 2.734e-17 3.767e+00 0.000226 *** x2:factor.levels22 -1.000e-01 2.550e-17 -3.921e+15 < 2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 1.131e-14 on 174 degrees of freedom Multiple R-Squared: 1, Adjusted R-squared: 1 F-statistic: 4.719e+31 on 6 and 174 DF, p-value: < 2.2e-16 I would expect the reference level slope (term x2 in lmp.model.2, which I believe is the slope for factor level C) to be 0.1. However, it is 6.840e-17. Am I interpreting the reference levels for the lmp() models incorrectly? Perhaps I am specifying the models incorrectly. Any help would be very much appreciated. My session info is as follows: R version 3.0.1 (2013-05-16) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] lattice_0.20-15 lmPerm_1.1-2 loaded via a namespace (and not attached): [1] grid_3.0.1 tools_3.0.1 Thanks, Ann Marie Ann Marie Reinhold | Doctoral Candidate Montana Cooperative Fishery Research Unit Department of Ecology | Montana State University Box 173460 | Bozeman, MT 59717 Email: reinhold [AT] montana [DOT] edu | Office: (406) 994-6643 ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.