Thanks Roger. Your comments were very helpful. Unfortunately, each of the 'groups' in this example are derived from the same set of data, two of which were subsets-- so it is not that unlikely that the weighted medians were the same in some cases.
This all leads back to an operation attempting to answer the question: Of the 2 subsetting methods, which one produces a distribution most like the complete data set? Since the distributions are not normal, and there are area-weights involved others on the list suggested quantile-regression. For a more complete picture of how 'different' the distributions are, I have tried looking at the differences between weighted quantiles: (0.1, 0.25, 0.5, 0.75, 0.9) as a more complete 'description' of each distribution. I imagine that there may be a better way to perform this comparison... Cheers, Dylan On Tuesday 30 June 2009, roger koenker wrote: > Admittedly this seemed quite peculiar.... but if you look at the > entrails > of the following code you will see that with the weights the first and > second levels of your x$method variable have the same (weighted) median > so the contrast that you are estimating SHOULD be zero. Perhaps > there is something fishy about the data construction that would have > allowed us to anticipate this? Regarding the "fn" option, and the > non-uniqueness warning, this is covered in the (admittedly obscure) > faq on quantile regression available at: > > http://www.econ.uiuc.edu/~roger/research/rq/FAQ > > # example: > library(quantreg) > > # load data > x <- read.csv(url('http://169.237.35.250/~dylan/temp/test.csv')) > > # with weights > summary(rq(sand ~ method, data=x, weights=area_fraction, tau=0.5), > se='ker') > > #Reproduction with more convenient notation: > > X0 <- model.matrix(~method, data = x) > y <- x$sand > w <- x$area_fraction > f0 <- summary(rq(y ~ X0 - 1, weights = w),se = "ker") > > #Second reproduction with orthogonal design: > > X1 <- model.matrix(~method - 1, data = x) > f1 <- summary(rq(y ~ X1 - 1, weights = w),se = "ker") > > #Comparing f0 and f1 we see that they are consistent!! How can that > be?? > #Since the columns of X1 are orthogonal estimation of the 3 parameters > are separable > #so we can check to see whether the univariate weighted medians are > reproducible. > > s1 <- X1[,1] == 1 > s2 <- X1[,2] == 1 > g1 <- rq(y[s1] ~ X1[s1,1] - 1, weights = w[s1]) > g2 <- rq(y[s2] ~ X1[s2,2] - 1, weights = w[s2]) > > #Now looking at the g1 and g2 objects we see that they are equal and > agree with f1. > > > url: www.econ.uiuc.edu/~roger Roger Koenker > email rkoen...@uiuc.edu Department of Economics > vox: 217-333-4558 University of Illinois > fax: 217-244-6678 Urbana, IL 61801 > > On Jun 30, 2009, at 3:54 PM, Dylan Beaudette wrote: > > Hi, > > > > I am trying to use quantile regression to perform weighted- > > comparisons of the > > median across groups. This works most of the time, however I am > > seeing some > > odd output in summary(rq()): > > > > Call: rq(formula = sand ~ method, tau = 0.5, data = x, weights = > > area_fraction) > > Coefficients: > > Value Std. Error t value Pr(>|t|) > > (Intercept) 45.44262 3.64706 12.46007 0.00000 > > methodmukey-HRU 0.00000 4.67115 0.00000 1.00000 > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > > When I do not include the weights, I get something a little closer > > to a > > weighted comparison of means, along with an error message: > > > > Call: rq(formula = sand ~ method, tau = 0.5, data = x) > > Coefficients: > > Value Std. Error t value Pr(>|t|) > > (Intercept) 44.91579 2.46341 18.23318 0.00000 > > methodmukey-HRU 9.57601 9.29348 1.03040 0.30380 > > Warning message: > > In rq.fit.br(x, y, tau = tau, ...) : Solution may be nonunique > > > > > > I have noticed that the error message goes away when specifying > > method='fn' to > > rq(). An example is below. Could this have something to do with > > replication > > in the data? > > > > > > # example: > > library(quantreg) > > > > # load data > > x <- read.csv(url('http://169.237.35.250/~dylan/temp/test.csv')) > > > > # with weights > > summary(rq(sand ~ method, data=x, weights=area_fraction, tau=0.5), > > se='ker') > > > > # without weights > > # note error message > > summary(rq(sand ~ method, data=x, tau=0.5), se='ker') > > > > # without weights, no error message > > summary(rq(sand ~ method, data=x, tau=0.5, method='fn'), se='ker') > > ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.