Duncan, I'm not sure how I missed your message. Sorry. What you describe is what I do when (n-1)p is an integer, so that R computes the sample quantile from a single order statistic; my later posting has that exact binomial expression as a special case. When (n-1)p is not an integer, R uses a weighted average of two order statistics, and in that case I'm left with my standing problem...
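To make the interpolated case concrete, here is a rough sketch of the kind of calculation I mean (not my actual code, and the function name is just for illustration): it assumes R's default type-7 quantile definition and iid Normal(mu, sigma) observations, and it uses plain nested integrate() over the joint density of the two adjacent order statistics instead of the Halton-draw integration I described.

## Sketch only: exceedance probability of R's default (type 7) sample quantile
## for an iid Normal(mu, sigma) sample, via the joint density of the two
## adjacent order statistics.  Nested integrate() is used here instead of
## quasi-Monte Carlo; intended for moderate n.
quantileExceedProb <- function(n, p, threshold, mu = 0, sigma = 1) {
  h <- (n - 1) * p
  j <- floor(h) + 1       # index of the lower order statistic
  g <- h - floor(h)       # interpolation weight on X_(j+1)
  if (g == 0)             # single order statistic: the exact binomial expression
    return(pbinom(j - 1, n, pnorm(threshold, mu, sigma)))
  ## joint density of (X_(j), X_(j+1)) for u <= v, assembled on the log scale
  joint <- function(u, v) {
    lc <- lgamma(n + 1) - lgamma(j) - lgamma(n - j)
    exp(lc + (j - 1) * pnorm(u, mu, sigma, log.p = TRUE) +
          dnorm(u, mu, sigma, log = TRUE) + dnorm(v, mu, sigma, log = TRUE) +
          (n - j - 1) * pnorm(v, mu, sigma, lower.tail = FALSE, log.p = TRUE))
  }
  ## region where (1-g)*X_(j) + g*X_(j+1) > threshold and X_(j) <= X_(j+1):
  ## empty unless v > threshold; then u runs from (threshold - g*v)/(1-g) up to v
  outerIntegrand <- function(v) sapply(v, function(vv)
    integrate(function(u) joint(u, vv),
              lower = (threshold - g * vv) / (1 - g), upper = vv)$value)
  integrate(outerIntegrand, lower = threshold, upper = Inf)$value
}

The g == 0 branch is exactly the binomial special case; for g > 0 this computes P((1-g)*X_(j) + g*X_(j+1) > threshold) directly, so in principle it should agree with brute-force simulation up to the accuracy of the two nested integrate() calls.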
On Mon, Feb 14, 2011 at 2:26 PM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

> On 14/02/2011 9:58 AM, Bentley Coffey wrote:
>
>> I need to calculate the probability that a sample quantile will exceed a
>> threshold, given the size of the iid sample and the parameters describing
>> the distribution of each observation (normal, in my case). I can compute
>> the probability with brute-force simulation: simulate a sample of size N,
>> apply R's quantile() function to it, compare the result to the threshold,
>> replicate this MANY times, and count the number of times the sample
>> quantile exceeded the threshold (dividing by the total number of
>> replications yields the probability of interest). The problem is that the
>> number of replications required to get sufficient precision (3 digits,
>> say) is so HUGE that this takes FOREVER. I have to perform this task so
>> often in my script (searching over the sample size, repeated for several
>> different distribution parameters) that it takes too many hours to run.
>>
>> I've searched for pre-existing code to do this in R and haven't found
>> anything. Perhaps I'm missing something. Is anyone aware of an R function
>> to compute this probability?
>>
>> I've tried writing my own code using the fact that R's quantile() function
>> is a linear combination of two order statistics. Basically, I wrote down
>> the mathematical form of the joint pdf of the two order statistics (a
>> function of the sample size and the distribution parameters), then
>> performed a quasi-Monte Carlo integration (using Halton draws rather than
>> R's pseudo-random draws) over the region where the sample quantile exceeds
>> the threshold. In theory this should work, and it takes about 1000 times
>> fewer clock cycles than the brute-force approach. My problem is that there
>> is a significant discrepancy between the results from brute force and from
>> this more efficient approach. I believe the problem is numerical error,
>> but it could be a programming bug; regardless, I have been unable to
>> locate its source and have spent over 20 hours this weekend trying to
>> identify it. Please, somebody help!
>>
>> So, again, my question: is there code in R for quickly evaluating the CDF
>> of a sample quantile, given the sample size and the parameters governing
>> the distribution of each iid point in the sample?
>
> I think the answer to your question is no, but I think it's the wrong
> question. Suppose Xm:n is the mth sample quantile in a sample of size n,
> and you want to calculate P(Xm:n > x). Let X be a draw from the underlying
> distribution, and suppose P(X > x) = p. Then the event Xm:n > x is the same
> as the event that, out of n independent draws of X, at least n-m+1 are
> bigger than x: a binomial probability. And R can calculate binomial
> probabilities using pbinom().
>
> Duncan Murdoch
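P.S. For completeness, here is Duncan's pbinom() suggestion next to the brute-force check, with made-up example values (n = 21, p = 0.75, threshold 1, standard normal observations) chosen so that (n-1)p is an integer and the sample quantile is the single order statistic X_(16):

n <- 21; m <- 16; x <- 1                   # (n-1)*0.75 = 15, so quantile() returns X_(16)
p.exceed <- pnorm(x, lower.tail = FALSE)   # P(one observation > x)
1 - pbinom(n - m, n, p.exceed)             # P(at least n-m+1 of n observations exceed x)

## slow brute-force version of the same probability, for comparison
set.seed(42)
mean(replicate(1e5, quantile(rnorm(n), 0.75) > x))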