Dear stats experts:

Me and my little brain must be missing something regarding bootstrapping. I understand how to get a 95% CI and how to do a hypothesis test using bootstrapping (i.e., reject the null or not). However, I'd also like to get a p-value from it. To me this seems simple, but it seems no one does what I would like to do to get a p-value, which suggests I'm not understanding something. Rather, it seems that when people want a p-value using resampling methods, they immediately jump to permutation testing (e.g., destroying dependencies so as to create a null distribution).

So here's my thought on getting a p-value by bootstrapping. Could someone tell me what is wrong with my approach? Thanks:
STEPS TO GETTING P-VALUES FROM BOOTSTRAPPING - PROBABLY WRONG:

1) Sample B times with replacement, computing theta* (your statistic of interest) each time. B is large (> 1000).
2) Get the distribution of theta*.
3) The mean of theta* is generally near your observed theta. In the same way that we use non-centrality parameters in other situations, shift the distribution of theta* so that it is centered on the value corresponding to your null hypothesis (e.g., make the distribution have a mean theta = 0).
4) Two methods for finding 2-tailed p-values (assuming here that your observed theta is above the null value):
   Method 1: find the proportion of recentered theta*'s that are above your observed theta. p-value = 2 * this proportion.
   Method 2: find the proportion of recentered theta*'s whose absolute value is above the absolute value of your observed theta. This is your p-value.

So this seems simple. But I can't find people discussing this, so I'm thinking I'm wrong. Could someone explain where I've gone wrong?

J Jackson
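
For concreteness, the four steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a recommended procedure: it uses Python's standard library rather than R, it takes the sample mean as the statistic theta, the function name `bootstrap_p_values` and its parameters are made up for this sketch, and the data are simulated. It also generalizes slightly beyond the steps as written by handling an observed theta below the null value, counting ties with ">=" rather than a strict "above", and capping Method 1 at 1.

```python
import random
import statistics


def bootstrap_p_values(sample, null_value=0.0, n_boot=2000, seed=0):
    """Recentered-bootstrap p-values for the mean (hypothetical helper).

    Returns (observed theta, Method 1 p-value, Method 2 p-value).
    """
    rng = random.Random(seed)
    n = len(sample)
    theta_obs = statistics.fmean(sample)

    # Steps 1-2: resample with replacement n_boot (B) times and collect theta*.
    theta_star = [
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_boot)
    ]

    # Step 3: shift the theta* values so their mean sits at the null value.
    shift = statistics.fmean(theta_star) - null_value
    recentered = [t - shift for t in theta_star]

    # Step 4, Method 1: double the one-sided tail proportion on the side
    # where the observed theta lies (assumption: capped at 1).
    if theta_obs >= null_value:
        tail = sum(t >= theta_obs for t in recentered) / n_boot
    else:
        tail = sum(t <= theta_obs for t in recentered) / n_boot
    p_method1 = min(1.0, 2.0 * tail)

    # Step 4, Method 2: proportion of recentered theta* at least as far
    # from the null value as the observed theta, in either direction.
    dist_obs = abs(theta_obs - null_value)
    p_method2 = sum(abs(t - null_value) >= dist_obs for t in recentered) / n_boot

    return theta_obs, p_method1, p_method2


if __name__ == "__main__":
    # Simulated data purely for illustration: 30 draws from N(0.5, 1).
    data_rng = random.Random(1)
    data = [data_rng.gauss(0.5, 1.0) for _ in range(30)]
    theta_obs, p1, p2 = bootstrap_p_values(data, null_value=0.0)
    print("observed mean:", theta_obs)
    print("Method 1 p-value:", p1)
    print("Method 2 p-value:", p2)
```

When the recentered distribution is roughly symmetric the two methods should give similar answers; they can diverge when it is skewed, since Method 1 doubles one tail while Method 2 counts both tails separately.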