I am aware that bootstrapping produces different CIs on every run. I still believe that there is a difference between the two procedures. My understanding is that setting "w" in the boot() function changes the "importance" of the observations, i.e., how the bootstrap selects them: observation i does not have the same probability of being chosen as observation j when "w" is defined in the boot() function. If you print res_boot, you will notice that with "w" set in the boot() function the output is headed "weighted bootstrap"; without it, the heading is "ordinary nonparametric bootstrap". But maybe I am wrong.
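A minimal sketch of that idea in base R, assuming the weights act as selection probabilities for the resampled indices. It does not call boot() or reproduce its internals; the names n_draws, ord_idx and wtd_idx are made up for the illustration.

```r
# Illustration only: compare uniform resampling of indices with
# resampling where index i is drawn with probability w[i].
set.seed(1111)
n <- 50
w <- runif(n)
w <- w / sum(w)            # normalise so the weights sum to 1

n_draws <- n * 20000       # many draws, so empirical frequencies settle

# Ordinary resampling: every index has probability 1/n
ord_idx <- sample(n, size = n_draws, replace = TRUE)

# Weighted resampling: index i is drawn with probability w[i]
wtd_idx <- sample(n, size = n_draws, replace = TRUE, prob = w)

ord_freq <- tabulate(ord_idx, nbins = n) / n_draws
wtd_freq <- tabulate(wtd_idx, nbins = n) / n_draws

# Largest deviation from the intended selection probabilities
max(abs(ord_freq - 1 / n))
max(abs(wtd_freq - w))

# Selection frequencies under weighted resampling track the weights
cor(wtd_freq, w)
```

If the weights passed to boot() behave this way, the replicates of the two runs are drawn from different resampling distributions, so they would differ by more than Monte Carlo noise alone. Whether that shows up clearly in the intervals for this particular dataset is a separate question.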
On Thu, Nov 20, 2014 at 8:19 PM, David Winsemius <dwinsem...@comcast.net> wrote:

>
> On Nov 20, 2014, at 2:23 AM, i.petzev wrote:
>
> > Hi David,
> >
> > sorry, I was not clear.
>
> Right. You never were clear about what you wanted and your examples was so
> statistically symmetric that it is still hard to see what is needed. The
> examples below show CI's that are arguably equivalent. I can be faulted for
> attempting to provide code that produced a sensible answer to a vague
> question to which I was only guessing at the intent.
>
> >
> > The difference comes from defining or not defining “w” in the boot()
> > function. The results with your function and your approach are thus:
> >
> > set.seed(1111)
> > x <- rnorm(50)
> > y <- rnorm(50)
> > weights <- runif(50)
> > weights <- weights / sum(weights)
> > dataset <- cbind(x,y,weights)
> >
> > vw_m_diff <- function(dataset,w) {
> >   differences <- dataset[w,1]-dataset[w,2]
> >   weights <- dataset[w, "weights"]
> >   return(weighted.mean(x=differences, w=weights))
> > }
> > res_boot <- boot(dataset, statistic=vw_m_diff, R = 1000, w=dataset[,3])
> > boot.ci(res_boot)
> >
> > BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
> > Based on 1000 bootstrap replicates
> >
> > CALL :
> > boot.ci(boot.out = res_boot)
> >
> > Intervals :
> > Level      Normal              Basic
> > 95%   (-0.5657,  0.4962 )   (-0.5713,  0.5062 )
> >
> > Level     Percentile            BCa
> > 95%   (-0.6527,  0.4249 )   (-0.5579,  0.5023 )
> > Calculations and Intervals on Original Scale
> >
> > ************************************************************
> >
> > However, without defining “w” in the bootstrap function, i.e., running
> > an ordinary and not a weighted bootstrap, the results are:
> >
> > res_boot <- boot(dataset, statistic=vw_m_diff, R = 1000)
> > boot.ci(res_boot)
> >
> > BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
> > Based on 1000 bootstrap replicates
> >
> > CALL :
> > boot.ci(boot.out = res_boot)
> >
> > Intervals :
> > Level      Normal              Basic
> > 95%   (-0.6265,  0.4966 )   (-0.6125,  0.5249 )
>
> I hope you are not saying that because those CI's are different that there
> is some meaning in that difference. Bootstrap runs will always be
> "different" than each other unless you use set.seed(.) before the runs.
>
> >
> > Level     Percentile            BCa
> > 95%   (-0.6714,  0.4661 )   (-0.6747,  0.4559 )
> > Calculations and Intervals on Original Scale
> >
> > On 19 Nov 2014, at 17:49, David Winsemius <dwinsem...@comcast.net> wrote:
> >
> >>>> vw_m_diff <- function(dataset,w) {
> >>>>   differences <- dataset[w,1]-dataset[w,2]
> >>>>   weights <- dataset[w, "weights"]
> >>>>   return(weighted.mean(x=differences, w=weights))
> >>>> }
> >
>
> David Winsemius
> Alameda, CA, USA
>

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.