Ok, thanks for the suggestions. I will look into that. And you are absolutely right that I should have been clearer about what type of weighting I want. So to clarify: I run time-series regressions of the returns of company i on two different sets of explanatory variables. Then I extract the intercept from each of the two regressions and take the difference between them. I repeat this for every company in the sample and then compute the market-value-weighted average of those differences.
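A minimal sketch of that procedure, with simulated data standing in for the real returns. The factor names (f1, f2, f3), the sample sizes and the market values (mv) are all made up for illustration. Here the market values enter only inside the statistic, and boot() resamples companies with equal probability. This treats the per-company differences as independent draws, which is an assumption and may not hold if returns are correlated across companies.

```r
library(boot)

set.seed(1)
n_firms <- 30   # hypothetical number of companies
n_obs   <- 60   # hypothetical number of time periods

# Simulated explanatory variables (placeholders for the two variable sets)
f1 <- rnorm(n_obs)
f2 <- rnorm(n_obs)
f3 <- rnorm(n_obs)

# One intercept difference per company
alpha_diff <- sapply(seq_len(n_firms), function(i) {
  # Simulated return series for company i
  r_i <- 0.01 + 0.8 * f1 + 0.3 * f2 + rnorm(n_obs, sd = 0.5)
  a1 <- coef(lm(r_i ~ f1))[["(Intercept)"]]            # first set of regressors
  a2 <- coef(lm(r_i ~ f1 + f2 + f3))[["(Intercept)"]]  # second set of regressors
  a1 - a2
})

# Hypothetical market values, one per company
mv <- rlnorm(n_firms)

firms <- data.frame(alpha_diff = alpha_diff, mv = mv)

# Statistic: market-value-weighted mean of the resampled differences.
# idx holds the row indices chosen by boot() (the default stype = "i").
vw_mean <- function(d, idx) {
  weighted.mean(d$alpha_diff[idx], w = d$mv[idx])
}

# Ordinary nonparametric bootstrap over companies
res <- boot(firms, statistic = vw_mean, R = 999)
boot.ci(res, type = "perc")
```

Passing the market values through the weights argument of boot() instead would change the resampling probabilities themselves, which is a different procedure from the one sketched here.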
On 21 Nov 2014, at 19:18, David Winsemius <dwinsem...@comcast.net> wrote:
>
> On Nov 21, 2014, at 6:52 AM, ivan wrote:
>
>> I am aware of the fact that bootstrapping produces different CIs with every
>> run. I still believe that there is a difference between both types of
>> procedures. My understanding is that setting "w" in the boot() function
>> influences the "importance" of observations or how the bootstrap selects the
>> observations. I.e., observation i does not have the same probability of being
>> chosen as observation j when "w" is defined in the boot() function. If you
>> return res_boot you will notice that with "w" being set in the boot()
>> function, the function call states "weighted bootstrap". If not, it states
>> "ordinary nonparametric bootstrap". But maybe I am wrong.
>
> OK. So in the second call w affects the probability of a case being sent
> to the boot-function as well as being used in the boot-function; while with
> the "non-weighted call" the w's are only affecting the individual mean
> estimates. So the second one is different. And as I suggested earlier you
> never described the goals of the investigation or the meaning of the
> variables.
>
> I can tell you that when Davison and Hinkley offered examples of using a
> bootstrap for a weighted bootstrap mean, they compared a stratified analysis
> with an example where the weighting was only used on the inner function
> (example 3.2, practical 3.14 pp 72, 131 of their book) with one where the
> strata parameter was used. But so far I don't think you have ever described
> what sort of weights these actually are. In that example the weights were the
> inverse variances of the sample groups. They didn't use a 'weights' parameter
> in the boot call. I do not know if it was part of the S package that was
> being used at the time.
>
> I tried to find an example of a weighted bootstrap in V&R 4e but did not see
> one. Prof Ripley is the maintainer of the boot package. In the V&R book,
> Angelo Canty is given the credit for writing the boot package for S. I think
> you should consult the code, first. And you should also look at the `stype`
> parameter where "w" is one option.
>
> --
> David.
>
>>
>> On Thu, Nov 20, 2014 at 8:19 PM, David Winsemius <dwinsem...@comcast.net>
>> wrote:
>>
>> On Nov 20, 2014, at 2:23 AM, i.petzev wrote:
>>
>>> Hi David,
>>>
>>> sorry, I was not clear.
>>
>> Right. You never were clear about what you wanted and your example was so
>> statistically symmetric that it is still hard to see what is needed. The
>> examples below show CI's that are arguably equivalent. I can be faulted for
>> attempting to provide code that produced a sensible answer to a vague
>> question to which I was only guessing at the intent.
>>
>>
>>> The difference comes from defining or not defining "w" in the boot()
>>> function. The results with your function and your approach are thus:
>>>
>>> set.seed(1111)
>>> x <- rnorm(50)
>>> y <- rnorm(50)
>>> weights <- runif(50)
>>> weights <- weights / sum(weights)
>>> dataset <- cbind(x,y,weights)
>>>
>>> vw_m_diff <- function(dataset,w) {
>>>   differences <- dataset[w,1]-dataset[w,2]
>>>   weights <- dataset[w, "weights"]
>>>   return(weighted.mean(x=differences, w=weights))
>>> }
>>> res_boot <- boot(dataset, statistic=vw_m_diff, R = 1000, w=dataset[,3])
>>> boot.ci(res_boot)
>>>
>>> BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
>>> Based on 1000 bootstrap replicates
>>>
>>> CALL :
>>> boot.ci(boot.out = res_boot)
>>>
>>> Intervals :
>>> Level      Normal              Basic
>>> 95%   (-0.5657,  0.4962 )   (-0.5713,  0.5062 )
>>>
>>> Level     Percentile            BCa
>>> 95%   (-0.6527,  0.4249 )   (-0.5579,  0.5023 )
>>> Calculations and Intervals on Original Scale
>>>
>>> ********************************************************************************************************************
>>>
>>> However, without defining "w" in the bootstrap function, i.e., running an
>>> ordinary and not a weighted bootstrap, the results are:
>>>
>>> res_boot <- boot(dataset, statistic=vw_m_diff, R = 1000)
>>> boot.ci(res_boot)
>>>
>>> BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
>>> Based on 1000 bootstrap replicates
>>>
>>> CALL :
>>> boot.ci(boot.out = res_boot)
>>>
>>> Intervals :
>>> Level      Normal              Basic
>>> 95%   (-0.6265,  0.4966 )   (-0.6125,  0.5249 )
>>
>> I hope you are not saying that because those CI's are different that there
>> is some meaning in that difference. Bootstrap runs will always be
>> "different" than each other unless you use set.seed(.) before the runs.
>>
>>>
>>> Level     Percentile            BCa
>>> 95%   (-0.6714,  0.4661 )   (-0.6747,  0.4559 )
>>> Calculations and Intervals on Original Scale
>>>
>>> On 19 Nov 2014, at 17:49, David Winsemius <dwinsem...@comcast.net> wrote:
>>>
>>>>>> vw_m_diff <- function(dataset,w) {
>>>>>>   differences <- dataset[w,1]-dataset[w,2]
>>>>>>   weights <- dataset[w, "weights"]
>>>>>>   return(weighted.mean(x=differences, w=weights))
>>>>>> }
>>>
>>
>> David Winsemius
>> Alameda, CA, USA
>>
>>
>
> David Winsemius
> Alameda, CA, USA
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.