Re: [R] Bootstrap CIs for weighted means of paired differences

i.petzev Sat, 22 Nov 2014 08:23:40 -0800

OK, thanks for the suggestions; I will look into them. You are absolutely 
right that I should have been clearer about what type of weighting I want. 
To clarify: for each company i, I run two time-series regressions of its 
returns, each on a different set of explanatory variables. I then extract the 
intercepts of the two regressions and take the difference between them. I 
repeat this for every company in the sample and compute the 
market-value-weighted average of those differences.
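
In case it helps, here is a minimal sketch of that procedure on simulated 
data. Everything in it is made up for illustration: the factor names (mkt, 
smb, hml), the number of companies and periods, the two regression 
specifications and the market values are assumptions, not my actual data.

```r
# Sketch on simulated data; all names, sizes and models are assumptions.
set.seed(1)
n_firms <- 20    # hypothetical number of companies
n_obs   <- 120   # hypothetical number of time periods per company

# Hypothetical explanatory variables shared by all companies
factors <- data.frame(mkt = rnorm(n_obs), smb = rnorm(n_obs), hml = rnorm(n_obs))

# Simulated returns: an n_obs x n_firms matrix, one column per company
returns <- replicate(n_firms, 0.01 + 0.9 * factors$mkt + rnorm(n_obs))

# Hypothetical market values used as weights
mkt_value <- runif(n_firms, min = 1, max = 100)

# For each company: fit both regressions and take the difference in intercepts
alpha_diff <- sapply(seq_len(n_firms), function(i) {
  d  <- data.frame(ret = returns[, i], factors)
  a1 <- coef(lm(ret ~ mkt, data = d))[["(Intercept)"]]
  a2 <- coef(lm(ret ~ mkt + smb + hml, data = d))[["(Intercept)"]]
  a1 - a2
})

# Market-value-weighted average of the intercept differences
vw_mean_diff <- weighted.mean(alpha_diff, w = mkt_value)
vw_mean_diff
```

The quantity I want a confidence interval for is vw_mean_diff, with the 
companies (one row per company: difference plus market value) as the units 
to be resampled.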





On 21 Nov 2014, at 19:18, David Winsemius <dwinsem...@comcast.net> wrote:

> 
> On Nov 21, 2014, at 6:52 AM, ivan wrote:
> 
>> I am aware of the fact that bootstrapping produces different CIs with every 
>> run. I still believe that there is a difference between both types of 
>> procedures. My understanding is that setting "w" in the boot() function 
>> influences the "importance" of observations or how the bootstrap selects the 
>> observations. I.e, observation i does not have the same probability of being 
>> chosen as observation j when "w" is defined in the boot() function. If you 
>> return res_boot you will notice that with "w" being set in the boot() 
>> function, the function call states "weighted bootstrap". If not, it states 
>> "ordinary nonparametric bootstrap". But maybe I am wrong. 
> 
> OK. So in the second call w affects the probability of a case being sent 
> to the boot-function as well as being used in the boot-function; while with 
> the "non-weighted call" the w's are only affecting the individual mean 
> estimates. So the second one is different. And as I suggested earlier you 
> never described the goals of the investigation or the meaning of the 
> variables. 
> 
>   I can tell you that when Davison and Hinkley offered examples of using a 
> bootstrap for a weighted bootstrap mean, they compared a stratified analysis 
> with an example where the weighting was only used on the inner function 
> (example 3.2, practical 3.14 pp 72, 131 of their book) with one where the 
> strata parameter was used. But so far I don't think you have ever described 
> what sort of weights these actually are. In that example the weights were the 
> inverse variances of the sample groups. They didn't use a 'weights' parameter 
> in the boot call. I do not know if that parameter was part of the S package 
> that was being used at the time.
> 
> I tried to find an example of a weighted bootstrap in V&R 4e but did not see 
> one. Prof Ripley is the maintainer of the boot package. In the V&R book, 
> Angelo Canty is given the credit for writing the boot package for S. I think 
> you should consult the code, first. And you should also look at the `stype` 
> parameter where "w" is one option.
> 
> -- 
> David.
> 
>> 
>> On Thu, Nov 20, 2014 at 8:19 PM, David Winsemius <dwinsem...@comcast.net> 
>> wrote:
>> 
>> On Nov 20, 2014, at 2:23 AM, i.petzev wrote:
>> 
>>> Hi David,
>>> 
>>> sorry, I was not clear.
>> 
>> Right. You never were clear about what you wanted and your example was so 
>> statistically symmetric that it is still hard to see what is needed. The 
>> examples below show CI's that are arguably equivalent. I can be faulted for 
>> attempting to provide code that produced a sensible answer to a vague 
>> question to which I was only guessing at the intent.
>> 
>> 
>>> The difference comes from defining or not defining "w" in the boot() 
>>> function. The results with your function and your approach are thus:
>>> 
>>> library(boot)
>>> set.seed(1111)
>>> x <- rnorm(50)
>>> y <- rnorm(50)
>>> weights <- runif(50)
>>> weights <- weights / sum(weights)
>>> dataset <- cbind(x,y,weights)
>>> 
>>> vw_m_diff <- function(dataset,w) {
>>>  differences <- dataset[w,1]-dataset[w,2]
>>>  weights <- dataset[w, "weights"]
>>>  return(weighted.mean(x=differences, w=weights))
>>> }
>>> res_boot <- boot(dataset, statistic=vw_m_diff, R = 1000, w=dataset[,3])
>>> boot.ci(res_boot)
>>> 
>>> BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
>>> Based on 1000 bootstrap replicates
>>> 
>>> CALL :
>>> boot.ci(boot.out = res_boot)
>>> 
>>> Intervals :
>>> Level      Normal              Basic
>>> 95%   (-0.5657,  0.4962 )   (-0.5713,  0.5062 )
>>> 
>>> Level     Percentile            BCa
>>> 95%   (-0.6527,  0.4249 )   (-0.5579,  0.5023 )
>>> Calculations and Intervals on Original Scale
>>> 
>>> ********************************************************************************************************************
>>> 
>>> However, without defining "w" in the bootstrap function, i.e., running an 
>>> ordinary and not a weighted bootstrap, the results are:
>>> 
>>> res_boot <- boot(dataset, statistic=vw_m_diff, R = 1000)
>>> boot.ci(res_boot)
>>> 
>>> BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
>>> Based on 1000 bootstrap replicates
>>> 
>>> CALL :
>>> boot.ci(boot.out = res_boot)
>>> 
>>> Intervals :
>>> Level      Normal              Basic
>>> 95%   (-0.6265,  0.4966 )   (-0.6125,  0.5249 )
>> 
>> I hope you are not saying that because those CI's are different that there 
>> is some meaning in that difference. Bootstrap runs will always be 
>> "different" than each other unless you use set.seed(.) before the runs.
>> 
>>> 
>>> Level     Percentile            BCa
>>> 95%   (-0.6714,  0.4661 )   (-0.6747,  0.4559 )
>>> Calculations and Intervals on Original Scale
>>> 
>>> On 19 Nov 2014, at 17:49, David Winsemius <dwinsem...@comcast.net> wrote:
>>> 
>>>>>> vw_m_diff <- function(dataset,w) {
>>>>>>   differences <- dataset[w,1]-dataset[w,2]
>>>>>>   weights <- dataset[w, "weights"]
>>>>>>   return(weighted.mean(x=differences, w=weights))
>>>>>> }
>>> 
>> 
>> David Winsemius
>> Alameda, CA, USA
>> 
>> 
> 
> David Winsemius
> Alameda, CA, USA
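
To make the distinction discussed in the quoted messages concrete, here is a 
rough sketch of three ways the weights can enter. It reuses the simulated 
data from the thread; the seeds, R = 999 and the helper name vw_m_diff_w are 
my own choices. As far as I understand the boot documentation, "w = ..." in 
the boot() call partially matches its "weights" argument, which requests 
importance resampling (rows drawn with unequal probability), whereas 
stype = "w" only changes what the second argument of the statistic means. 
Please treat this as a sketch under those assumptions, not as a statement of 
which variant is appropriate for my problem.

```r
library(boot)  # recommended package distributed with R

set.seed(1111)
dataset <- cbind(x = rnorm(50), y = rnorm(50), weights = runif(50))

# stype = "i" (default): the second argument is a vector of row indices
vw_m_diff <- function(data, idx) {
  d <- data[idx, , drop = FALSE]
  weighted.mean(d[, "x"] - d[, "y"], w = d[, "weights"])
}

# stype = "w": the second argument is a vector of resampling weights
vw_m_diff_w <- function(data, w) {
  diffs <- data[, "x"] - data[, "y"]
  sum(diffs * data[, "weights"] * w) / sum(data[, "weights"] * w)
}

# 1) Ordinary bootstrap: rows drawn with equal probability,
#    weights used only inside the statistic
set.seed(42)
b_ord <- boot(dataset, vw_m_diff, R = 999)

# 2) Ordinary bootstrap again, statistic written in its weights form
set.seed(42)
b_w <- boot(dataset, vw_m_diff_w, R = 999, stype = "w")

# 3) Importance weights passed to boot(): rows drawn with unequal
#    probability, and the weights are also used inside the statistic
p <- dataset[, "weights"] / sum(dataset[, "weights"])
set.seed(42)
b_imp <- boot(dataset, vw_m_diff, R = 999, weights = p)

# Percentile interval for the ordinary bootstrap
ci_ord <- boot.ci(b_ord, type = "perc")
ci_ord
```

Printing b_ord and b_imp shows the different headers mentioned in the thread 
(ordinary nonparametric versus weighted bootstrap), while all three objects 
share the same point estimate t0.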



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
