Dear Professor Lumley,

Thank you so much for your invaluable advice! I will digest it and try the different methods.

Many thanks again!

Faye

> Date: Fri, 5 Nov 2010 08:24:00 +1300
> Subject: Re: [R] How to do bootstrap for the complex sample design?
> From: tlum...@uw.edu
> To: timhesterb...@gmail.com
> CC: feix...@hotmail.com; r-help@r-project.org
>
> On Fri, Nov 5, 2010 at 3:51 AM, Tim Hesterberg <timhesterb...@gmail.com> wrote:
> > Faye wrote:
> >>Our survey is structured as follows: the area to be investigated is
> >>divided into 6 regions; within each region, one urban community and
> >>one rural community are randomly selected; then samples are randomly
> >>drawn from each selected urban and rural community.
> >>
> >>The problem is that in each urban/rural stratum, we only have one sample.
> >>In this case, how do we do the bootstrap?
> >
> > You are lucky that your sample size is 1. If it were 2 you would
> > probably have proceeded without realizing that the answers were wrong.
> >
> > Suppose you had two samples in each stratum. If you proceed naturally,
> > drawing bootstrap samples of size 2 from each stratum, this would
> > underestimate variability by a factor of 2.
> >
> > In general the ordinary nonparametric bootstrap estimates of variability
> > are biased downward by a factor of (n-1)/n -- exactly for the mean,
> > approximately for other statistics. In multiple-sample and stratified
> > situations, the bias depends on the stratum sizes.
> >
> > Three remedies are:
> > * draw bootstrap samples of size n-1
> > * "bootknife" sampling - omit one observation (a jackknife sample), then
> >   draw a bootstrap sample of size n from that
> > * bootstrap from a kernel density estimate, with kernel covariance equal
> >   to empirical covariance (with divisor n-1) / n.
> > The latter two are described in
> > Hesterberg, Tim C. (2004), Unbiasing the Bootstrap-Bootknife Sampling vs.
> > Smoothing, Proceedings of the Section on Statistics and the Environment,
> > American Statistical Association, 2924-2930.
> > http://home.comcast.net/~timhesterberg/articles/JSM04-bootknife.pdf
> >
> > All three are undefined for samples of size 1. You need to go to some
> > other bootstrap, e.g. a parametric bootstrap with variability estimated
> > from other data.
> >
>
> And the 'survey' package supplies the first option. (It also supplies
> a bootstrap sample of size n that allows finite population
> corrections, designed for situations with a large n and a high
> sampling fraction, such as some business surveys.)
>
> With a sample size of 1 per stratum there are no design-unbiased
> estimators of the standard error, so as others have said you need
> external data.
>
> -thomas
>
> --
> Thomas Lumley
> Professor of Biostatistics
> University of Auckland
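
The (n-1)/n bias and the first two remedies quoted above can be checked on a toy example. What follows is only a minimal sketch, not code from the thread or from the 'survey' package: it is written in Python with the standard library so that it is self-contained, the function names and the four data values are made up for illustration, and it handles a single stratum and the sample mean only. Instead of drawing random resamples it enumerates every possible resample, so the bootstrap variance of the mean is computed exactly and can be compared with the usual unbiased estimate s^2/n.

```python
from itertools import product
from statistics import mean, pvariance, variance


def resample_means(data, size):
    """Means of every equally likely with-replacement resample of `size` draws."""
    return [mean(draw) for draw in product(data, repeat=size)]


def ordinary_bootstrap_var(data):
    """Exact variance of the mean under the ordinary bootstrap (resample size n)."""
    return pvariance(resample_means(data, len(data)))


def reduced_size_bootstrap_var(data):
    """Exact variance of the mean when resamples have size n-1 (first remedy)."""
    return pvariance(resample_means(data, len(data) - 1))


def bootknife_var(data):
    """Exact variance of the mean under bootknife sampling (second remedy):
    omit one observation, then resample n values from the remaining n-1."""
    n = len(data)
    means = []
    for omit in range(n):
        kept = data[:omit] + data[omit + 1:]
        # Every jackknife sample yields the same number of resamples,
        # so pooling them gives each omitted index equal weight.
        means.extend(resample_means(kept, n))
    return pvariance(means)


# Hypothetical data for one stratum with n = 4 observations.
sample = [2.0, 3.0, 5.0, 10.0]
n = len(sample)
unbiased = variance(sample) / n  # s^2 / n, using divisor n-1 in s^2

print("s^2/n              :", unbiased)
print("ordinary bootstrap :", ordinary_bootstrap_var(sample))
print("resample size n-1  :", reduced_size_bootstrap_var(sample))
print("bootknife          :", bootknife_var(sample))
print("ordinary / unbiased:", ordinary_bootstrap_var(sample) / unbiased)
```

For the mean, the ordinary bootstrap value should come out at (n-1)/n = 3/4 of s^2/n here, while the size n-1 and bootknife versions should reproduce s^2/n up to floating-point rounding. The enumeration grows as n^n, so this is useful only as a check on very small samples; with n = 1 there is nothing to resample after the reduction, which is the situation described in the thread.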