I will point out again that sampling without replacement from a five-fold replicate of 1:20 is not the same as resampling with replacement, although I made an error in reporting the probabilities: P(x2 = 1 | x1 = 1) = 4/99, not 4/100. When sampling with replacement, P(x2 = 1 | x1 = 1) = P(x2 = 1 | x1 != 1) = 1/20.

--------------------------------------
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
"Is the room still a room when it's empty? Does the room, the thing itself have purpose? Or do we, what's the word... imbue it."
     - Jubal Early, Firefly
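
A minimal sketch in base R of the distinction above. The names pool, draw_capped and n_sims are illustrative only, and the seed is arbitrary. It builds the five-fold replicate of 1:20, draws 20 items from it without replacement, and sets the second-draw probabilities beside the 1/20 of plain resampling with replacement:

```r
# Sketch only: pool, draw_capped and n_sims are illustrative names.
set.seed(1)                          # arbitrary seed, for reproducibility

pool <- rep(1:20, each = 5)          # five copies of each value: 100 items

# One resample: shuffle the pool and keep 20 items,
# i.e. draw 20 items without replacement.
draw_capped <- function() sample(pool, 20)

x <- draw_capped()
max(table(x)) <= 5                   # TRUE by construction: only five copies exist

# Exact conditional probabilities for the second draw
p_pool_after_1     <- 4 / 99         # P(x2 = 1 | x1 = 1): four 1s left among 99
p_pool_after_other <- 5 / 99         # P(x2 = 1 | x1 != 1): five 1s left among 99
p_with_replacement <- 1 / 20         # sample(1:20, 20, replace = TRUE)

# Monte Carlo check of the first figure
n_sims <- 50000
draws  <- replicate(n_sims, sample(pool, 2))   # 2 x n_sims matrix
first_is_1 <- draws[1, ] == 1
mean(draws[2, first_is_1] == 1)      # should be close to 4/99
```

Drawing from the replicated pool guarantees the cap of 5, but successive draws are dependent (4/99 versus 5/99). Draws from sample(1:20, 20, replace = TRUE) are independent at 1/20 each and do not enforce the cap.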

r-help-boun...@r-project.org wrote on 03/02/2011 01:05:01 PM:

> Re: [R] bootstrap resampling - simplified
>
> Vokey, John
> to: r-help
> 03/02/2011 01:07 PM
> Sent by: r-help-boun...@r-project.org
>
> On 2011-03-02, at 4:00 AM, r-help-requ...@r-project.org wrote:
>
> > Hello there,
> >
> > I have a problem concerning bootstrapping in R, specifically the resampling part of it. I will try to sum it up in a simplified way so as not to confuse anybody.
> >
> > I have a small database consisting of 20 observations (basically the numbers from 1 to 20, I mean: 1, 2, 3, 4, 5, ... 18, 19, 20).
> >
> > I would like to resample this database many times for the bootstrap process under the following conditions. Firstly, every resampled database should also include 20 observations. Secondly, when selecting a number from the above-mentioned 20 numbers, the selection can be done with replacement. The difficult part comes now: any one number can be selected at most 5 times. To make this clear, here are a couple of examples. The resampled databases might look like the following:
> >
> > (1st database) 1,2,1,2,1,2,1,2,1,2,3,3,3,3,3,4,4,4,4,4
> > 4 different numbers are chosen (1, 2, 3, 4), each selected the maximum possible 5 times.
> >
> > (2nd database) 1,8,8,6,8,8,8,2,3,4,5,6,6,6,6,7,19,1,1,1
> > Two numbers, 8 and 6, selected 5 times (the maximum possible), number 1 selected 4 times, the others selected fewer than 4 times.
> >
> > (3rd database) 1,1,2,2,3,3,4,4,9,9,9,10,10,13,10,9,3,9,2,1
> > Number 9 chosen the maximum possible 5 times, numbers 10, 3, 2, 1 chosen 3 times each, number 4 selected twice and number 13 selected only once.
> >
> > ...
> >
> > Does anybody know how to implement my "tricky" condition in one of the R functions, namely that one number can be selected 5 times at most? Are the 'boot' and 'bootstrap' packages capable of managing this?
> > I guess they are, I just couldn't figure it out yet...
> >
> > Thanks very much! Best regards,
> > Laszlo Bodnar
>
> Laszlo,
> Create a vector consisting of 5 of each number. Then, for each sample, scramble the order of the items in the vector, and select the first 20.
>
> --
> Please avoid sending me Word or PowerPoint attachments.
> See <http://www.gnu.org/philosophy/no-word-attachments.html>
>
> -Dr. John R. Vokey
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.