Good point. I'll take my suggestion back...

Giovanni

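For anyone reading the thread later, here is a minimal sketch in R of two possible readings of the "at most five times" constraint discussed in the quoted messages below. The function names capped_sample and reject_sample, and their arguments, are made up for this illustration and do not come from any package. The first follows the oversample-and-truncate idea: draw with replacement and skip any draw that would push a value past the cap. The second redraws the whole sample until the cap holds, which amounts to conditioning an ordinary bootstrap sample on the constraint. The two need not give the same distribution, so which one is appropriate depends on what the constraint is supposed to mean.

```r
# Reading 1 (assumption: draws that would exceed the cap are skipped).
# Oversample with replacement, keep at most `cap` occurrences of each
# value in draw order, and return the first n kept draws.
capped_sample <- function(values = 1:20, n = 20, cap = 5, oversample = 5 * n) {
  stopifnot(n <= cap * length(values))  # otherwise the loop could never finish
  kept <- values[0]                     # empty vector of the same type
  while (length(kept) < n) {
    # sample.int on the index avoids the sample(x) surprise when length(x) == 1
    draws <- values[sample.int(length(values), oversample, replace = TRUE)]
    candidate <- c(kept, draws)
    # ave() numbers the occurrences of each value in draw order: 1, 2, 3, ...
    occurrence <- ave(seq_along(candidate), candidate, FUN = seq_along)
    kept <- candidate[occurrence <= cap]
  }
  kept[seq_len(n)]
}

# Reading 2 (assumption: an ordinary bootstrap sample, conditioned on the cap).
# Redraw the whole sample until no value appears more than `cap` times.
reject_sample <- function(values = 1:20, n = 20, cap = 5) {
  stopifnot(n <= cap * length(values))
  repeat {
    s <- values[sample.int(length(values), n, replace = TRUE)]
    if (max(table(s)) <= cap) return(s)
  }
}

set.seed(1)
samp <- capped_sample()
table(samp)  # counts per value; none should exceed 5
```

For 20 draws from 1:20 a count above 5 is rare, so reject_sample should seldom need more than one attempt. With a much tighter cap it could loop for a long time, and capped_sample would be the cheaper choice.
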
On Tue, 2011-03-01 at 13:18 -0500, Jonathan P Daily wrote:
> I'm not sure that is equivalent to sampling with replacement, since if the
> first "draw" is 1, then the probability that the next draw will be 1 is
> 4/99 instead of the 1/20 it would be in sampling with replacement. I
> think the way to do this would be what Greg suggested - something like:
>
> bigsamp <- sample(1:20, 100, TRUE)
> idx <- sort(unlist(sapply(1:20, function(x) which(bigsamp == x)[1:5])))[1:20]
> samp <- bigsamp[idx]
>
> --------------------------------------
> Jonathan P. Daily
> Technician - USGS Leetown Science Center
> 11649 Leetown Road
> Kearneysville WV, 25430
> (304) 724-4480
> "Is the room still a room when it's empty? Does the room,
> the thing itself have purpose? Or do we, what's the word... imbue it."
> - Jubal Early, Firefly
>
> r-help-boun...@r-project.org wrote on 03/01/2011 09:37:31 AM:
>
> > Re: [R] bootstrap resampling question
> > Giovanni Petris
> > to: Bodnar Laszlo EB_HU
> > 03/01/2011 11:58 AM
> > Sent by: r-help-boun...@r-project.org
> > Cc: "'r-help@r-project.org'"
> >
> > A simple way of sampling with replacement from 1:20, with the additional
> > constraint that each number can be selected at most five times is
> >
> > > sample(rep(1:20, 5), 20)
> >
> > HTH,
> > Giovanni
> >
> > On Tue, 2011-03-01 at 11:30 +0100, Bodnar Laszlo EB_HU wrote:
> > > Hello there,
> > >
> > > I have a problem concerning bootstrapping in R - especially
> > > focusing on the resampling part of it. I try to sum it up in a
> > > simplified way so that I would not confuse anybody.
> > >
> > > I have a small database consisting of 20 observations (basically
> > > numbers from 1 to 20, I mean: 1, 2, 3, 4, 5, ... 18, 19, 20).
> > >
> > > I would like to resample this database many times for the
> > > bootstrap process with the following two conditions. The resampled
> > > databases should also have 20 observations and you can select each
> > > of the previously mentioned 20 numbers with replacement. I guess it
> > > is obvious so far. Now the more difficult second condition is that
> > > each number can be selected at most 5 times. In order to make
> > > this clear I try to show you an example. So there can be resampled
> > > databases like the following ones:
> > >
> > > (1st database) 1,2,1,2,1,2,1,2,1,2,3,3,3,3,3,4,4,4,4,4
> > > (4 different numbers are chosen, each selected 5 times)
> > >
> > > (2nd database) 1,8,8,6,8,8,8,2,3,4,5,6,6,6,6,7,19,1,1,1
> > > (Two numbers - 8 and 6 - selected 5 times, number "1" selected
> > > four times, the others selected less than 4 times)
> > >
> > > My very first guess that came to my mind whilst thinking about the
> > > problem was the sample function where there are settings like
> > > replace=TRUE and prob=... where you can create a probability vector
> > > i.e. how much should be the probability of selecting a number. So I
> > > tried to calculate probabilities first. I thought the problem can
> > > basically be described as a k-combination with repetitions.
> > > Unfortunately the only thing I could calculate so far is the total
> > > number of all possible selections which amounts to 137 846 527 049.
> > >
> > > Does anybody know how to implement my second "tricky" condition in
> > > one of the R functions? Are 'boot' and 'bootstrap' packages capable
> > > of managing this? I guess they are, I just couldn't figure it out yet...
> > >
> > > Thanks very much! Best regards,
> > > Laszlo Bodnar
> >
> > --
> >
> > Giovanni Petris <gpet...@uark.edu>
> > Associate Professor
> > Department of Mathematical Sciences
> > University of Arkansas - Fayetteville, AR 72701
> > Ph: (479) 575-6324, 575-8630 (fax)
> > http://definetti.uark.edu/~gpetris/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.