Well, this would seem to work: e <- data.frame(Score = Score , Country = factor(Country) , Time = Time)
ncountry <- nlevels(e$Country) func= function(dat,idx) { if(length(unique(dat[idx,'Country'])) < ncountry) NA else coef(lm(Score~ Time + Country,data = dat[idx,])) } B <- boot(e, func, R=1000) boot.ci(B, index=2, type="perc") Caveats: 1) boot.ci handles the NA's by omitting them, which of course gives a smaller resample and longer CI's than the value of R specified in the call to boot(). 2) I do not know if the *nice* statistical properties of the nonparametric bootstrap, e.g. asymptotic correctness, hold when bootstrap samples are produced in this way. I leave that to wiser heads than me. Cheers, Bert On Sat, Jan 13, 2024 at 2:51 PM Ben Bolker <bbol...@gmail.com> wrote: > It took me a little while to figure this out, but: the problem is > that if your resampling leaves out any countries (which is very likely), > your model applied to the bootstrapped data will have fewer coefficients > than your original model. > > I tried this: > > cc <- unique(e$Country) > func <- function(data, idx) { > coef(lm(Score~ Time + factor(Country, levels =cc),data=data[idx,])) > } > > but lm() automatically drops coefficients for missing countries (I > didn't think about it too hard, but I thought they might get returned as > NA and that boot() might be able to handle that ...) > > If you want to do this I think you'll have to find a way to do a > *stratified* bootstrap, restricting the bootstrap samples so that they > always contain at least one sample from each country ... (I would have > expected "strata = as.numeric(e$Country)" to do this, but it doesn't > work the way I thought ... it tries to compute the statistics for *each* > stratum ...) > > > > ==== > > Debugging attempts: > > set.seed(101) > options(error=recover) > B= boot(e, func, R=1000) > > > Error in t.star[r, ] <- res[[r]] : > number of items to replace is not a multiple of replacement length > > Enter a frame number, or 0 to exit > > 1: boot(e, func, R = 1000) > > <enter 1> > > Selection: 1 > Called from: top level > Browse[1]> print(r) > [1] 2 > Browse[1]> t.star[r,] > [1] NA NA NA NA NA NA NA NA NA > > i[2,] > [1] 14 15 22 22 21 14 8 2 12 22 10 15 9 7 9 13 12 23 1 20 15 7 > 5 10 > > > > > On 2024-01-13 5:22 p.m., varin sacha via R-help wrote: > > Dear Duncan, > > Dear Ivan, > > > > I really thank you a lot for your response. > > So, if I correctly understand your answers the problem is coming from > this line: > > > > coef(lm(Score~ Time + factor(Country)),data=data[idx,]) > > > > This line should be: > > coef(lm(Score~ Time + factor(Country),data=data[idx,])) > > > > If yes, now I get an error message (code here below)! So, it still does > not work. > > > > Error in t.star[r, ] <- res[[r]] : > > number of items to replace is not a multiple of replacement length > > > > > > ########################################## > > > Score=c(345,564,467,675,432,346,476,512,567,543,234,435,654,411,356,658,432,345,432,345, > 345,456,543,501) > > > > Country=c("Italy", "Italy", "Italy", "Turkey", "Turkey", "Turkey", > "USA", "USA", "USA", "Korea", "Korea", "Korea", "Portugal", "Portugal", > "Portugal", "UK", "UK", "UK", "Poland", "Poland", "Poland", "Austria", > "Austria", "Austria") > > > > Time=c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3) > > > > e=data.frame(Score, Country, Time) > > > > > > library(boot) > > func= function(data, idx) { > > coef(lm(Score~ Time + factor(Country),data=data[idx,])) > > } > > B= boot(e, func, R=1000) > > > > boot.ci(B, index=2, type="perc") > > ############################################################# > > > > > > > > > > > > > > > > > > Le samedi 13 janvier 2024 à 21:56:58 UTC+1, Ivan Krylov < > ikry...@disroot.org> a écrit : > > > > > > > > > > > > В Sat, 13 Jan 2024 20:33:47 +0000 (UTC) > > > > varin sacha via R-help <r-help@r-project.org> пишет: > > > >> coef(lm(Score~ Time + factor(Country)),data=data[idx,]) > > > > > > Wrong place for the data=... argument. You meant to give it to lm(...), > > but in the end it went to coef(...). Without the data=... argument, the > > formula passed to lm() picks up the global variables inherited by the > > func() closure. > > > > Unfortunately, S3 methods really do have to ignore extra arguments they > > don't understand if the class is to be extended, so coef.lm isn't > > allowed to complain to you about it. > > > > ______________________________________________ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.