Sorry. I just realized, after posting, that the "n" value in the dispersion calculation isn't correct. I'll have to revisit the simulation, tomorrow.
On Sat, Mar 27, 2021 at 9:11 PM Abby Spurdle <spurdl...@gmail.com> wrote: > > Hi Rolf, > > Let's say we have a course called Corgiology 101, with a single moderated > exam. > And let's say the moderators transform initial exam scores, such that > there are fixed percentages of pass rates and A grades. > > Rather than count the number of passes, we can count the number of "jumps". > That is, the number of people that pass the corgiology exam after > moderation, that would not have passed without moderation. > > I've created a function to test for underdispersion, based on your expression. > (I hope I got it right). > > Then I've gone on to create simulations, using both constant and > nonconstant class sizes. > The nonconstant simulations apply an (approx) discrete scaling > transformation, referred to previously. > > We can see from the examples that there are a lot of these jumps. > And more importantly, they appear to be underdispersed. > > ----code---- > PASS.SCORE <- 0.5 > A.SCORE <- 0.8 > > #target parameters > PASS.RATE <- 0.8 > A.RATE <- 0.2 > #unmoderated parameters > UNMOD.MEAN.SCORE <- 0.65 > UNMOD.SD.SCORE <- 0.075 > > NCLASSES <- 2000 > NSTUD.CONST <- 200 > NSTUD.NONCONST.LIMS <- c (50, 800) > > sim.njump <- function (nstud, mean0=UNMOD.MEAN.SCORE, sd0=UNMOD.SD.SCORE, > pass.score=PASS.SCORE, a.score=A.SCORE, > pass.rate=PASS.RATE, a.rate=A.RATE) > { x <- rnorm (nstud, mean0, sd0) > q <- quantile (x, 1 - c (pass.rate, a.rate), names=FALSE) > dq <- diff (q) > q <- (a.score - pass.score) / dq * q > y <- pass.score - q [1] + (a.score - pass.score) / dq * x > sum (x < a.score & y >= a.score) > } > > sim.nclasses <- function (nclasses, nstud, nstud.std) > { nstud <- rep_len (nstud, nclasses) > njump <- integer (nclasses) > for (i in 1:nclasses) > njump [i] <- sim.njump (nstud [i]) > if (missing (nstud.std) ) > njump > else > round (nstud.std / nstud * njump) > } > > is.under <- function (x, n) > var (x) < mean (x) * (1 - mean (x) / n) > > njump.hom <- sim.nclasses (NCLASSES, NSTUD.CONST) > nstud <- round (runif (NCLASSES, NSTUD.NONCONST.LIMS [1], > NSTUD.NONCONST.LIMS [2]) ) > njump.het <- sim.nclasses (NCLASSES, nstud, NSTUD.CONST) > > under.hom <- is.under (njump.hom, NSTUD.CONST) > under.het <- is.under (njump.het, NSTUD.CONST) > main.hom <- paste0 ("const class size (under=", under.hom, ")") > main.het <- paste0 ("diff class sizes (under=", under.het, ")") > > p0 <- par (mfrow = c (2, 1) ) > hist (njump.hom, main=main.hom) > hist (njump.het, main=main.het) > par (p0) > ----code---- > > best, > B. > > > On Thu, Mar 25, 2021 at 2:33 PM Rolf Turner <r.tur...@auckland.ac.nz> wrote: > > > > > > I would like a real-life example of a data set which one might think to > > model by a binomial distribution, but which is substantially > > underdispersed. I.e. a sample X = {X_1, X_2, ..., X_N} where each X_i > > is an integer between 0 and n (n known a priori) such that var(X) << > > mean(X)*(1 - mean(X)/n). > > > > Does anyone know of any such examples? Do any exist? I've done > > a perfunctory web search, and had a look at "A Handbook of Small > > Data Sets" by Hand, Daly, Lunn, et al., and drawn a blank. > > > > I've seen on the web some references to underdispersed "pseudo-Poisson" > > data, but not to underdispersed "pseudo-binomial" data. And of course > > there's lots of *over* dispersed stuff. But that's not what I want. > > > > I can *simulate* data sets of the sor that I am looking for (so far the > > only ideas I've had for doing this are pretty simplistic and > > artificial) but I'd like to get my hands on a *real* example, if > > possible. > > > > Grateful for any pointers/suggestions. > > > > cheers, > > > > Rolf Turner > > > > -- > > Honorary Research Fellow > > Department of Statistics > > University of Auckland > > Phone: +64-9-373-7599 ext. 88276 > > > > ______________________________________________ > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.