Hi Marc:
Thank you for your help in this matter.

With thanks
Abou

On Tue, Aug 10, 2021, 9:28 AM Marc Schwartz <marc_schwa...@me.com> wrote:

> Hi,
>
> A search would suggest that there may not be an R function/package that
> provides power/sample size calculations for the specific scenarios that
> you are describing. There may be something that I am missing, and there
> is also other dedicated software such as PASS
> (https://www.ncss.com/software/pass/) which is not free, but provides a
> large library of possibly relevant functions and support.
>
> That being said, you can run Monte Carlo simulations in R to achieve the
> results you want, while providing yourself with options relative to
> study design, intended tests, and adjustments for multiple comparisons
> as apropos. Many prefer this approach, since it gives you specific
> control over this process.
>
> Taking the simple case, where you are going to run a 3 x 2 chi-square as
> your primary endpoint, and want to power for that, here is a possible
> function, with the same sample size in each group:
>
> ThreeGroups <- function(n, p1, p2, p3, R = 10000, power = 0.8) {
>
>   MCSim <- function(n, p1, p2, p3) {
>     ## Create a binary distribution for each group
>     G1 <- rbinom(n, 1, p1)
>     G2 <- rbinom(n, 1, p2)
>     G3 <- rbinom(n, 1, p3)
>
>     ## Create a 3 x 2 matrix containing the 3 group counts
>     MAT <- cbind(table(G1), table(G2), table(G3))
>
>     ## Perform a chi-square and just return the p value
>     chisq.test(MAT)$p.value
>   }
>
>   ## Replicate the above R times, and get
>   ## a distribution of p values
>   MC <- replicate(R, MCSim(n, p1, p2, p3))
>
>   ## Get the p value at the desired "power" quantile
>   quantile(MC, power)
> }
>
> Essentially, the above internal MCSim() function generates 3 random
> samples of size 'n' from the binomial distribution, at the 3 proportions
> desired. For each run, it will perform a chi-square test of the 3 x 2
> matrix of counts, returning the p value for each run.
> The main function will then return the p value at the quantile (power)
> within the generated distribution of p values.
>
> You can look at the help pages for the various functions that I use
> above, to get a sense for how they work.
>
> You increase the sample size ('n') until you get a p value returned <=
> 0.05, if that is your desired alpha level.
>
> You also want 'R', the number of replications within each run, to be
> large enough so that the returned p value quantile is relatively stable.
> Values for 'R', once you get "close to" the desired p value should be on
> the order of 1,000,000 or higher. Stay with lower values for 'R' until
> you get in the ballpark of your target, since larger values take much
> longer to run.
>
> Thus, using your example proportions of 0.25, 0.25, and 0.35:
>
> ## 250 per group, 750 total - Not enough
> ThreeGroups(250, 0.25, 0.25, 0.35, R = 10000)
>        80%
> 0.08884723
>
> ## 350 per group, 1050 total - Too high
> ThreeGroups(350, 0.25, 0.25, 0.35, R = 10000)
>       80%
> 0.0270829
>
> ## 300 per group, 900 total - Close!
> ThreeGroups(300, 0.25, 0.25, 0.35, R = 10000)
>        80%
> 0.04818842
>
> So, keep tweaking the sample size until you get a returned p value at
> your target alpha level, with a large enough 'R', so that you get
> consistent sample sizes for multiple runs.
>
> If I run 300 per group again, with 10,000 replicates:
>
> ThreeGroups(300, 0.25, 0.25, 0.35, R = 10000)
>        80%
> 0.05033933
>
> the returned p value is slightly higher. So, again, increase R to
> improve the stability of the returned p value and run it multiple times
> to be comfortable that the p value change is less than an acceptable
> threshold.
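
An equivalent way to read the same simulation, offered only as a sketch: rather than returning the p value at the 0.8 quantile and comparing it to alpha, one can return the share of simulated p values that fall at or below alpha, which is the estimated power itself. The name PowerThreeGroups and its arguments are hypothetical and not from the thread. The table is built from the "Yes" counts directly, on the assumption that this is acceptable in place of table(), so it keeps both rows even if a group happens to have no "Yes" or no "No" responses.

```r
## Sketch: estimate power directly as the proportion of p values <= alpha.
## 'PowerThreeGroups' is a hypothetical name, not a function from the thread.
PowerThreeGroups <- function(n, p1, p2, p3, R = 2000, alpha = 0.05) {
  p.values <- replicate(R, {
    ## Number of "Yes" responses in each of the 3 groups of size n
    yes <- rbinom(3, size = n, prob = c(p1, p2, p3))

    ## 2 x 3 table of Yes / No counts by group
    MAT <- rbind(Yes = yes, No = n - yes)

    ## Chi-square test of the table, keeping only the p value
    chisq.test(MAT)$p.value
  })

  ## Proportion of simulated studies that reject at the given alpha
  mean(p.values <= alpha)
}

set.seed(1)
PowerThreeGroups(300, 0.25, 0.25, 0.35)
```

A returned value at or above 0.80 corresponds to the quantile-based function returning a p value at or below alpha, so the two readings should lead to the same sample size, up to simulation noise.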
>
> Now, the tricky part is to decide if the 3 x 2 is your primary endpoint,
> and want to power only for that, or, if you also want to power for the
> other two-group comparisons, possibly having to account for p value
> adjustments for the multiple comparisons, resulting in the need to power
> for a lower alpha level for those tests. In that scenario, you would end
> up taking the largest sample size that you identify across the various
> hypotheses, recognizing that while you are powering for one hypothesis,
> you may be overpowering for others.
>
> That is something that you need to decide, and perhaps consider
> consulting with other local statistical expertise, as may be apropos, in
> the prospective study design, possibly influenced by other
> relevant/similar research in your domain.
>
> You can easily modify the above function for the two-group scenario as
> well, and I will leave that to you.
>
> Regards,
>
> Marc
>
>
> AbouEl-Makarim Aboueissa wrote on 8/10/21 6:34 AM:
> > Hi Marc:
> >
> > First, thank you very much for your help in this matter.
> >
> >
> > Will perform an initial omnibus test of all three groups (e.g. 3 x 2
> > chi-square), possibly followed by
> > all possible 2 x 2 pairwise comparisons (e.g. 1 versus 2, 1 versus 3,
> > 2 versus 3),
> >
> > We can assume _either_ the desired sample size in each group is the same
> > _or_ proportional to the population size.
> >
> > We can set p=0.25 and set p1=p2=p3=p so that the H0 is true.
> >
> > We can assume that the expected proportion of "Yes" values in each group
> > is 0.25
> >
> > For the alternative hypotheses, for example, we can set p1 = .25,
> > p2=.25, p3=.35
> >
> >
> > Again thank you very much in advance.
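
For the "proportional to the population size" option mentioned above, a rough sketch of how the same kind of simulation might be adapted to unequal group sizes. The function name PowerProportional, the simple rounding rule, and the total of 900 are all assumptions made for illustration; the population sizes 1500, 900 and 1350 are the ones given in the original question.

```r
## Sketch: power of the omnibus chi-square with group sizes allocated in
## proportion to population sizes. 'PowerProportional' is a hypothetical name.
PowerProportional <- function(total.n, N, p, R = 2000, alpha = 0.05) {
  ## Allocate the total sample proportionally to N (simple rounding, assumed)
  n <- round(total.n * N / sum(N))

  p.values <- replicate(R, {
    ## "Yes" counts per group; rbinom() is vectorized over size and prob
    yes <- rbinom(length(n), size = n, prob = p)

    ## Chi-square test on the 2 x k table of Yes / No counts
    chisq.test(rbind(Yes = yes, No = n - yes))$p.value
  })

  list(n = n, power = mean(p.values <= alpha))
}

set.seed(2)
PowerProportional(900, N = c(1500, 900, 1350), p = c(0.25, 0.25, 0.35))
```

As with the equal-allocation case, the total would be raised or lowered until the estimated power settles near the target.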
> >
> > abou
> >
> > ______________________
> >
> > *AbouEl-Makarim Aboueissa, PhD*
> > *Professor, Statistics and Data Science*
> > *Graduate Coordinator*
> > *Department of Mathematics and Statistics*
> > *University of Southern Maine*
> >
> >
> > On Mon, Aug 9, 2021 at 10:53 AM Marc Schwartz <marc_schwa...@me.com> wrote:
> >
> > > Hi,
> > >
> > > You are going to need to provide more information than what you have
> > > below and I may be mis-interpreting what you have provided.
> > >
> > > Presuming you are designing a prospective, three-group, randomized
> > > allocation study, there is typically an a priori specification of the
> > > ratios of the sample sizes for each group such as 1:1:1, indicating
> > > that the desired sample size in each group is the same.
> > >
> > > You would also need to specify the expected proportions of "Yes"
> > > values in each group.
> > >
> > > Further, you need to specify how you are going to compare the
> > > proportions in each group. Are you going to perform an initial omnibus
> > > test of all three groups (e.g. 3 x 2 chi-square), possibly followed by
> > > all possible 2 x 2 pairwise comparisons (e.g. 1 versus 2, 1 versus 3,
> > > 2 versus 3), or are you just going to compare 2 versus 1, and 3 versus
> > > 1, where 1 is a control group?
> > >
> > > Depending upon your testing plan, you may also need to account for p
> > > value adjustments for multiple comparisons, in which case, you also
> > > need to specify what adjustment method you plan to use, to know what
> > > the target alpha level will be.
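
On the point about adjustments for multiple comparisons, a hedged sketch of how the pairwise 2 x 2 comparisons could be simulated with an adjustment applied inside each run. The name PairwisePower is hypothetical, and the choice of Holm as the default is only an assumption for illustration; any method accepted by p.adjust() could be passed instead. Each pairwise test uses prop.test(), which by default applies a continuity correction for two groups.

```r
## Sketch: per-comparison power for all three pairwise tests, after a
## multiplicity adjustment. 'PairwisePower' is a hypothetical name and the
## Holm default is an assumed choice, not one made in the thread.
PairwisePower <- function(n, p, R = 2000, alpha = 0.05, method = "holm") {
  pairs <- combn(3, 2)  # columns are the pairs 1-2, 1-3, 2-3

  sig <- replicate(R, {
    ## "Yes" counts in the three groups of size n
    yes <- rbinom(3, size = n, prob = p)

    ## Raw p value for each pairwise comparison of two proportions
    raw <- apply(pairs, 2, function(ij) prop.test(yes[ij], c(n, n))$p.value)

    ## Adjust within the run, then flag which comparisons reject
    p.adjust(raw, method = method) <= alpha
  })

  ## 'sig' is a 3 x R logical matrix; row means give power per comparison
  setNames(rowMeans(sig), apply(pairs, 2, paste, collapse = " vs "))
}

set.seed(3)
PairwisePower(300, c(0.25, 0.25, 0.35))
```

Under the example proportions, the "1 vs 2" entry estimates a false rejection rate, since those two groups share the same proportion, while the other two entries estimate power for a 0.25 versus 0.35 difference.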
> > >
> > > On the other hand, if you already have the data collected, thus have
> > > fixed sample sizes available per your wording below, simply go ahead
> > > and perform your planned analyses, as the notion of "power" is largely
> > > an a priori consideration, which reflects the probability of finding a
> > > "statistically significant" result at a given alpha level, given that
> > > your a priori assumptions are valid.
> > >
> > > Regards,
> > >
> > > Marc Schwartz
> > >
> > >
> > > AbouEl-Makarim Aboueissa wrote on 8/9/21 9:41 AM:
> > > > Dear All: good morning
> > > >
> > > > *Re:* Sample Size Determination to Compare Three Independent
> > > > Proportions
> > > >
> > > > *Situation:*
> > > >
> > > > Three Binary variables (Yes, No)
> > > >
> > > > Three independent populations with fixed sizes (*say:* N1 = 1500,
> > > > N2 = 900, N3 = 1350).
> > > >
> > > > Power = 0.80
> > > >
> > > > How to choose the sample sizes to compare the three proportions
> > > > of “Yes” among the three variables.
> > > >
> > > > If you know a reference to this topic, it will be very helpful too.
> > > >
> > > > with many thanks in advance
> > > >
> > > > abou
> > > > ______________________
> > > >
> > > > *AbouEl-Makarim Aboueissa, PhD*
> > > >
> > > > *Professor, Statistics and Data Science*
> > > > *Graduate Coordinator*
> > > >
> > > > *Department of Mathematics and Statistics*
> > > > *University of Southern Maine*
> > > >
> > > >         [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.