Hi,

A search suggests that there may not be an R function or package that provides power/sample size calculations for the specific scenario that you describe, although I may be missing something. There is also dedicated software such as PASS (https://www.ncss.com/software/pass/), which is not free but provides a large library of possibly relevant functions and support.

That being said, you can run Monte Carlo simulations in R to get the results you want, which leaves you free to choose the study design, the intended tests, and any adjustments for multiple comparisons as appropriate. Many prefer this approach because it gives you explicit control over each of these choices.

Taking the simple case, where you are going to run a 3 x 2 chi-square as your primary endpoint, and want to power for that, here is a possible function, with the same sample size in each group:

ThreeGroups <- function(n, p1, p2, p3, R = 10000, power = 0.8) {

  MCSim <- function(n, p1, p2, p3) {
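    ## Simulate n binary (0/1) outcomes for each group
    G1 <- rbinom(n, 1, p1)
    G2 <- rbinom(n, 1, p2)
    G3 <- rbinom(n, 1, p3)

    ## Tabulate each group with fixed levels, so that a group with
    ## all 0s or all 1s still contributes both counts, then bind the
    ## counts into a 2 x 3 matrix (outcomes in rows, groups in columns)
    MAT <- cbind(table(factor(G1, levels = 0:1)),
                 table(factor(G2, levels = 0:1)),
                 table(factor(G3, levels = 0:1)))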

    ## Perform a chi-square and just return the p value
    chisq.test(MAT)$p.value
  }

  ## Replicate the above R times, and get
  ## a distribution of p values
  MC <- replicate(R, MCSim(n, p1, p2, p3))

  ## Get the p value at the desired "power" quantile
  quantile(MC, power)
}

Essentially, the above internal MCSim() function generates 3 random samples of size 'n' from the binomial distribution, at the 3 proportions desired. For each run, it performs a chi-square test on the resulting table of counts and returns the p value. The main function then returns the p value at the requested quantile (power) of the simulated p values. If that value is at or below your alpha level, then at least that fraction of the simulated tests was significant at alpha, which is what the simulated power means.

You can look at the help pages for the various functions that I use above, to get a sense for how they work.

You increase the sample size ('n') until you get a p value returned <= 0.05, if that is your desired alpha level.
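An equivalent way to read the same simulation is to estimate the power directly, as the share of simulated p values at or below alpha. The following is only a minimal sketch of that idea: ThreeGroupsPower() and its 'alpha' argument are hypothetical names that I am introducing for illustration, and it draws the "Yes" counts per group directly rather than tabulating individual 0/1 values.

```r
## Hypothetical helper: estimate power directly as the share of
## simulated p values at or below alpha
ThreeGroupsPower <- function(n, p1, p2, p3, R = 2000, alpha = 0.05) {
  OneP <- function() {
    ## Number of "Yes" responses in each of the 3 groups of size n
    yes <- rbinom(3, n, c(p1, p2, p3))
    ## 3 x 2 table of Yes/No counts, one row per group
    MAT <- cbind(Yes = yes, No = n - yes)
    suppressWarnings(chisq.test(MAT)$p.value)
  }
  ## Proportion of the R simulated tests that reject at alpha
  mean(replicate(R, OneP()) <= alpha)
}

set.seed(1)
pw <- ThreeGroupsPower(300, 0.25, 0.25, 0.35)
pw
```

With this version, you would increase 'n' until the returned proportion reaches your target power, rather than until the returned p value drops to alpha.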

You also want 'R', the number of replications within each run, to be large enough that the returned p value quantile is relatively stable. Once you get "close to" the desired p value, 'R' should be on the order of 1,000,000 or higher. Stay with lower values for 'R' until you are in the ballpark of your target, since larger values take much longer to run.
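To get a rough feel for how 'R' affects stability, one option is to look at the binomial standard error of a simulated rejection proportion. This is an approximation that I am adding as an assumption: it describes the proportion of runs that are significant, not the p value quantile itself, although the two move together. MCse() is a hypothetical helper name.

```r
## Approximate Monte Carlo standard error of an estimated power,
## treating it as a proportion of R independent simulated tests
MCse <- function(power, R) sqrt(power * (1 - power) / R)

R <- c(1e3, 1e4, 1e5, 1e6)
se_tab <- data.frame(R = R, SE = MCse(0.8, R))
se_tab
```

The error shrinks with the square root of 'R', so each tenfold increase in replications buys only about a threefold gain in precision.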

Thus, using your example proportions of 0.25, 0.25, and 0.35:

## 250 per group, 750 total - Not enough
> ThreeGroups(250, 0.25, 0.25, 0.35, R = 10000)
       80%
0.08884723

## 350 per group, 1050 total - Too high
> ThreeGroups(350, 0.25, 0.25, 0.35, R = 10000)
      80%
0.0270829

## 300 per group, 900 total - Close!
> ThreeGroups(300, 0.25, 0.25, 0.35, R = 10000)
       80%
0.04818842


So, keep tweaking the sample size until you get a returned p value at your target alpha level, with a large enough 'R', so that you get consistent sample sizes for multiple runs.
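The manual tweaking can also be wrapped in a small grid search. The following is a sketch only, under the same assumptions as above (equal group sizes, chi-square test on the Yes/No counts); PAtPower() is a hypothetical helper that mirrors ThreeGroups() but takes the proportions as a vector, and the grid and seed are arbitrary choices.

```r
## Hypothetical helper mirroring ThreeGroups(): the p value at the
## 'power' quantile for a given per-group sample size
PAtPower <- function(n, p, R = 2000, power = 0.8) {
  pvals <- replicate(R, {
    ## "Yes" counts per group, then a groups x (Yes, No) table
    yes <- rbinom(length(p), n, p)
    suppressWarnings(chisq.test(cbind(yes, n - yes))$p.value)
  })
  unname(quantile(pvals, power))
}

set.seed(42)
grid <- seq(250, 350, by = 25)
res <- sapply(grid, PAtPower, p = c(0.25, 0.25, 0.35))
data.frame(n = grid, p_at_80 = round(res, 4))

## Smallest grid value whose 80th percentile p value is at or below alpha
n_min <- min(grid[res <= 0.05])
n_min
```

Once the grid has narrowed things down, you would rerun with a finer grid and a larger 'R' around the candidate sample size.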

If I run 300 per group again, with 10,000 replicates:

> ThreeGroups(300, 0.25, 0.25, 0.35, R = 10000)
       80%
0.05033933

the returned p value is slightly higher. So, again, increase R to improve the stability of the returned p value and run it multiple times to be comfortable that the p value change is less than an acceptable threshold.

Now, the tricky part is to decide whether the 3 x 2 is your primary endpoint and you want to power only for that, or whether you also want to power for the two-group comparisons, possibly having to account for p value adjustments for the multiple comparisons, which means powering those tests at a lower alpha level. In that scenario, you would end up taking the largest sample size that you identify across the various hypotheses, recognizing that while you are powering for one hypothesis, you may be overpowering for others.
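As one illustration of the second scenario, the sketch below simulates all three pairwise comparisons in each run, adjusts the three p values, and reports how often each adjusted test is significant. The choices here are assumptions for illustration only: PairwisePower() is a hypothetical name, the Holm adjustment is just one possible method, and the pairwise tests are not gated on a significant omnibus test.

```r
## Hypothetical helper: per-comparison power after a multiplicity adjustment
PairwisePower <- function(n, p, R = 1000, alpha = 0.05, method = "holm") {
  pairs <- combn(length(p), 2)   # columns: (1,2), (1,3), (2,3) for 3 groups
  reject <- replicate(R, {
    yes <- rbinom(length(p), n, p)
    ## Raw two-group p values, one per pair
    raw <- apply(pairs, 2, function(ij)
      suppressWarnings(prop.test(yes[ij], c(n, n))$p.value))
    ## Adjust within the run, then compare against alpha
    p.adjust(raw, method = method) <= alpha
  })
  ## Share of runs in which each adjusted pairwise test rejects
  setNames(rowMeans(reject), apply(pairs, 2, paste, collapse = " vs "))
}

set.seed(123)
pp <- PairwisePower(300, c(0.25, 0.25, 0.35))
pp
```

Note that "1 vs 2" compares two groups with the same true proportion, so its value reflects the false positive rate rather than power.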

That is something that you need to decide, and perhaps consider consulting with other local statistical expertise, as may be apropos, in the prospective study design, possibly influenced by other relevant/similar research in your domain.

You can easily modify the above function for the two-group scenario as well, and I will leave that to you.
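As a starting point only, one possible shape for that modification is sketched here. TwoGroups() is a hypothetical name, and the Bonferroni target of 0.05 / 3 is an assumed adjustment for three pairwise comparisons, not a recommendation. Keep in mind that chisq.test() applies a continuity correction to 2 x 2 tables by default.

```r
## Hypothetical two-group analogue of ThreeGroups()
TwoGroups <- function(n, p1, p2, R = 2000, power = 0.8) {
  pvals <- replicate(R, {
    yes <- rbinom(2, n, c(p1, p2))
    ## 2 x 2 table of Yes/No counts; Yates' correction is the default here
    suppressWarnings(chisq.test(cbind(yes, n - yes))$p.value)
  })
  unname(quantile(pvals, power))
}

set.seed(7)
alpha_adj <- 0.05 / 3            # assumed Bonferroni target for 3 comparisons
p300 <- TwoGroups(300, 0.25, 0.35)
p600 <- TwoGroups(600, 0.25, 0.35)
c(n300 = p300, n600 = p600, target = alpha_adj)
```

You would then search over 'n' in the same way as for the three-group case, but against the adjusted alpha.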

Regards,

Marc


AbouEl-Makarim Aboueissa wrote on 8/10/21 6:34 AM:
Hi Marc:

First, thank you very much for your help in this matter.


We will perform an initial omnibus test of all three groups (e.g. 3 x 2 chi-square), possibly followed by all possible 2 x 2 pairwise comparisons (e.g. 1 versus 2, 1 versus 3, 2 versus 3).

We can assume _either_ the desired sample size in each group is the same _or_ proportional to the population size.

We can set p = 0.25 and set p1 = p2 = p3 = p so that the H0 is true.

We can assume that the expected proportion of "Yes" values in each group is 0.25.

For the alternative hypotheses, for example, we can set p1 = .25, p2 = .25, p3 = .35.


Again thank you very much in advance.

abou

______________________

*AbouEl-Makarim Aboueissa, PhD*

*Professor, Statistics and Data Science*
*Graduate Coordinator*

*Department of Mathematics and Statistics*
*University of Southern Maine*



On Mon, Aug 9, 2021 at 10:53 AM Marc Schwartz <marc_schwa...@me.com> wrote:

    Hi,

    You are going to need to provide more information than what you have
    below and I may be mis-interpreting what you have provided.

    Presuming you are designing a prospective, three-group, randomized
    allocation study, there is typically an a priori specification of the
    ratios of the sample sizes for each group such as 1:1:1, indicating
    that the desired sample size in each group is the same.

    You would also need to specify the expected proportions of "Yes" values
    in each group.

    Further, you need to specify how you are going to compare the
    proportions in each group. Are you going to perform an initial omnibus
    test of all three groups (e.g. 3 x 2 chi-square), possibly followed by
    all possible 2 x 2 pairwise comparisons (e.g. 1 versus 2, 1 versus 3, 2
    versus 3), or are you just going to compare 2 versus 1, and 3 versus 1,
    where 1 is a control group?

    Depending upon your testing plan, you may also need to account for p
    value adjustments for multiple comparisons, in which case, you also
    need to specify what adjustment method you plan to use, to know what
    the target alpha level will be.

    On the other hand, if you already have the data collected, thus have
    fixed sample sizes available per your wording below, simply go ahead
    and perform your planned analyses, as the notion of "power" is largely
    an a priori consideration, which reflects the probability of finding a
    "statistically significant" result at a given alpha level, given that
    your a priori assumptions are valid.

    Regards,

    Marc Schwartz


    AbouEl-Makarim Aboueissa wrote on 8/9/21 9:41 AM:
     > Dear All: good morning
     >
     > *Re:* Sample Size Determination to Compare Three Independent Proportions
     >
     > *Situation:*
     >
     > Three Binary variables (Yes, No)
     >
     > Three independent populations with fixed sizes (*say:* N1 = 1500, N2 = 900,
     > N3 = 1350).
     >
     > Power = 0.80
     >
     > How to choose the sample sizes to compare the three proportions of “Yes”
     > among the three variables.
     >
     > If you know a reference to this topic, it will be very helpful too.
     >
     > with many thanks in advance
     >
     > abou
     > ______________________
     >
     >
     > *AbouEl-Makarim Aboueissa, PhD*
     >
     > *Professor, Statistics and Data Science*
     > *Graduate Coordinator*
     >
     > *Department of Mathematics and Statistics*
     > *University of Southern Maine*
     >


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
