On Nov 8, 2010, at 1:00 PM, Giulio Di Giovanni wrote:

> Yep, it is 20,000 per arm, sorry. The reference is about an
> application of the method, and I cannot download the paper with the
> main algorithm, so I don't know exactly how they did it.
> Thanks everybody for the rich and interesting suggestions. Using
> free web software (PS, among others) I also found an N of around
> 47,000 per arm. I guess these are the right values (also seen in
> Marc's Monte Carlo simulation).
> Maybe the Poisson-model approach suggested by David could be an
> alternative, even if I guess at this point I won't get big
> differences in the numbers. Would I?

I certainly would not expect remarkable differences.  With 50,000/arm  
you would be expecting:
 > c(p1 = 0.00154, p2 = 0.00234)*50000
  p1  p2
  77 117
# with a rate ratio of:
 > 0.00234/0.00154
[1] 1.519481

A difference of 40 in expected counts (117 vs. 77) would seem to give
fairly high power. A Poisson-structured test might give you somewhat
smaller numbers, but probably not as small as 20,000 per arm:
 > c(p1 = 0.00154, p2 = 0.00234)*20000
   p1   p2
30.8 46.8

(The sd of a Poisson count is the square root of its mean, so the sd
of the difference between the two arms' counts is roughly
sqrt(30.8 + 46.8) ≈ 8.8; an expected difference of 16 is only about
1.8 sd's, so at 20,000 per arm the two arms are too close to separate
with 80% power.)
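
To put a rough number on the Poisson idea, here is a minimal
simulation sketch (my own quick check, not the Thomas and Conlon
method): draw each arm's event total as a Poisson count, compare the
rates with poisson.test(), and estimate power as the rejection rate.

pois.power <- function(n, p1, p2, alpha = 0.05, R = 2000) {
    reject <- replicate(R, {
        x1 <- rpois(1, n * p1)   # events in the control arm
        x2 <- rpois(1, n * p2)   # events in the treatment arm
        poisson.test(c(x2, x1), c(n, n))$p.value < alpha
    })
    mean(reject)                 # estimated power = rejection rate
}

# pois.power(20000, 0.00154, 0.00234)  # comes out well short of 0.80
# pois.power(50000, 0.00154, 0.00234)  # close to 0.80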

If you look up Table 7.5 in Breslow and Day (Vol. 2, page 283) with a
relative risk of 1.5, the necessary expected value in the control
group, using an equal-sized comparison group (for 80% power at 5%
significance), is 64.9. That is a bit lower than the 77 above, but it
implies that 42,207 per arm would be needed.
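
For the record, the per-arm n is just the required expected control
count divided by the control rate; 64.9/0.00154 is about 42,143, and
rounding the expected count up to 65 presumably accounts for the
42,207 quoted above:

 > c(64.9, 65) / 0.00154
[1] 42142.86 42207.79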

-- 
David.


>
> Thanks a lot again, everybody, for your suggestions;
> if anybody has other comments, they are always welcome.
>
> Best,
>
> Giulio
>
>
> > Subject: Re: [R] Sample size calculation for differences between two very small proportions (Fisher's exact test or others)?
> > From: marc_schwa...@me.com
> > Date: Mon, 8 Nov 2010 11:13:12 -0600
> > CC: perimessagg...@hotmail.com; r-h...@stat.math.ethz.ch
> > To: mmal...@gmail.com
> >
> > Hi,
> >
> > I don't have access to the article, but must presume that they are
> > doing something "radically different" if you are "only" getting a
> > total sample size of 20,000. Or is that 20,000 per arm?
> >
> > Using the G*Power app that Mitchell references below (which I have
> > used previously, since they have a Mac app):
> >
> > Exact - Proportions: Inequality, two independent groups (Fisher's
> > exact test)
> >
> > Options:   Exact distribution
> >
> > Analysis:  A priori: Compute required sample size
> > Input:     Tail(s)                 = Two
> >            Proportion p1           = 0.00154
> >            Proportion p2           = 0.00234
> >            α err prob              = 0.05
> >            Power (1-β err prob)    = 0.8
> >            Allocation ratio N2/N1  = 1
> > Output:    Sample size group 1     = 49851
> >            Sample size group 2     = 49851
> >            Total sample size       = 99702
> >            Actual power            = 0.8168040
> >            Actual α                = 0.0462658
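
If you want the same exact-test power calculation without leaving R, I
believe the statmod package has a simulation-based power.fisher.test();
the call below is from my recollection of its interface, so check
?power.fisher.test before relying on it.

# Simulation-based power for Fisher's exact test (statmod); argument
# names here are from memory -- verify against ?power.fisher.test.
library(statmod)
power.fisher.test(p1 = 0.00154, p2 = 0.00234,
                  n1 = 50000, n2 = 50000,
                  alpha = 0.05, nsim = 200)
# should come back somewhere around 0.8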
> >
> >
> >
> >
> > Using the base R power.prop.test() function:
> >
> > > power.prop.test(p1 = 0.00154, p2 = 0.00234, power = 0.8)
> >
> >      Two-sample comparison of proportions power calculation
> >
> >               n = 47490.34
> >              p1 = 0.00154
> >              p2 = 0.00234
> >       sig.level = 0.05
> >           power = 0.8
> >     alternative = two.sided
> >
> > NOTE: n is number in *each* group
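
Running it the other way around (my addition, not part of the quoted
post) also shows why 20,000 per arm falls short: fix n and let
power.prop.test() return the power.

# Power at 20,000 per arm with the same rates; this comes out around
# 0.44 on my reckoning -- well below the 0.8 target.
power.prop.test(n = 20000, p1 = 0.00154, p2 = 0.00234)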
> >
> >
> >
> > Using Frank's bsamsize() function in Hmisc:
> >
> > > bsamsize(p1 = 0.00154, p2 = 0.00234, fraction = .5, alpha = .05, power = .8)
> >       n1       n2
> > 47490.34 47490.34
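
Hmisc also has bpower() for the reverse check (power at a fixed n); as
with the statmod call above, the argument names are from memory, so
see ?bpower.

library(Hmisc)
# Power at roughly the n that bsamsize() returned; check ?bpower for
# the exact interface.
bpower(p1 = 0.00154, p2 = 0.00234, n1 = 47490, n2 = 47490)
# should land close to 0.8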
> >
> >
> >
> > Finally, throwing together a quick Monte Carlo simulation using
> > Fisher's exact test (FET), I get:
> >
> > TwoSampleFET <- function(n, p1, p2, power = 0.85,
> >                          R = 5000, correct = FALSE)
> > {
> >   MCSim <- function(n, p1, p2)
> >   {
> >     Control <- rbinom(n, 1, p1)
> >     Treat <- rbinom(n, 1, p2)
> >     fisher.test(cbind(table(Control), table(Treat)))$p.value
> >   }
> >
> >   # Run MC replicates
> >   MC.res <- replicate(R, MCSim(n, p1, p2))
> >
> >   # Get the p value at the 'power' quantile
> >   quantile(MC.res, power)
> > }
> >
> >
> > # 50,000 per arm
> > > TwoSampleFET(50000, p1 = 0.00154, p2 = 0.00234, power = 0.8, R = 500)
> >        80%
> > 0.04628263
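
An equivalent way to read that simulation (my rephrasing of the
function above, not part of the original post) is to estimate power
directly as the rejection rate at alpha = 0.05:

TwoSampleFETpower <- function(n, p1, p2, alpha = 0.05, R = 500) {
  pvals <- replicate(R, {
    Control <- rbinom(n, 1, p1)
    Treat   <- rbinom(n, 1, p2)
    # force both levels so the 2x2 table is always complete
    fisher.test(cbind(table(factor(Control, levels = 0:1)),
                      table(factor(Treat,   levels = 0:1))))$p.value
  })
  mean(pvals < alpha)   # estimated power
}

# TwoSampleFETpower(50000, 0.00154, 0.00234)  # should come out near 0.8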
> >
> >
> >
> > So all four of these are coming back with numbers in the 48,000 to
> > 50,000 range ***per arm***.
> >
> >
> > HTH,
> >
> > Marc Schwartz
> >
> >
> > On Nov 8, 2010, at 10:16 AM, Mitchell Maltenfort wrote:
> >
> > > Not with R, but look for G*Power 3, a free tool for power
> > > calculations that includes Fisher's test.
> > >
> > > http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3
> > >
> > > On Mon, Nov 8, 2010 at 10:52 AM, Giulio Di Giovanni
> > > <perimessagg...@hotmail.com> wrote:
> > >>
> > >>
> > >> Hi,
> > >> I'm trying to compute the minimum sample size needed to have at
> > >> least 80% power, with alpha = 0.05. The problem is that the
> > >> empirical proportions are really small: 0.00154 in one case and
> > >> 0.00234 in the other. These are the estimated failure proportions
> > >> of two medical treatments.
> > >> Thomas and Conlon (1992) suggested Fisher's exact test and
> > >> proposed a computational method, which according to their table
> > >> gives a sample size of roughly 20,000. Unfortunately I cannot find
> > >> any software applying their method.
> > >> - Does anyone know how to estimate the sample size for Fisher's
> > >> exact test using R?
> > >> - Even better, does anybody know of other, maybe optimal, methods
> > >> for such a situation (small p1 and p2) and the corresponding R
> > >> software?
> > >>
> > >> Thanks in advance,
> > >> Giulio
> >

David Winsemius, MD
West Hartford, CT

