You are asking for a one sample test.  Using your own data:

connection <- textConnection("
GD2  1   8 12  GD2  3 -12 10  GD2  6 -52  7
GD2  7  28 10  GD2  8  44  6  GD2 10  14  8
GD2 12   3  8  GD2 14 -52  9  GD2 15  35 11
GD2 18   6 13  GD2 20  12  7  GD2 23  -7 13
GD2 24 -52  9  GD2 26 -52 12  GD2 28  36 13
GD2 31 -52  8  GD2 33   9 10  GD2 34 -11 16
GD2 36 -52  6  GD2 39  15 14  GD2 40  13 13
GD2 42  21 13  GD2 44 -24 16  GD2 46 -52 13
GD2 48  28  9  GD2  2  15  9  GD2  4 -44 10
GD2  5  -2 12  GD2  9   8  7  GD2 11  12  7
GD2 13 -52  7  GD2 16  21  7  GD2 17  19 11
GD2 19   6 16  GD2 21  10 16  GD2 22 -15  6
GD2 25   4 15  GD2 27  -9  9  GD2 29  27 10
GD2 30   1 17  GD2 32  12  8  GD2 35  20  8
GD2 37 -32  8  GD2 38  15  8  GD2 41   5 14
GD2 43  35 13  GD2 45  28  9  GD2 47   6 15
")

hsv <- data.frame(scan(connection, list(vac="", pat=0, wks=0, x=0)))
hsv <- transform(hsv, status= (wks >0), wks = abs(wks))

fit1 <- survreg(Surv(wks, status) ~ 1, data=hsv, dist='exponential')
temp <- predict(fit1, type='quantile', p=.5, se=TRUE)

c(median= temp$fit[1], std= temp$se[1])
  median    std
24.32723  4.36930

--
The predict function gives the predicted median survival and standard deviation for each observation in the data set. Since this was a mean only model all n of them are the same and I printed only the first. For prediction they make the assumption that the std error for my future study will be the same as the std from this one, you want the future 95% CI to not include the value of 16, so the future mean will need to be at least 16 + 1.96* 4.369.

A nonparmetric version of the argument would be

fit2 <- survfit(Surv(wks, status) ~ 1, data=hsv)
print(fit2)
records   n.max n.start  events  median 0.95LCL 0.95UCL
     48      48      48      31      21      15      35

Then make the argument that in our future study, the 95% CI will stretch 6 units to the left of the median, just like it did here. This argument is a bit more tenuous though. The exponential CI width depends on the total number of events and total follow-up time, and we can guess that the new study will be similar. The Kaplan-Meier CI also depends on the spacing of the deaths, which is less likely to replicate.

Notes:
1. Use summary(fit2)$table to extract the CI values. In R the print functions don't allow you to "grab" what was printed, summary normally does. 2. For the exponential we could work out the formula in closed form -- a good homework exercise for grad students perhaps but not an exciting way to spend my own afternoon. An advantage of the above approach is that we can easily use a more realistic model like the weibull. 3. I've never liked extracting out the "Surv(t,s)" part of a formula as a separate statement on another line. If I ever need to read this code again, or even just the printout from the run, keeping it all together gives much better documentation. 4. Future calculations for survival data, of any form, are always tenuous since they depend critically on the total number of events that will be in the future study. We can legislate the total enrollment and follow-up time for that future study, but the number of events is never better than a guess. Paraphrasing a motto found on the door of a well respected investigator I worked with 30 years ago (because I don't remember it exaclty):

"The incidence of the condition under consideration and its subsequent death rate will both drop by 1/2 at the commencement of a study, and will not return to their former values until the study finishes or the PI retires."


Terry T.

---------------------------------------------------------------------------

On 07/10/2014 05:00 AM, r-help-requ...@r-project.org wrote:
Hello All,

I'm trying to figure out how to perform a survival analysis with an historical 
control. I've spent some time looking online and in my boooks but haven't found 
much showing how to do this. Was wondering if there is a R package that can do 
it, or if there are resources somewhere that show the actual steps one takes, 
or if some knowledgeable person might be willing to share some code.

Here is a statement that describes the sort of analyis I'm being asked to do.

A one-sample parametric test assuming an exponential form of survival was used 
to test the hypothesis that the treatment produces a median PFS no greater than 
the historical control PFS of 16 weeks.  A sample median PFS greater than 20.57 
weeks would fall beyond the critical value associated with the null hypothesis, 
and would be considered statistically significant at alpha = .05, 1 tailed.

My understanding is that the cutoff of 20.57 weeks was obtained using an online 
calculator that can be found at:

http://www.swogstat.org/stat/public/one_survival.htm

Thus far, I've been unable to determine what values were plugged into the 
calculator to get the cutoff.

There's another calculator for a nonparamertric test that can be found at:

http://www.swogstat.org/stat/public/one_nonparametric_survival.htm

It would be nice to try doing this using both a parameteric and a 
non-parametric model.

So my first question would be whether the approach outlined above is valid or 
if the analysis should be done some other way. If the basic idea is correct, is 
it relatively easy (for a Terry Therneau type genius) to implement the whole 
thing using R? The calculator is a great tool, but, if reasonable, it would be 
nice to be able to look at some code to see how the numbers actually get 
produced.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to