I've taken another look at survexp with an eye to efficiency. As it stands, the code holds 3-4 (I think it's 4) active copies of the X matrix at one point; this is likely the reason it takes so much memory when you have a large data set. Some of this is history: key parts of the code were written long before I understood all the "tricks" for smaller memory use in S (Splus or R), and one copy is due to the loss of the COPY= argument when going from Splus to R.
I can see how to redo it and reduce this to a single copy, but that involves 3 R functions and 3 C routines. I'll add it to my list, but don't expect quick results, as there is a long list in front of it. It's been a good summer, but as one of my colleagues put it, "No vacation goes unpunished."

As a mid-term suggestion, I would use a subsample of your data. With data set sizes like the ones you describe, a 20% subsample will give all the precision that you need. Specifically (a sketch of these steps follows below):

1. Save the results of your current Cox model; call it fit1.
2. Select a subset of the data.
3. Fit a new Cox model on the subset, with the options iter=0, init=fit1$coef. This ensures that the subset fit has exactly the same coefficients as the original.
4. Use survexp on the subset fit.

Terry Therneau
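For concreteness, here is a minimal sketch of those four steps. The data frame mydata, the variables time, status, age, and sex, and the 20% sampling fraction are placeholders, not from the original post; iter.max = 0 (set via coxph.control) is what I take "iter=0" above to be shorthand for.

    library(survival)

    ## 1. Fit and save the Cox model on the full data (hypothetical formula)
    fit1 <- coxph(Surv(time, status) ~ age + sex, data = mydata)

    ## 2. Select a 20% subsample of the rows
    set.seed(1)
    sub <- mydata[sample(nrow(mydata), size = round(0.2 * nrow(mydata))), ]

    ## 3. Refit on the subset with the coefficients held fixed:
    ##    init supplies fit1's coefficients and iter.max = 0 suppresses any
    ##    further iteration, so fit2 keeps exactly those coefficients.
    fit2 <- coxph(Surv(time, status) ~ age + sex, data = sub,
                  init = fit1$coef,
                  control = coxph.control(iter.max = 0))

    ## 4. Expected survival based on the subset fit (a coxph fit may be
    ##    supplied as the ratetable argument of survexp)
    esurv <- survexp(~ 1, data = sub, ratetable = fit2)

With zero iterations the model is simply evaluated at the supplied coefficients, so predictions from fit2 are driven by the full-data fit while survexp only has to process the subset.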