Re: [R] val.surv

David Winsemius Sun, 21 Aug 2011 09:37:38 -0700


On Aug 21, 2011, at 7:03 AM, Salvo Mac wrote:

 I've   attached a sample of the data sets, this is my full code.

#R 2.13
#library(survival)
#library(Hmisc)
#library(splines)
library(rms)
train<-as.data.frame( train<-read.csv("G:\\train.txt", header=T,sep="\t"))test<-as.data.frame( test<-read.table("G:\\test.txt", header=T,sep="\t"))
f.1<-cph(Surv(time,event)~age, x=T, y=T,surv=T, data=train)
val.surv(f.1, newdata=test, u=10)


#plot(calibrate(f.1, u=30, B=20))


Salvo;

I'm not sure where the error is coming from (although I will share myspeculation below). I modified your code somewhat. I was under theimpression that read.csv had sep="," hardcoded and the read.csv( ...seo="\t") would throw and error. The as.data.fame around either of theread.* statements is completely unnecessary:


Input code:
train<-read.table("~/train.txt", header=T, sep="\t")
 test<-read.table("~/test.txt", header=T, sep="\t")

I thought the newdata=test argument should succeed, but it does throwthe error that you report:


 f.1<-cph(Surv(time,event)~age, x=T, y=T,surv=T, data=train)
 val.surv(f.1, newdata=test, u=10)
Error in val.surv(f.1, newdata = test, u = 10) :
  dims [product 210] do not match the length of object [314]
In addition: Warning message:
In est.surv + S[, 1] :
  longer object length is not a multiple of shorter object length

I tried sampling from test with replacement to generate a dataframe ofequal extent and with that object I do not get the same errors:


> extend <- test[sample(1:nrow(test), 314, replace=TRUE), ]
>  f.1<-cph(Surv(time,event)~age, x=T, y=T,surv=T, data=train)
> val.surv(f.1, newdata=extend, u=10)

Validation of Predicted Survival at Time= 10    n= 314 , events= 64

hare fit:

dim A/D   loglik       AIC        penalty
                                min    max
  1 Add   -485.24    976.22  271.33     Inf
  2 Del   -349.57    710.64    5.67  271.33
  3 Del   -346.74    710.72    0.96    5.67
  4 Del   -346.26    715.51    0.01    0.96
  5 Add   -346.25    721.25    0.00    0.01

the present optimal number of dimensions is 2.
penalty(AIC) was 5.75, the default (BIC), would have been 5.75.

  dim1           dim2           beta        SE         Wald
Constant                             -10       0.55  -18.14
Time        43                      0.15      0.015   10.30

Function used to transform predictions:
function (p)  log(-log(p))

Mean absolute error in predicted probabilities: 0.0331
0.9 Quantile of absolute errors               : 0.0613

I have also tried looking at the code and addingoptions(error=utils::recover) to see if I can identify the point wherethe length mismatch is being generated, (but I am NOT an acedebugger). I can see that est.surv is created. I can also get thepredict(fit, newdata, type = "lp") call to run with train and givesensible numbers. You did not create (or at least say so) a datadist.I tried that when I saw that "lim" was dependent on datadist limits.


ddT <- datadist(train); options(datadist="ddT")

Seeing the usehare was set to TRUE; I submitted the code section thatwas intended for that situation to a browser session and the the firstline throws the error that I see at the console:


Enter a frame number, or 0 to exit

1: val.surv(f.1, newdata = test, u = 10)

Selection: 1
Called from: top level
Browse[1]> i <- !is.na(est.surv + S[, 1] + S[, 2])

Error during wrapup: dims [product 210] do not match the length ofobject [314]

As I read this that line is supposed to create an index of valid rowsin newdata and the fitted values but is failing in the situation wherethe nrows of the newdata does not match that of the fit.

At this point one option might be trying to generate a structure thatmatches the val.surv output by going through the code and building itup bit by bit:


 w <- structure(list(harefit = f, p = est.surv, actual = actual,
            pseq = pseq, actualseq = actualseq, u = u, fun = fun,

n = nrow(S), d = sum(S[, 2]), units = units), class ="val.survh")

        return(w)

But a much better option would be to report the error to FrankHarrell. I'm copying him since I think your .txt files probablyreached the list as well as my mailbox.


--
David.

________________________________
From: David Winsemius <dwinsem...@comcast.net>
To: Salvo Mac <salvo_...@yahoo.com>
Cc: "r-help@R-project.org" <r-help@r-project.org>
Sent: Sunday, August 21, 2011 5:29 AM
Subject: Re: [R] val.surv


On Aug 20, 2011, at 10:25 PM, Salvo Mac wrote:
The test and train are like split data sets, contain similarvariables but from different countries so the two sets are somehowindependent. And yes it is a data frame.
What is a data.frame? test and train may be dataframes, but test[,"age"] is not a dataframe.
So I extracted age, time and event.
Code? The code you offered before would have created a newdataobject (yes, a data.frame) with a single column bearing the samename as the vector argument ,,,,, "test1". Not named "age". Try it.do str on such an object:
str(dataframe(test1))
So test is data frame,(age, time, event). does that suffice?
It certainly does not allow me to reproduce the error you got (whichI still think is probably related to the structure of your argumentto newdata.)
That's all I can say without data and code.

--David.
________________________________
From: David Winsemius <dwinsem...@comcast.net>

Cc: "r-help@R-project.org" <r-help@r-project.org>
Sent: Sunday, August 21, 2011 3:19 AM
Subject: Re: [R] val.surv


On Aug 20, 2011, at 8:08 PM, Salvo Mac wrote:
Thanks David
However, I tried your trick on val.surv with newdata=test['age']but still didn't work.
Still gives the same error message:

Error in val.surv(f.1, newdata = test1["age"], u = 10) :
    dims [product 1797] do not match the length of object [2496]
In addition: Warning message:
In est.surv + S[, 1] :
    longer object length is not a multiple of shorter object length
As I said (and you did
not act  upon):
The fundamental thing you are doing wrong for q1 is failing tounambiguously describe the test object.
I said it was a guess. Now stop wasting our time and offer what isneeded.
--david.
Salvo
________________________________
From: David Winsemius <dwinsem...@comcast.net>

Cc: "r-help@R-project.org" <r-help@r-project.org>
Sent: Sunday, August 21, 2011 12:55 AM
Subject: Re: [R] val.surv


On Aug 20, 2011, at 3:32 PM, Salvo Mac wrote:
    Dear R-users,

I  have two questions regarding
validation and calibration of Survival regression models.
1. I am trying to calibrate and validate a cox model usingval.surv.
here is my code:
    f.1<-cph(Surv(time,event)~age, x=T, y=T, data=train)
    test1<-test[,"age"]
    val.surv(f.1, newdata=data.frame(test1), u=10)

    but I get an error message:

    Error in val.surv(f.1, newdata = data.frame(testi), u = 10) :
     dims [product 1797] do not match the length of object [2496]
    In addition: Warning message:
In est.surv + S[, 1] :
longer object length is not a multiple of shorter objectlength
I ran the example in the r-documentation but couldn'textract dxy from result.
    What am I doing wrong?
The fundamental thing you are doing wrong for q1 is failing tounambiguously describe the test object. I would think that if testwere a dataframe then wrapping data.frame around a vector mightnot get it named correctly as 'age'. You might try newdata=test['age']. Just a guess.
2. In validate and calibrate cph functions. If it is frailtyfit, does the the bootstrap resample clusters or just individuals
The code above appears to be dependent on the rms package. Thefrailty function is part of the underlying survival package and Ido not see it mentioned in the index for Harrell's RMS text. Youwill probably need to wait until Frank comes across this. He isgenerally very good about correction my errors and knowledge gaps.
David Winsemius, MD
West Hartford,CT<train.txt><test.txt>______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] val.surv

Reply via email to