On a Leopard Mac with the Urbanek-compiled 64-bit R, one sees this:
> library(rpart)
> library(survival)
Loading required package: splines
> fit<-rpart(Surv(N,Y,type="interval2")~Salt+pH+Temp, data=myD)
*** caught segfault ***
address 0x0, cause 'memory not mapped'
Traceback:
1: .C(C_rpartexp2, as.integer(length(dtimes)), as.double(dtimes), as.double(.Machine$double.eps), keep = integer(length(dtimes)))
2: (get(paste("rpart", method, sep = ".")))(Y, offset, , wt)
3: rpart(Surv(N, Y, type = "interval2") ~ Salt + pH + Temp, data = myD)
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Choosing "4" does save the workspace.
--
David Winsemius
On Jan 9, 2009, at 9:04 AM, Keith Jewell wrote:
Hi Everyone,
This example code results in R 'crashing'; that is, the R application
closes with no warnings or error messages.
#-----------------------
myD <- read.table(stdin(), header=TRUE, nrows=20)
Broth Salt pH Temp N Y Growth
1 310 9.0 2.92 10 90.0 NA 0
2 615 6.0 7.82 30 1.0 2 1
3 217 2.0 7.34 10 7.0 8 1
4 338 10.0 4.44 10 90.0 NA 0
5 240 4.0 7.33 10 20.0 21 1
6 336 10.0 3.90 10 90.0 NA 0
7 279 7.0 6.73 10 90.0 NA 0
8 1021 9.0 5.03 45 8.0 9 1
9 974 7.0 4.01 45 90.0 NA 0
10 265 7.0 2.93 10 90.0 NA 0
11 934 4.0 5.28 45 0.1 1 1
12 669 9.0 5.03 30 90.0 NA 0
13 875 10.0 6.24 37 1.0 2 1
14 385 2.0 5.84 20 1.0 2 1
15 562 2.0 5.84 30 0.1 1 1
16 718 0.5 5.54 37 0.1 1 1
17 845 9.0 5.03 37 3.0 6 1
18 913 2.0 5.84 45 0.1 1 1
19 577 4.0 4.10 30 90.0 NA 0
20 20 0.5 7.44 8 24.0 27 1
library(rpart)
library(survival)
fit<-rpart(Surv(N,Y,type="interval2")~Salt+pH+Temp, data=myD)
#---------------------
Professor Ripley helpfully pointed out that the documentation does not
say that interval censoring is supported, and indeed this seems only to
happen with interval censored data.
?rpart indicates that the dependent variable may be a survival object.
Neither ?rpart nor "An Introduction to Recursive Partitioning Using the
RPART Routines" (Therneau et al., 1997) suggests that the dependent
variable may contain interval censored data, but neither do they suggest
it shouldn't; i.e. as far as I'm aware (!) this restriction is not
documented.
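Until the restriction is documented or enforced, one defensive option is
to inspect the response before it reaches rpart. The following is only a
sketch: it relies on Surv objects carrying a "type" attribute, and the
helper name is_right_censored is hypothetical, not part of rpart or
survival. Treating "right" as the only safe type is an assumption based
on Professor Ripley's remark, not on anything the documentation states.

```r
# Hypothetical helper: TRUE only when the response is a right-censored
# Surv object, judged by its "type" attribute.
is_right_censored <- function(y) identical(attr(y, "type"), "right")

# Demonstration, run only if the survival package is installed.
if (requireNamespace("survival", quietly = TRUE)) {
  library(survival)
  y_int   <- Surv(c(1, 7), c(2, 8), type = "interval2")
  y_right <- Surv(c(1, 7), c(1, 0))
  print(is_right_censored(y_int))    # interval censored response
  print(is_right_censored(y_right))  # right censored response
}
```

A call such as stopifnot(is_right_censored(y)) placed before rpart()
would then stop with an ordinary R error rather than reaching the
compiled code.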
This post has three purposes:
1) Bring this behaviour - especially the crash in response to 'bad'
   data - to the attention of the authors.
2) Seek an explanation of the restriction (if intentional). In my
   simplicity, it seems that interval censored data should be easier to
   handle than left or right censored - after all the information
   content is greater.
3) Seek guidance on how to work around the problem. I'm minded to
   replace the interval censored data by the mid points of the
   intervals. Does anyone have any comments on such an approach?
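For concreteness, the midpoint workaround in point 3 might look like the
sketch below. It assumes that N and Y are the lower and upper bounds of
the interval, that Y = NA marks an observation right censored at N, and
that the new column names time and status are free to use; none of this
is confirmed by the rpart documentation.

```r
# Same data as above, without the row numbers.
myD <- read.table(header = TRUE, text = "
Broth Salt pH Temp N Y Growth
310 9.0 2.92 10 90.0 NA 0
615 6.0 7.82 30 1.0 2 1
217 2.0 7.34 10 7.0 8 1
338 10.0 4.44 10 90.0 NA 0
240 4.0 7.33 10 20.0 21 1
336 10.0 3.90 10 90.0 NA 0
279 7.0 6.73 10 90.0 NA 0
1021 9.0 5.03 45 8.0 9 1
974 7.0 4.01 45 90.0 NA 0
265 7.0 2.93 10 90.0 NA 0
934 4.0 5.28 45 0.1 1 1
669 9.0 5.03 30 90.0 NA 0
875 10.0 6.24 37 1.0 2 1
385 2.0 5.84 20 1.0 2 1
562 2.0 5.84 30 0.1 1 1
718 0.5 5.54 37 0.1 1 1
845 9.0 5.03 37 3.0 6 1
913 2.0 5.84 45 0.1 1 1
577 4.0 4.10 30 90.0 NA 0
20 0.5 7.44 8 24.0 27 1
")

# Rows with a known upper bound Y are interval censored: use the midpoint.
# Rows with Y = NA are right censored at N: keep N and flag as censored.
interval   <- !is.na(myD$Y)
myD$time   <- ifelse(interval, (myD$N + myD$Y) / 2, myD$N)
myD$status <- as.integer(interval)

# Fit with an ordinary right-censored response, if the packages are present.
if (requireNamespace("survival", quietly = TRUE) &&
    requireNamespace("rpart", quietly = TRUE)) {
  library(survival)
  library(rpart)
  fit <- rpart(Surv(time, status) ~ Salt + pH + Temp, data = myD)
  print(fit)
}
```

Note that the midpoint treats each event time as exactly known, so the
width of the interval is discarded; whether that matters here is part of
the question.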
Any comments gratefully received.
Keith Jewell
==========================================
Version:
platform = i386-pc-mingw32
arch = i386
os = mingw32
system = i386, mingw32
status = Patched
major = 2
minor = 8.1
year = 2009
month = 01
day = 07
svn rev = 47502
language = R
version.string = R version 2.8.1 Patched (2009-01-07 r47502)
Windows Server 2003 x64 (build 3790) Service Pack 2
Locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252
Search Path:
.GlobalEnv, package:stats, package:graphics, package:grDevices,
package:utils, package:datasets, package:methods, Autoloads,
package:base
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.