Re: [R] Interval censored Data in survreg() with zero values!

Geraldine Henningsen Mon, 12 Jan 2009 08:50:35 -0800

Hello again,

I studied your suggestion, but I still disagree.  You wrote:


"From the way you wrote the problem I assumed 
that there is some number of n "looks" at the subject and then you count them 
up."

But this is not the case. My data are clearly continuous quantities, not 
discrete choices. I know nothing about the underlying choice process; the only 
thing I know is the final share of one of three regimes. Sorry for the poor 
description of the problem.
So I will stick with my censored data model. Still, the hint about the p-values 
is very helpful, because I actually ran into this problem. Thank you for 
that.

Best, Geraldine 



Terry Therneau wrote:
> Apologies -- you are being more subtle than I thought.  Nevertheless, I think 
> that the censoring language isn't quite right.
>
>   You are thinking of a hierarchical model:
>   
>     z ~ N(Xb, sigma), where Xb is the linear predictor, whatever covariates you
> think belong in the model.  Whether the distribution should be Gaussian or
> something else depends not on the overall distribution of z, but on the
> distribution of (z | Xb).  We could have a skewed predictor leading to skewed
> z, even if the distribution about any given expectation is symmetric.
>     
>     y = F(z) is what you observe.  The classic Tobit model is y = max(0, z),
> which does lead to censored data.
>     
>     In your case y_i ~ Binomial(n_i, p_i = H(z)).  Note that a binomial is k
> heads out of n tries with a coin of probability p; a "Bernoulli" is a binomial
> restricted to a single coin flip.  From the way you wrote the problem I assumed
> that there is some number of n "looks" at the subject and then you count them
> up.  Note that var(y) = n p (1-p).
>     
>     H describes how the probability changes with z.  In biology we very rarely
> use H(z) = max(min(z,1),0) because it gives a hard threshold, and the
> probability of nearly anything doesn't go all the way to zero or one.
>     
>     If H were as above and 
>       var(y) = constant and
>       n is sufficiently large so that Binomial dist is approx Gaussian and
>       var(y |p) << var(z| Xb)
>
> then your y will fit a censored Gaussian.  Since at least the second is
> false, it doesn't.
>
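The relationship var(y) = n p (1-p) under the hard-threshold H can be checked numerically. The following is a minimal sketch in Python (standard library only); the function names, n = 20, and the z values are assumptions chosen purely for illustration and do not come from the data under discussion.

```python
def hard_threshold(z):
    """H(z) = max(min(z, 1), 0): clip z to the interval [0, 1]."""
    return max(min(z, 1.0), 0.0)


def binomial_variance(n, p):
    """Variance of a Binomial(n, p) count: n * p * (1 - p)."""
    return n * p * (1.0 - p)


# Assumed values for illustration only.
n = 20
for z in (-0.5, 0.1, 0.5, 0.9, 1.5):
    p = hard_threshold(z)
    print(f"z = {z:5.2f}  p = {p:4.2f}  var(y) = {binomial_variance(n, p):5.2f}")
```

The variance is largest at p = 0.5 and falls to zero wherever H clips z to 0 or 1, so it changes with the linear predictor instead of staying constant.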
>    A censored model may still be an ok first cut at fitting the data, but I
> would be suspicious of variance estimates and particularly of any p-values.
> The bootstrap could help that.
>    
>       Terry T.
>        
>
>
>
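The bootstrap suggestion above can be sketched as a case-resampling bootstrap of a standard error. This is only an illustration in Python (standard library only): the simulated data, the names `ols_slope` and `bootstrap_se`, and the use of a plain least-squares slope as a stand-in for the censored-regression fit are all assumptions. In practice the estimator would be whatever refits the censored model on each resample.

```python
import random
import statistics


def ols_slope(pairs):
    """Least-squares slope of y on x; a stand-in for the real model fit."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return sxy / sxx


def bootstrap_se(pairs, estimator, n_boot=500, seed=1):
    """Resample whole (x, y) cases with replacement and return the
    standard deviation of the estimator across the resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(pairs, k=len(pairs))
        estimates.append(estimator(sample))
    return statistics.stdev(estimates)


# Simulated shares clipped to [0, 1]; all numbers are assumed for illustration.
rng = random.Random(0)
data = []
for _ in range(50):
    x = rng.uniform(0.0, 1.0)
    z = 0.2 + 0.6 * x + rng.gauss(0.0, 0.2)
    data.append((x, max(min(z, 1.0), 0.0)))

se = bootstrap_se(data, ols_slope)
print(f"slope = {ols_slope(data):.3f}, bootstrap SE = {se:.3f}")
```

Resampling whole cases keeps each covariate paired with its response, so the resulting standard error does not rely on the constant-variance assumption questioned above.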

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
