Hi Julia folks,

Trying to get a feel for Julia, I decided to write a basic Cox proportional 
hazards model. While I was implementing the routines to calculate the 
gradient and Hessian of the model's log-likelihood, I realized that I was 
making a ton of array "type errors" (dimension errors) that weren't being 
caught by Julia's type system because the array operations that I was using 
were too overloaded. Here are some examples of what I mean:

- For a 2d array X, trying to get a row with X[i] instead of X[i,:]. This 
returns a scalar, but if you try to add it to another row vector you'll 
silently get a different row vector than you expected instead of a failure.
- Reversing the order of y * transpose(y) (for y an array) to get the 
scalar product instead of the outer product (similar silent failure as 
above).
- Doing y .* z when one side is a row vector and the other side is a column 
vector, and forgetting to transpose them, causing an accidental outer 
product (via broadcasting) instead of elementwise product. This one is 
harder to get a silent failure with but I'm pretty sure I managed somehow.

I caught them all in testing (I think) and am fairly satisfied my code does 
the right thing now, but I'd love to know if there are conventions or tools 
I can use to limit these errors or catch them earlier. I'm sure I'm missing 
a ton of stuff because I'm a Julia novice (as well as a numerics novice in 
general). Does anyone have any pointers? If it helps, the code I wrote is 
here 
<https://github.com/benkuhn/Survival.jl/blob/master/survival/src/coxph.jl#L60>
. (It's correct now, or at least agrees with R on a small but nontrivial 
model; the link is to the function that computes the Hessian of the 
log-likelihood, which was unsurprisingly the most error-prone part.)

Thanks!
Ben

Reply via email to