This kind of thing is a balancing act. All of these behaviors are really useful – or at least I find myself using them all the time. We tried for a while to disallow `scalar + vector` and require doing `scalar .+ vector` instead, but after a few months, everyone agreed that it was really annoying so we changed it back.
I find that one of the most effective ways to make sure your code actually does what you think it does is to develop it interactively in the REPL and only once you've got a procedure working with intermediate values that make sense, worrying about wrapping it all up into a function. All of these errors would be obvious if you tried them interactively since the result of some step would have the wrong shape. This approach is kind of the inverse of writing your code and then walking through it in a debugger. It feels a bit odd if you're used to static languages, but it's really very effective and once you get used to it, not being able to work that way feels somewhat stifling. On Fri, Jan 23, 2015 at 5:37 PM, Ben Kuhn <[email protected]> wrote: > Hi Julia folks, > > Trying to get a feel for Julia, I decided to write a basic Cox > proportional hazards model. While I was implementing the routines to > calculate the gradient and Hessian of the model's log-likelihood, I > realized that I was making a ton of array "type errors" (dimension errors) > that weren't being caught by Julia's type system because the array > operations that I was using were too overloaded. Here are some examples of > what I mean: > > - For a 2d array X, trying to get a row with X[i] instead of X[i,:]. This > returns a scalar, but if you try to add it to another row vector you'll > silently get a different row vector than you expected instead of a failure. > - Reversing the order of y * transpose(y) (for y an array) to get the > scalar product instead of the outer product (similar silent failure as > above). > - Doing y .* z when one side is a row vector and the other side is a > column vector, and forgetting to transpose them, causing an accidental > outer product (via broadcasting) instead of elementwise product. This one > is harder to get a silent failure with but I'm pretty sure I managed > somehow. > > I caught them all in testing (I think) and am fairly satisfied my code > does the right thing now, but I'd love to know if there are conventions or > tools I can use to limit these errors or catch them earlier. I'm sure I'm > missing a ton of stuff because I'm a Julia novice (as well as a numerics > novice in general). Does anyone have any pointers? If it helps, the code I > wrote is here > <https://github.com/benkuhn/Survival.jl/blob/master/survival/src/coxph.jl#L60> > . (It's correct now, or at least agrees with R on a small but nontrivial > model; the link is to the function that computes the Hessian of the > log-likelihood, which was unsurprisingly the most error-prone part.) > > Thanks! > Ben >
