The Julia pathtracer now renders the test scene in 11s! (All the updated code is in the repository.)
In the end, Ivar Nesje's suggestion led me to remove @devec and manually devectorize the 4 places that had complicated vector maths. The time immediately dropped from 27s to 11s, so the Julia code is now the same speed as the Python code. Now to beat it! :) Thanks to everyone who helped out. mike.
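
For anyone wondering what "manually devectorize" means in practice, here is a minimal sketch. It is not the actual pathtracer code: `reflect_vec` and `reflect!` are made-up names for illustration, and a reflection formula is only a guess at the kind of vector maths involved. The idea is to replace an array expression, which allocates temporaries, with explicit loops that write into a preallocated output.

```julia
# Hypothetical example, not taken from the repository.

# Vectorized form: each array operation allocates a temporary array.
reflect_vec(d, n) = d - 2.0 * sum(d .* n) * n

# Manually devectorized form: explicit loops, no temporaries.
# Assumes out, d and n all have the same length.
function reflect!(out::Vector{Float64}, d::Vector{Float64}, n::Vector{Float64})
    s = 0.0
    @inbounds for i in eachindex(d)
        s += d[i] * n[i]            # dot product accumulated as a scalar
    end
    @inbounds for i in eachindex(d)
        out[i] = d[i] - 2.0 * s * n[i]
    end
    return out
end

d = [1.0, -1.0, 0.0]
n = [0.0, 1.0, 0.0]
out = zeros(3)
reflect!(out, d, n)
```

Both forms should give the same result; the loop version just avoids allocating inside the hot path, which is probably where most of the time goes in an inner rendering loop.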