As a template for creating a broadcast variable, the following code snippet within mllib was used:

    val bcIdf = dataset.context.broadcast(idf)
    dataset.mapPartitions { iter =>
      val thisIdf = bcIdf.value

The new code follows that model:

    import org.apache.spark.mllib.linalg.{Vector => MVector}
    ..
    assert(crows.isInstanceOf[Array[MVector]])
    val bcRows = sc.broadcast(crows)
    val GU = mat.rows.zipWithIndex.mapPartitions { case dataIter =>
      val arrayVect = bcRows.value  // bcRows.value is seen in the debugger to be of type Array[Byte] .. ??

That last line fails at runtime:

    java.lang.ClassCastException: [B cannot be cast to [Lorg.apache.spark.mllib.linalg.Vector;

So the compiler knows that the broadcast's "value" method should return an array of vectors, which is what I expect. At runtime, however, the actual value is an Array[Byte].

Any insights on this?
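
For reference, here is a minimal, Spark-free sketch of how the declared type and the runtime type can disagree like this. It assumes nothing about Spark internals: `Holder` and `ErasureDemo` are hypothetical names and `Holder` is not Spark's `Broadcast` class. It only shows that, because type parameters are erased on the JVM, a generic `value` method can hand back a byte array without complaint, and the `ClassCastException` surfaces at the call site where the result is used as the declared type. It does not explain where the bytes come from in the Spark case.

```scala
// Hypothetical stand-in for a generic handle; NOT Spark's Broadcast class.
final class Holder[T](private val raw: Any) {
  // T is erased at runtime, so this cast is unchecked and never fails here.
  def value: T = raw.asInstanceOf[T]
}

object ErasureDemo {
  // Returns true if reading the value as the declared type throws.
  def castFailsAtUseSite(): Boolean = {
    val bytes: Any = Array[Byte](1, 2, 3)
    // The compiler believes this holder contains an Array[String].
    val holder = new Holder[Array[String]](bytes)
    try {
      // The real runtime check happens here, at the use site.
      val arr: Array[String] = holder.value
      println(arr.length)
      false
    } catch {
      case _: ClassCastException => true
    }
  }

  def main(args: Array[String]): Unit =
    println(s"ClassCastException at use site: ${castFailsAtUseSite()}")
}
```

If this is the same mechanism, the compiler's view of the type of `bcRows.value` says nothing about what the broadcast actually holds on the executor.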