Hi all,
I am working on a matrix multiplication operation for the Mahout Flink Bindings
that chains quite a few Flink DataSet operations.
When testing, I get the following error:
{...}
04/09/2016 22:30:35 CHAIN Reduce (Reduce at
org.apache.mahout.flinkbindings.blas.FlinkOpABt$.abt_nograph(FlinkOpABt.scala:147))
-> FlatMap (FlatMap at
org.apache.mahout.flinkbindings.drm.BlockifiedFlinkDrm.asRowWise(FlinkDrm.scala:93))(1/1)
switched to CANCELED
04/09/2016 22:30:35 CHAIN Partition -> Map (Map at
org.apache.mahout.flinkbindings.blas.FlinkOpABt$.pairwiseApply(FlinkOpABt.scala:240))
-> GroupCombine (GroupCombine at
org.apache.mahout.flinkbindings.blas.FlinkOpABt$.abt_nograph(FlinkOpABt.scala:129))
-> Combine (Reduce at
org.apache.mahout.flinkbindings.blas.FlinkOpABt$.abt_nograph(FlinkOpABt.scala:147))(3/3)
switched to FAILED
java.lang.StackOverflowError
at
com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:48)
at
com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
at
com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
at
com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
at
com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
at
com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
at
com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
at
com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
{...}
I've seen similar issues on the dev@flink list (and other places), but I
believe that they were from recursive calls and objects which pointed back to
themselves somehow.
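For reference, here is a minimal sketch of the general failure mode as I understand it. It is only an analogy: it uses plain java.io serialization rather than Kryo or Flink, and the names (DeepChainDemo, Node, chain, overflows) are made up for illustration. It shows a serializer that recurses field by field overflowing the stack on a long chain of references. The chain here is acyclic, so no object points back to itself. I am not claiming this is what happens in FlinkOpABt.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of Mahout, Flink, or Kryo.
final class DeepChainDemo {

    // A node that only references the next node: a long, acyclic chain.
    static final class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final Node next;

        Node(Node next) {
            this.next = next;
        }
    }

    // Builds the chain iteratively, so construction itself never recurses.
    static Node chain(int length) {
        Node head = null;
        for (int i = 0; i < length; i++) {
            head = new Node(head);
        }
        return head;
    }

    // Returns true if serializing a chain of the given length overflows the stack.
    static boolean overflows(int length) {
        Node head = chain(length);
        try {
            ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream());
            // Default serialization descends into `next` recursively,
            // so the call depth grows with the length of the chain.
            out.writeObject(head);
            out.close();
            return false;
        } catch (StackOverflowError e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("length 10 overflows: " + overflows(10));
        System.out.println("length 1000000 overflows: " + overflows(1_000_000));
    }
}
```

With a default-sized thread stack, the short chain serializes normally and the very long one fails, although neither contains a cycle.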
This is a relatively straightforward method; it just has several Flink
operations before execution is triggered. If I remove some operations, e.g. a
reduce, I can get the method to complete on a simple test; however, the result
will then, of course, be numerically incorrect.
I am wondering if there is any workaround for this type of problem?
Thank you,
Andy