Due to SPARK-2245, you cannot use count to materialize a VertexRDD. That
actually materializes the PartitionRDD, so checkpointing the VertexRDD won't
work. I'm trying to fix that right now.
--
Do not call collect, as that will perform materialization as well as
transfer the data to the driver (which might actually cause the driver to
fail if the data is huge). You do have to materialize the RDD in some way
(save, count, and collect all do), so prefer count or save.
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@may
Calling checkpoint() alone doesn't cut the lineage. It only marks the
RDD to be checkpointed. The lineage is cut after the first time
this RDD is materialized. You see the StackOverflow because the lineage is
still there. -Xiangrui
On Sun, Jun 22, 2014 at 6:37 PM, dash wrote:
> Hi Xiangrui,
>
> Ac
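The point above, that checkpoint() only marks the RDD and the lineage is cut
on the first materialization, can be sketched roughly as follows. This is a
minimal illustration on a plain RDD, not code from the thread: the object
name, the local master, the checkpoint directory, and the interval of 20
iterations are all assumptions chosen for the example. Per SPARK-2245
mentioned earlier, count may not have the same effect on a GraphX VertexRDD.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; names and constants are illustrative only.
object CheckpointSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("checkpoint-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Assumed directory; on a cluster this should be reliable storage such as HDFS.
    sc.setCheckpointDir("/tmp/spark-checkpoints")

    var rdd = sc.parallelize(1 to 1000).map(_.toLong)

    // Iterative update: every map adds one more step to the lineage.
    for (i <- 1 to 200) {
      rdd = rdd.map(_ + 1)
      if (i % 20 == 0) {
        rdd.cache()      // keep the data so the checkpoint job need not recompute it
        rdd.checkpoint() // only marks the RDD; nothing is written yet
        rdd.count()      // action: materializes the RDD, after which the lineage is truncated
      }
    }

    println(s"checkpointed: ${rdd.isCheckpointed}")
    println(rdd.toDebugString) // shows the (shortened) lineage
    sc.stop()
  }
}
```

count is used here rather than collect because it materializes the RDD
without shipping the data back to the driver.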
Hi Xiangrui,
As far as I know, calling count materializes the RDD. Does collect do the
same thing, since it is also an action? I cannot call count because, for a
Graph object, count does not materialize the RDD. I have already filed an
issue on that.
My question is, why is there still a stac
After checkpoint(), please call count(). This is similar to cache():
checkpoint() only marks the RDD to be checkpointed. -Xiangrui
On Sun, Jun 22, 2014 at 3:14 PM, dash wrote:
> Hi,
>
> I'm doing iterative computing now, and due to lineage chain, we need to
> checkpoint the RDD in order to