Agreed with Jerry. Aside from Tachyon, seeing this for general debugging
would be very helpful.
Haoyuan, is the feature you are referring to related to
https://issues.apache.org/jira/browse/SPARK-975?
In the interim, I've found the "toDebugString()" method useful (but it
renders execution as a t
Jerry,
Great question. Spark and Tachyon capture lineage information at different
granularities. We are working on an integration between Spark and Tachyon for
this, and hope to have it ready for release soon.
Best,
Haoyuan
On Fri, Jan 2, 2015 at 12:24 PM, Jerry Lam wrote:
Hi Spark developers,
I was thinking it would be nice to extract the data lineage information
from a data processing pipeline. I assume that Spark/Tachyon keeps this
information somewhere. For instance, a data processing pipeline uses
datasource A and B to produce C. C is then used by another proce
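
As a rough illustration of the scenario described above (A and B produce C, and C feeds a later step), the lineage in question can be sketched as a small dependency graph. This is plain Python, not Spark or Tachyon API; the `LineageGraph` class, its methods, and the dataset name "D" are hypothetical and exist only for this example.

```python
from collections import defaultdict


class LineageGraph:
    """Hypothetical lineage store: maps each dataset to the inputs it was derived from."""

    def __init__(self):
        self._parents = defaultdict(set)

    def record(self, output, inputs):
        """Record that `output` was produced from `inputs`."""
        self._parents[output].update(inputs)

    def upstream(self, dataset):
        """Return every dataset that `dataset` depends on, directly or transitively."""
        seen = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            # .get() avoids creating empty entries for source datasets
            for parent in self._parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


graph = LineageGraph()
graph.record("C", ["A", "B"])  # A and B produce C
graph.record("D", ["C"])       # a later step consumes C ("D" is an assumed name)

print(sorted(graph.upstream("D")))  # ['A', 'B', 'C']
```

Walking the graph from the final output back to its sources is the kind of query a lineage extraction feature would need to answer, whatever granularity the underlying system records.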