Actually, HY emailed me offline about this, and lineage is supported in the latest version of Tachyon. Pushing this into storage is a hard problem, though; we need to think about how to handle isolation, resource allocation, etc.
https://github.com/amplab/tachyon/blob/master/core/src/main/java/tachyon/master/Dependency.java

On Thu, Dec 11, 2014 at 3:54 PM, Reynold Xin <r...@databricks.com> wrote:

> I don't think the lineage thing is even turned on in Tachyon - it was
> mostly a research prototype, so I don't think it'd make sense for us to use
> that.
>
> On Thu, Dec 11, 2014 at 3:51 PM, Andrew Ash <and...@andrewash.com> wrote:
>
>> I'm interested in understanding this as well. One of the main ways Tachyon
>> is supposed to realize performance gains without sacrificing durability is
>> by storing the lineage of data rather than full copies of it (similar to
>> Spark). But if Spark isn't sending lineage information into Tachyon, then
>> I'm not sure how this isn't a durability concern.
>>
>> On Wed, Dec 10, 2014 at 5:47 AM, Jun Feng Liu <liuj...@cn.ibm.com> wrote:
>>
>> > Does Spark today really leverage Tachyon lineage to process data? It seems
>> > like the application should call the createDependency function in TachyonFS
>> > to create a new lineage node. But I did not find any place that calls it in
>> > the Spark code. Did I miss anything?
>> >
>> > Best Regards
>> >
>> > Jun Feng Liu
>> > IBM China Systems & Technology Laboratory in Beijing