Re: Get full RDD lineage for a spark job

2017-07-23 Thread Ron Gonzalez
Cool thanks. Will give that a try...
--Ron
On Friday, July 21, 2017 8:09 PM, Keith Chapman wrote:
You could also enable it with --conf spark.logLineage=true if you do not want to change any code.
Regards, Keith.
http://keith-chapman.com
On Fri, Jul 21, 2017 at 7:57 PM, Keith Chapman

Re: Get full RDD lineage for a spark job

2017-07-21 Thread Keith Chapman
You could also enable it with --conf spark.logLineage=true if you do not want to change any code.
Regards, Keith.
http://keith-chapman.com
On Fri, Jul 21, 2017 at 7:57 PM, Keith Chapman wrote:
> Hi Ron,
>
> You can try using the toDebugString method on the RDD, this will print
> the RDD lineag
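For illustration, the same property can also be set programmatically. This is a minimal sketch, not the approach suggested in the message (which passes the flag on the command line so no code changes are needed). The object name, the `local[2]` master, and the sample job are assumptions made up for the example. With `spark.logLineage` set to `true`, Spark should log the lineage of the RDD at INFO level when an action submits a job.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical demo object; names and the local master are illustrative only.
object LogLineageDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[2]")            // assumption: run locally for the demo
      .setAppName("log-lineage-demo")
      .set("spark.logLineage", "true")  // same property as --conf spark.logLineage=true
    val sc = new SparkContext(conf)
    try {
      // count() is an action, so it submits a job; the lineage of the
      // filtered RDD is then written to the driver log (INFO level).
      sc.parallelize(1 to 10).map(_ * 2).filter(_ % 3 == 0).count()
    } finally {
      sc.stop()
    }
  }
}
```

The lineage only shows up if the driver's log level includes INFO.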

Re: Get full RDD lineage for a spark job

2017-07-21 Thread Keith Chapman
Hi Ron,
You can try using the toDebugString method on the RDD, this will print the RDD lineage.
Regards, Keith.
http://keith-chapman.com
On Fri, Jul 21, 2017 at 11:24 AM, Ron Gonzalez wrote:
> Hi,
> Can someone point me to a test case or share sample code that is able to
> extract the RDD g
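A minimal sketch of the toDebugString suggestion, assuming a local Spark context. The object name, the `lineage` helper, and the word-count pipeline are hypothetical and only there to give the RDD some ancestry. Note that `toDebugString` returns the lineage as a `String` rather than printing it, and it does not run a job.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical demo object; not taken from the thread.
object LineageDemo {
  // Builds a small pipeline and returns its lineage description.
  def lineage(sc: SparkContext): String = {
    val counts = sc.parallelize(Seq("a", "b", "a"))
      .map(word => (word, 1))
      .reduceByKey(_ + _) // introduces a shuffle, so the lineage has two stages
    counts.toDebugString   // describes the RDD and its ancestors; no job is executed
  }

  def main(args: Array[String]): Unit = {
    // assumption: local master and app name chosen for the demo
    val conf = new SparkConf().setMaster("local[2]").setAppName("lineage-demo")
    val sc = new SparkContext(conf)
    try println(lineage(sc))
    finally sc.stop()
  }
}
```

The returned text is an indented tree, with the final RDD on the first line and its parents beneath it.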

Get full RDD lineage for a spark job

2017-07-21 Thread Ron Gonzalez
Hi, Can someone point me to a test case or share sample code that is able to extract the RDD graph from a Spark job anywhere during its lifecycle? I understand that Spark has a UI that can show the graph of the execution, so I'm hoping that it is using some API somewhere that I could use. I know RDD
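One possible way to get at the graph as data rather than text is to walk the public `dependencies` member of each RDD. The sketch below is an assumption about what might fit the question, not something proposed in the thread and not necessarily what the Spark UI does internally. `LineageWalker` and `walk` are hypothetical names.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical helper; not part of Spark.
object LineageWalker {
  // Returns (depth, "ClassName[id]") for the RDD and each ancestor, depth-first.
  // A parent shared by several children is listed once per path that reaches it.
  def walk(rdd: RDD[_], depth: Int = 0): Seq[(Int, String)] = {
    val here = (depth, s"${rdd.getClass.getSimpleName}[${rdd.id}]")
    here +: rdd.dependencies.flatMap(dep => walk(dep.rdd, depth + 1))
  }

  def main(args: Array[String]): Unit = {
    // assumption: local master and app name chosen for the demo
    val conf = new SparkConf().setMaster("local[2]").setAppName("lineage-walker")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(1 to 4).map(_ + 1)
      // Indent each entry by its depth in the lineage.
      walk(rdd).foreach { case (d, name) => println(("  " * d) + name) }
    } finally {
      sc.stop()
    }
  }
}
```

Because it only reads RDD metadata, this can be called on any RDD reference the application holds, before or after an action runs.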