[ https://issues.apache.org/jira/browse/HIVE-18368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313998#comment-16313998 ]
Sahil Takiar commented on HIVE-18368:
-------------------------------------

[~chengxiang li] I saw you added the RDD graph in HIVE-10550; [~chinnalalam] I saw you added the SparkPlan graph in HIVE-8858. Could you take a look at this patch?
- RB: https://reviews.apache.org/r/64996/

> Improve Spark Debug RDD Graph
> -----------------------------
>
>                 Key: HIVE-18368
>                 URL: https://issues.apache.org/jira/browse/HIVE-18368
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>         Attachments: HIVE-18368.1.patch, HIVE-18368.2.patch, Spark UI - Named RDDs.png
>
>
> The {{SparkPlan}} class does some logging to show the mapping between
> different {{SparkTran}}, what shuffle types are used, and what trans are
> cached. However, there is room for improvement.
> When debug logging is enabled the RDD graph is logged, but there isn't much
> information printed about each RDD.
> We should combine both of the graphs and improve them. We could even make the
> Spark Plan graph part of the {{EXPLAIN EXTENDED}} output.
> Ideally, the final graph shows a clear relationship between Tran objects,
> RDDs, and BaseWorks. Edges should include information about the number of
> partitions, shuffle types, Spark operations used, etc.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)