If you are talking about a tree, then the RDDs are nodes, and the dependencies
are the edges.
If you are talking about a DAG, then the partitions in the RDDs are the nodes,
and the dependencies between the partitions are the edges.
On Thu, Apr 16, 2020 at 4:02 PM, Mania Abdi < abdi...@husky.neu
Is it correct to say, the nodes in the DAG are RDDs and the edges are
computations?
On Thu, Apr 16, 2020 at 6:21 PM Reynold Xin wrote:
> The RDD is the DAG.
>
>
> On Thu, Apr 16, 2020 at 3:16 PM, Mania Abdi wrote:
>
>> Hello everyone,
>>
>> I am implementing a caching mechanism for analytic wor
The RDD is the DAG.
On Thu, Apr 16, 2020 at 3:16 PM, Mania Abdi < abdi...@husky.neu.edu > wrote:
>
> Hello everyone,
>
> I am implementing a caching mechanism for analytic workloads running on
> top of Spark and I need to retrieve the Spark DAG right after it is
> generated and the DAG schedule