Hi,

Can someone validate the following regarding the ExecutionGraph:

Each IntermediateResult can only be consumed by a single ExecutionJobVertex, 
i.e. if two ExecutionJobVertex consume the same tuples (same “stream") that is 
produced by the same ExecutionJobVertex, then the producer will have two 
IntermediateResult, one per consumer. 

In other words: if an ExecutionJobVertex performs a map operation, and has two 
consumers (different ExecutionJobVertex), the ExecutionJobVertex will produce 
two datasets/IntermediateResults (both with the same “content”, but different 
consumers).

Each ExecutionVertex will then have the same amount of 
IntermediateResultPartitions as the number of ExecutionJobVertex that consume 
the datasets generated by the respective ExecutionJobVertex.

Thus, at runtime:

ResultPartition maps to an IntermediateResultPartition (as documented in the 
javadoc). Thus, 3. is also valid for ResultPartitions.

ResultSubPartition maps to an ExecutionEdge (since it contains the information 
on how to send the partition to the actual consumer Task).

Thanks,

Luís Alves

Reply via email to