[ 
https://issues.apache.org/jira/browse/FLINK-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi updated FLINK-972:
------------------------------
    Component/s:     (was: New Components)
                 Local Runtime

> Run Flink on Tez
> ----------------
>
>                 Key: FLINK-972
>                 URL: https://issues.apache.org/jira/browse/FLINK-972
>             Project: Flink
>          Issue Type: New Feature
>          Components: Local Runtime
>    Affects Versions: 0.6-incubating
>            Reporter: Stephan Ewen
>
> The current status is:
>   - A prototype that explores how Tez/Flink classes can interoperate was 
> created by Filip Haase and is at 
> https://github.com/filiphaase/incubator-tez/tree/stratosphere-input-output-proto2
>   - There is a version that runs "WordCount" in Tez, using the Flink input 
> formats, output formats, and UDFs.
> Next steps towards generic support for Flink programs are:
> 1) Integrate the Flink Memory Manager with Tez. This means actually defining 
> how much memory of each container Flink may allocate for its internal 
> algorithms. In Flink's core, we allow users to set the amount of memory, or 
> define it relative to the heap size (with 0.7*heap_size) being used if 
> nothing else is specified.
> 2) Create a version of the Flink task context (PactTaskContext) for Tez: This 
> would allow to run the Flink runtime operators on a Tez processor.
> 3) Integrate Flink "ship strategies" (partitioning, replication, 
> redistribution, etc) with the way Tez parameterizes connections.
> 4) Integrate the Flink Sorting/Caching with Tez. This should be simple, if 
> the memory manager is there, these classes should work out of the box.
> 5) Create a component that creates the Tez DAG from a flink "OptimizedPlan". 
> We currently have a component that creates a "Job Graph" (Flink's DAG) from 
> an OptimizedPlan, it is the last step of the "pre-flight phase" before the 
> job is given to the master to be scheduled. We need an equivalent component 
> to create a Tez DAG.
> 6) Create a distribution that uses Tez as distributed runtime. Create a 
> "client" that creates a Tez AM on Yarn and submits the DAG there. Adopt the 
> bash scripts to pick up the Tez and Yarn parameters and set up the client 
> correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to