[
https://issues.apache.org/jira/browse/HIVE-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933170#action_12933170
]
Arun C Murthy commented on HIVE-1107:
-------------------------------------
+1 on the direction to get Pig and Hive to use common infrastructure for DAG
execution.
{quote}
1) A way to serialize and exchange this DAG (e.g. Avro, JSON, XML)
2) A service to execute the DAG and ensure it runs to completion
{quote}
+1
Some more:
# Ability to modify the DAG on the fly, potentially in reaction to the
execution of a node's parents.
# Maybe shared infrastructure for restarting the necessary components of the
DAG, etc.
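
To make the four requirements above concrete, here is a minimal sketch. It is not a proposed design and not an existing Hive, Pig, or Hadoop API: the names Dag, run, task, and on_complete are hypothetical, JSON is picked only because it is one of the formats listed in the quote, and the executor is single-threaded for brevity.

```python
import json


class Dag:
    """Hypothetical in-memory DAG: maps each node name to its parent names."""

    def __init__(self, parents=None):
        self.parents = {n: list(p) for n, p in (parents or {}).items()}

    def add_node(self, name, parents=()):
        # Used to grow the DAG while it is running.
        self.parents[name] = list(parents)

    def to_json(self):
        # Requirement 1: a serialized form that can be exchanged.
        return json.dumps(self.parents, sort_keys=True)

    @classmethod
    def from_json(cls, text):
        return cls(json.loads(text))


def run(dag, task, done=None, on_complete=None):
    """Run every node whose parents have all completed.

    task(name) returns True on success. on_complete(dag, name) is called
    after each success and may add nodes. Passing a previous 'done' set
    restarts the DAG without re-running completed nodes.
    Returns (done, failed) as sets of node names.
    """
    done = set(done or ())
    failed = set()
    progress = True
    while progress:
        progress = False
        # sorted() copies the keys, so on_complete may safely add nodes.
        for name in sorted(dag.parents):
            if name in done or name in failed:
                continue
            if all(p in done for p in dag.parents[name]):
                if task(name):
                    done.add(name)
                    if on_complete:
                        on_complete(dag, name)
                else:
                    failed.add(name)
                progress = True
    return done, failed


# Usage: ship the DAG as JSON, run it, add a node on the fly, then restart.
dag = Dag.from_json(Dag({"scan": [], "join": ["scan"], "store": ["join"]}).to_json())
log, attempts = [], {"join": 0}


def task(name):
    log.append(name)
    if name == "join":
        attempts["join"] += 1
        return attempts["join"] > 1  # simulate a failure on the first attempt
    return True


def on_complete(d, name):
    if name == "scan":
        d.add_node("sample", ["scan"])  # react to a parent finishing


done, failed = run(dag, task, on_complete=on_complete)
done, failed = run(dag, task, done=done, on_complete=on_complete)  # restart
```

In this sketch the first run completes scan and the dynamically added sample node and records join as failed; the second call retries join and then runs store, without re-running anything already in done. A real shared service would also need parallel dispatch and persistent state, which are left out here.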
----
I agree with Russell that Oozie seems too complicated for this task.
Potentially, as Zheng suggested, a generalized form of JobControl from
Map-Reduce could be the answer; it could be something that Pig, Hive, and
potentially even Oozie can co-opt.
> Generic parallel execution framework for Hive (and Pig, and ...)
> ----------------------------------------------------------------
>
> Key: HIVE-1107
> URL: https://issues.apache.org/jira/browse/HIVE-1107
> Project: Hive
> Issue Type: New Feature
> Components: Query Processor
> Reporter: Carl Steinbach
>
> Pig and Hive each have their own libraries for handling plan execution. As we
> prepare to invest more time improving Hive's plan execution mechanism we
> should also start to consider ways of building a generic plan execution
> mechanism that is capable of supporting the needs of Hive and Pig, as well as
> other Hadoop data flow programming environments.