[ https://issues.apache.org/jira/browse/HIVE-7926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507333#comment-15507333 ]
Siddharth Seth commented on HIVE-7926:
--------------------------------------

bq. In the sentence: "The initial stage of the query is pushed into #LLAP, large shuffle is performed in their own containers" - What does "their own containers" refer to? Is there only one large shuffle, or multiple shuffles?

When executing a query, it's possible to launch separate containers (Java processes, fallback to regular Tez execution) to perform the large Shuffles. In many cases, running a Shuffle / Reduce within LLAP may not be beneficial (no caching gains, etc). That said - it's also possible to run these Shuffle/Reduce steps within LLAP itself, and that is the typical case for short running queries. Multiple shuffles are possible. This point primarily talks about where a reduce will run - within the LLAP daemon itself, or as a separate container (process).

bq. In the sentence: "The node allows parallel execution for multiple query fragments from different queries and sessions" - what does "the node" refer to? A single LLAP node?

Yes - that refers to an LLAP instance. A single LLAP process can handle multiple fragments from different queries, or the same query.

> long-lived daemons for query fragment execution, I/O and caching
> ----------------------------------------------------------------
>
>                 Key: HIVE-7926
>                 URL: https://issues.apache.org/jira/browse/HIVE-7926
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>              Labels: TODOC2.0
>             Fix For: 2.0.0
>
>         Attachments: LLAPdesigndocument.pdf
>
>
> We are proposing a new execution model for Hive that is a combination of
> existing process-based tasks and long-lived daemons running on worker nodes.
> These nodes can take care of efficient I/O, caching and query fragment
> execution, while heavy lifting like most joins, ordering, etc. can be handled
> by tasks.
> The proposed model is not a 2-system solution for small and large queries;
> nor is it a separate execution engine like MR or Tez. It can be used by
> any Hive execution engine, if support is added; in the future even external
> products (e.g. Pig) can use it.
> The document with the high-level design we are proposing will be attached shortly.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)