[ 
https://issues.apache.org/jira/browse/HIVE-7926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507333#comment-15507333
 ] 

Siddharth Seth commented on HIVE-7926:
--------------------------------------

bq. In the sentence: “The initial stage of the query is pushed into #LLAP, 
large shuffle is performed in their own containers” - What does "their own 
containers" refer to? Is there only one large shuffle, or multiple shuffles?

A query can launch separate containers (Java processes, i.e. a fallback to 
regular Tez execution) to perform large shuffles. In many cases, running a 
shuffle/reduce within LLAP is not beneficial (no caching gains, etc.). That 
said, these shuffle/reduce steps can also run within LLAP itself, which is the 
typical case for short-running queries. Multiple shuffles are possible.
This point is primarily about where a reduce runs: within the LLAP daemon 
itself, or in a separate container (process).

bq. In the sentence: "The node allows parallel execution for multiple query 
fragments from different queries and sessions” - what does "the node" refer to? 
A single LLAP node?

Yes, "the node" refers to an LLAP instance. A single LLAP process can run 
multiple fragments in parallel, whether they come from different queries or 
from the same query.
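
As a rough illustration of that last point, here is a toy sketch of one 
long-lived process serving fragments from more than one query out of a shared 
executor pool. It is not LLAP code: `ToyDaemon`, `Fragment`, `run_demo`, and 
the `num_executors` parameter are all hypothetical names made up for this 
example, and a thread pool is only a stand-in for the daemon's executors.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Fragment:
    """Hypothetical unit of work: one piece of one query."""
    query_id: str
    fragment_id: int


class ToyDaemon:
    """Stand-in for a single long-lived process with a fixed number of slots."""

    def __init__(self, num_executors: int) -> None:
        # The pool is created once and shared by every query that arrives.
        self._pool = ThreadPoolExecutor(max_workers=num_executors)

    def submit(self, fragment: Fragment) -> Future:
        # Fragments are accepted regardless of which query they belong to.
        return self._pool.submit(self._run, fragment)

    @staticmethod
    def _run(fragment: Fragment) -> str:
        # Placeholder for the real work (scan, filter, partial aggregate, ...).
        return f"{fragment.query_id}/fragment-{fragment.fragment_id} done"

    def shutdown(self) -> None:
        self._pool.shutdown(wait=True)


def run_demo() -> List[str]:
    daemon = ToyDaemon(num_executors=4)
    fragments = [
        Fragment("query-A", 0),
        Fragment("query-A", 1),  # same query, second fragment
        Fragment("query-B", 0),  # a different query on the same daemon
    ]
    futures = [daemon.submit(f) for f in fragments]
    # Results are collected in submission order, not completion order.
    results = [f.result() for f in futures]
    daemon.shutdown()
    return results


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

The only point of the sketch is that the pool outlives any single query, so 
fragments from `query-A` and `query-B` share the same process.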

> long-lived daemons for query fragment execution, I/O and caching
> ----------------------------------------------------------------
>
>                 Key: HIVE-7926
>                 URL: https://issues.apache.org/jira/browse/HIVE-7926
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>              Labels: TODOC2.0
>             Fix For: 2.0.0
>
>         Attachments: LLAPdesigndocument.pdf
>
>
> We are proposing a new execution model for Hive that is a combination of 
> existing process-based tasks and long-lived daemons running on worker nodes. 
> These nodes can take care of efficient I/O, caching and query fragment 
> execution, while heavy lifting like most joins, ordering, etc. can be handled 
> by tasks.
> The proposed model is not a 2-system solution for small and large queries; 
> nor is it a separate execution engine like MR or Tez. It can be used by 
> any Hive execution engine, if support is added; in the future, even external 
> products (e.g. Pig) can use it.
> The document with high-level design we are proposing will be attached shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
