[ 
https://issues.apache.org/jira/browse/HIVE-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sekhon updated HIVE-9768:
------------------------------
    Description: 
Feature request for Hive LLAP to preload table metadata across all running 
nodes to reduce query latency (this is what Impala does).

The design decision behind this in Impala was to avoid the latency overhead of 
fetching the metadata at query time, since that's an extra database query (or 
possibly an HBase query in future, see HIVE-9452) that must be completely 
fulfilled before the Hive LLAP query even starts to run, which would slow down 
the response to the user if not pre-loaded. Also, any temporary outage of the 
metadata layer would affect the speed of the LLAP layer, so pre-loading and 
caching the metadata adds resilience against this.

This pre-loaded metadata also requires a cluster-wide "refresh metadata" 
operation, something Impala added later and now calls "INVALIDATE METADATA" in 
its SQL dialect. I propose using a more intuitive "REFRESH METADATA" Hive 
command instead.
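
To make the request concrete, here is a minimal sketch of the per-node behaviour I have in mind. It is illustrative only: the class name PreloadedMetadataCache, its methods, and the in-memory "metastore" dictionary are all hypothetical and are not existing Hive or Impala APIs. It covers a single node; propagating the refresh to every LLAP node in the cluster is not shown.

```python
from typing import Callable, Dict, Iterable, Optional


class PreloadedMetadataCache:
    """Hypothetical per-node metadata cache; not part of Hive or Impala."""

    def __init__(self, list_tables: Callable[[], Iterable[str]],
                 fetch: Callable[[str], dict]) -> None:
        self._list_tables = list_tables  # assumed hook: names of all tables
        self._fetch = fetch              # assumed hook: one metastore lookup
        self._cache: Dict[str, dict] = {}

    def preload(self) -> None:
        # Load metadata for every table up front, e.g. at node startup.
        self._cache = {t: self._fetch(t) for t in self._list_tables()}

    def get(self, table: str) -> dict:
        # Query-time lookup is a local read; the metastore is contacted
        # only on a cache miss.
        if table not in self._cache:
            self._cache[table] = self._fetch(table)
        return self._cache[table]

    def refresh(self, table: Optional[str] = None) -> None:
        # Stand-in for the proposed "REFRESH METADATA" command on one node.
        if table is None:
            self.preload()
        else:
            self._cache[table] = self._fetch(table)


# Toy metastore used only for this demonstration.
metastore = {"sales": {"columns": ["id", "amount"]}}
calls = []  # records every simulated metastore round trip


def fetch(table: str) -> dict:
    calls.append(table)
    return dict(metastore[table])


cache = PreloadedMetadataCache(lambda: list(metastore), fetch)
cache.preload()
before = cache.get("sales")["columns"]   # served from memory

# The table changes in the metastore; the node keeps serving the cached
# copy until it is told to refresh.
metastore["sales"] = {"columns": ["id", "amount", "region"]}
stale = cache.get("sales")["columns"]
cache.refresh("sales")
after = cache.get("sales")["columns"]
```

The point of the sketch is that lookups between refreshes never touch the metastore, which is where both the latency saving and the resilience to a short metastore outage would come from.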

(Fyi I was in the first trio of Impala SMEs at Cloudera in early 2013)

Regards,

Hari Sekhon
ex-Cloudera
http://www.linkedin.com/in/harisekhon

  was:
Feature request for Hive LLAP to preload table metadata across all running 
nodes to reduce query latency (this is what Impala does).

The design decision behind this in Impala was to avoid the latency overhead of 
fetching the metadata at query time, since that's an extra database query (or 
possibly an HBase query in future) that must be completely fulfilled before 
the Hive LLAP query even starts to run, which would slow down the response to 
the user if not pre-loaded.

This pre-loaded metadata also requires a cluster-wide "refresh metadata" 
operation, something Impala added later and now calls "INVALIDATE METADATA" in 
its SQL dialect. I propose using a more intuitive "REFRESH METADATA" Hive 
command instead.

(Fyi I was in the first trio of Impala SMEs at Cloudera in early 2013)

Regards,

Hari Sekhon
ex-Cloudera
http://www.linkedin.com/in/harisekhon


> Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata 
> refresh/invalidate command
> -----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-9768
>                 URL: https://issues.apache.org/jira/browse/HIVE-9768
>             Project: Hive
>          Issue Type: New Feature
>          Components: Database/Schema
>    Affects Versions: 0.14.0, llap
>         Environment: HDP 2.2
>            Reporter: Hari Sekhon
>
> Feature request for Hive LLAP to preload table metadata across all running 
> nodes to reduce query latency (this is what Impala does).
> The design decision behind this in Impala was to avoid the latency overhead 
> of fetching the metadata at query time, since that's an extra database query 
> (or possibly an HBase query in future, see HIVE-9452) that must be completely 
> fulfilled before the Hive LLAP query even starts to run, which would slow 
> down the response to the user if not pre-loaded. Also, any temporary outage 
> of the metadata layer would affect the speed of the LLAP layer, so 
> pre-loading and caching the metadata adds resilience against this.
> This pre-loaded metadata also requires a cluster-wide "refresh metadata" 
> operation, something Impala added later and now calls "INVALIDATE METADATA" 
> in its SQL dialect. I propose using a more intuitive "REFRESH METADATA" Hive 
> command instead.
> (Fyi I was in the first trio of Impala SMEs at Cloudera in early 2013)
> Regards,
> Hari Sekhon
> ex-Cloudera
> http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
