[jira] [Work logged] (HIVE-25867) Partition filter condition should pushed down to metastore query if it is equivalence Predicate

ASF GitHub Bot (Jira) Mon, 21 Mar 2022 05:58:05 -0700


     [ 
https://issues.apache.org/jira/browse/HIVE-25867?focusedWorklogId=745031&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-745031
 ]


ASF GitHub Bot logged work on HIVE-25867:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 21/Mar/22 12:57
            Start Date: 21/Mar/22 12:57
    Worklog Time Spent: 10m 
      Work Description: zzzzming95 commented on pull request #2947:
URL: https://github.com/apache/hive/pull/2947#issuecomment-1073862137


   > With this change I think the filtering on the HMS DB side is effectively 
turned off. This would cause queries with a smaller number of partitions to become 
slow. Is this issue happening when there are too many partitions in the filter? 
Could we just turn off this filter if the number of partitions is too high? 
What was the number of partitions in the query when you experienced problems?
   
   Thanks @pvary.
   
   In our case, we have a table (two partition fields) with about 600,000 (60w) 
partitions. When multiple SQL jobs are executed at the same time, the load on 
the HMS DB increases.
   
   > Could we just turn off this filter if the number of partitions is too 
high?
   
   As far as I understand, we have no way to get the number of partitions 
before the SQL is executed. Do you mean caching the partition count 
information in HMS as an optimization?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 745031)
    Time Spent: 50m  (was: 40m)

> Partition filter condition should pushed down to metastore query if it is 
> equivalence Predicate
> -----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-25867
>                 URL: https://issues.apache.org/jira/browse/HIVE-25867
>             Project: Hive
>          Issue Type: Improvement
>          Components: Standalone Metastore
>            Reporter: shezm
>            Assignee: shezm
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> If the column type of the partition differs from the column type in the HQL 
> query, the metastore does not push the query down to the RDBMS. Instead, it 
> fetches all PARTITIONS.PART_NAME values of the Hive table and then filters 
> them according to the HQL expression. 
> https://github.com/apache/hive/blob/5b112aa6dcc4e374c0a7c2b24042f24ae6815da1/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java#L1316
> If the Hive table has too many partitions and multiple HQL queries run at 
> the same time, CPU IO_WAIT on the RDBMS increases and performance suffers.
> If the partition filter condition in the HQL is an equality predicate, the 
> metastore should push it down to the RDBMS, which can improve query 
> performance on large Hive tables.
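
The two paths described in the issue can be contrasted with a small sketch. This is an illustration only, not the metastore's code: it uses an in-memory SQLite database, a heavily simplified layout loosely modelled on the PARTITIONS and PARTITION_KEY_VALS tables, and hypothetical partition keys (`dt`, `region`) and helper names. The second return value of each helper counts how many rows the database had to hand back to the caller.

```python
import sqlite3

# Hypothetical partition keys, chosen for illustration only.
KEYS = ("dt", "region")


def build_db(partition_values):
    """Create a simplified, in-memory stand-in for the metastore tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PARTITIONS (PART_ID INTEGER PRIMARY KEY, PART_NAME TEXT)"
    )
    conn.execute(
        "CREATE TABLE PARTITION_KEY_VALS "
        "(PART_ID INTEGER, PART_KEY_VAL TEXT, INTEGER_IDX INTEGER)"
    )
    for part_id, values in enumerate(partition_values):
        name = "/".join(f"{k}={v}" for k, v in zip(KEYS, values))
        conn.execute("INSERT INTO PARTITIONS VALUES (?, ?)", (part_id, name))
        conn.executemany(
            "INSERT INTO PARTITION_KEY_VALS VALUES (?, ?, ?)",
            [(part_id, v, idx) for idx, v in enumerate(values)],
        )
    return conn


def filter_client_side(conn, key, value):
    """Fallback path: fetch every partition name, then filter in the caller."""
    rows = conn.execute("SELECT PART_NAME FROM PARTITIONS").fetchall()
    wanted = f"{key}={value}"
    matched = [name for (name,) in rows if wanted in name.split("/")]
    return matched, len(rows)


def filter_pushed_down(conn, key, value):
    """Pushdown path: the database evaluates the equality predicate itself."""
    rows = conn.execute(
        "SELECT p.PART_NAME FROM PARTITIONS p "
        "JOIN PARTITION_KEY_VALS v ON v.PART_ID = p.PART_ID "
        "WHERE v.INTEGER_IDX = ? AND v.PART_KEY_VAL = ?",
        (KEYS.index(key), value),
    ).fetchall()
    return [name for (name,) in rows], len(rows)
```

Both helpers return the same partitions, but the fallback transfers one row per partition of the table, while the pushed-down query transfers only the matching rows. With hundreds of thousands of partitions and several concurrent queries, that difference is the load the issue describes.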



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
