[ https://issues.apache.org/jira/browse/HIVE-21354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17079388#comment-17079388 ]
Peter Vary commented on HIVE-21354:
-----------------------------------

I have found this in Hive 3.1:
{code:java}
EXPLAIN LOCKS SELECT * FROM web_logs;

LOCK INFORMATION:
default.web_logs -> SHARED_READ
default.web_logs.date=2015-11-18 -> SHARED_READ
default.web_logs.date=2015-11-19 -> SHARED_READ
default.web_logs.date=2015-11-20 -> SHARED_READ
default.web_logs.date=2015-11-21 -> SHARED_READ
{code}
[~gopalv], [~belugabehr]: Are you aware of any change in 4.0 that changes this but was not backported to 3.1?

Thanks,
Peter

> Lock The Entire Table If Majority Of Partitions Are Locked
> ----------------------------------------------------------
>
>                 Key: HIVE-21354
>                 URL: https://issues.apache.org/jira/browse/HIVE-21354
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 4.0.0, 3.2.0
>            Reporter: David Mollitor
>            Priority: Major
>
> One of the bottlenecks of any Hive query is the ZooKeeper locking mechanism.
> When a Hive query interacts with a table which has a lot of partitions, this
> may put a lot of stress on the ZK system.
> Please add a heuristic that works like this:
> # Count the number of partitions that a query is required to lock
> # Obtain the total number of partitions in the table
> # If the number of partitions accessed by the query is greater than or equal
> to half the total number of partitions, simply create one ZNode lock at the
> table level.
> This would improve performance of many queries, but in particular, a {{select
> count(1) from table}} ... or ... {{select * from table limit 5}} where the
> table has many partitions.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
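
For illustration only, the three-step heuristic in the issue description could be sketched roughly as below. This is not Hive code: the class {{TableLockHeuristic}}, the enum {{LockScope}}, and the method {{chooseScope}} are hypothetical names, and the sketch assumes the two partition counts have already been obtained by the caller. It only shows the "greater than or equal to half" comparison.

```java
/**
 * Hypothetical sketch of the heuristic proposed in HIVE-21354.
 * None of these names exist in Hive; they are assumptions for illustration.
 */
final class TableLockHeuristic {

    /** Granularity at which a query would take its locks. */
    enum LockScope { TABLE, PARTITIONS }

    private TableLockHeuristic() {}

    /**
     * @param partitionsToLock number of partitions the query must lock (step 1)
     * @param totalPartitions  total number of partitions in the table (step 2)
     * @return TABLE if the query touches at least half of the partitions (step 3),
     *         otherwise PARTITIONS
     */
    static LockScope chooseScope(int partitionsToLock, int totalPartitions) {
        if (partitionsToLock < 0 || totalPartitions < 0 || partitionsToLock > totalPartitions) {
            throw new IllegalArgumentException("inconsistent partition counts");
        }
        // Compare 2 * accessed >= total rather than accessed >= total / 2,
        // so that odd totals are not rounded down by integer division.
        return 2L * partitionsToLock >= totalPartitions
                ? LockScope.TABLE
                : LockScope.PARTITIONS;
    }
}
```

Under this sketch, a query that reads all four partitions of a four-partition table would take a single table-level lock, while a query reading two partitions out of five would keep per-partition locks.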