Venugopal Reddy K created HIVE-28145:
----------------------------------------

             Summary: getPartitionsByNames API returns partition objects with 
empty values in many fields when it is executed concurrently with dropPartition 
API 
                 Key: HIVE-28145
                 URL: https://issues.apache.org/jira/browse/HIVE-28145
             Project: Hive
          Issue Type: Bug
            Reporter: Venugopal Reddy K


*Description:*

getPartitionsByNames API returns partition objects with empty values in many 
fields when it is executed concurrently with dropPartition API.

org.apache.hadoop.hive.metastore.MetaStoreDirectSql#getPartitionsViaPartNames 
issues multiple queries to the backend db to populate the various fields of 
the partition object. First it queries for part ids using the partition names, 
then it joins the PARTITIONS, SDS and SERDES tables for those part ids and 
creates the partition objects. A further query against the PARTITION_KEY_VALS 
table then fetches the partition values for those part ids and populates them 
in the already created partition objects.

So if the partition is deleted just before the PARTITION_KEY_VALS query, the 
partition object can be returned with empty values. The same can happen for 
any other field of the partition object that is populated by its own query 
(partition params, storage descriptor params, serde params, sort cols, bucket 
cols, skewed cols, etc.).
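
A minimal sketch of the interleaving described above, using in-memory maps in 
place of the PARTITIONS and PARTITION_KEY_VALS tables. This is a toy model, not 
Hive code: PartialReadSketch, Part, readPartition, reset and the betweenQueries 
hook are hypothetical names introduced for illustration. The hook only models a 
drop that another transaction commits between the two queries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: two maps stand in for two metastore tables. Not Hive code.
class PartialReadSketch {
    // part id -> partition name (stands in for the PARTITIONS table)
    static final Map<Long, String> PARTITIONS = new HashMap<>();
    // part id -> partition values (stands in for the PARTITION_KEY_VALS table)
    static final Map<Long, List<String>> PARTITION_KEY_VALS = new HashMap<>();

    // Hypothetical, heavily reduced partition object.
    static class Part {
        final String name;
        final List<String> values = new ArrayList<>();
        Part(String name) { this.name = name; }
    }

    // Restores a single sample partition with id 1.
    static void reset() {
        PARTITIONS.clear();
        PARTITION_KEY_VALS.clear();
        PARTITIONS.put(1L, "ds=2024-03-01");
        PARTITION_KEY_VALS.put(1L, Arrays.asList("2024-03-01"));
    }

    // Removes the partition from both "tables", as a committed drop would.
    static void dropPartition(long partId) {
        PARTITION_KEY_VALS.remove(partId);
        PARTITIONS.remove(partId);
    }

    // Builds the partition object with two separate lookups.
    // betweenQueries runs after the first lookup and before the second.
    static Part readPartition(long partId, Runnable betweenQueries) {
        String name = PARTITIONS.get(partId);               // first query
        if (name == null) {
            return null;                                    // partition not found
        }
        Part part = new Part(name);
        betweenQueries.run();                               // concurrent activity
        List<String> vals = PARTITION_KEY_VALS.get(partId); // second query
        if (vals != null) {
            part.values.addAll(vals);
        }
        return part;                                        // values may be empty
    }
}
```

Calling readPartition(1L, () -> PartialReadSketch.dropPartition(1L)) after 
reset() returns a Part whose name is set but whose values list is empty, which 
is the shape of the result reported here.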

*Note: The issue can be observed with both directsql and JDO based queries. All 
APIs that involve multiple queries to the backend database within a 
transaction need to be checked.*

*Root Cause:*

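The transaction is opened with the default isolation level, which (in 
DataNucleus) is read-committed. Under read-committed, each query sees the data 
committed at the time that query runs, so a dropPartition that commits between 
two of the queries above is visible to the later query but not to the earlier 
one.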

*Steps to reproduce:*
 # Create a partitioned table and add 500 to 1000 dynamic partitions (a dummy 
partition param, sd param and serde param can be added).
 # Create a thread pool of size 2 and submit 2 tasks: one task calling 
getPartitionsByNames and another task calling dropPartition, in a loop.
 # Verify the fields in the partition objects returned from getPartitionsByNames().
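
A rough sketch of steps 2 and 3, assuming a small facade over the two metastore 
calls. PartitionApi, Part, countIncomplete and run are hypothetical names 
introduced here and are not the real metastore client API. A real test would 
implement PartitionApi on top of an actual client and would also check the 
other fields (params, sd, serde), not only the values.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConcurrentDropRepro {
    // Hypothetical reduced partition view: name plus partition values.
    static class Part {
        final String name;
        final List<String> values;
        Part(String name, List<String> values) {
            this.name = name;
            this.values = values;
        }
    }

    // Hypothetical facade; a real test would delegate to a metastore client.
    interface PartitionApi {
        List<Part> getPartitionsByNames(List<String> names) throws Exception;
        void dropPartition(String name) throws Exception;
    }

    // Step 3: count returned partitions whose values are missing or empty.
    static int countIncomplete(List<Part> parts) {
        int incomplete = 0;
        for (Part p : parts) {
            if (p.values == null || p.values.isEmpty()) {
                incomplete++;
            }
        }
        return incomplete;
    }

    // Step 2: pool of size 2, one reader task and one dropper task.
    // Returns the number of incomplete partitions the reader observed.
    static int run(PartitionApi api, List<String> names, int rounds) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<Integer> reader = pool.submit(() -> {
                int incomplete = 0;
                for (int i = 0; i < rounds; i++) {
                    incomplete += countIncomplete(api.getPartitionsByNames(names));
                }
                return incomplete;
            });
            Future<Object> dropper = pool.submit(() -> {
                for (String name : names) {
                    api.dropPartition(name);
                }
                return null;
            });
            dropper.get();
            return reader.get();
        } catch (Exception e) {
            // Sketch-level handling: surface any failure from either task.
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

A non-zero return value from run would indicate that the reader received at 
least one partition object with empty values while the drops were in progress.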



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
