I can't duplicate the issue now. It works like it always has.

----- Original Message -----
From: "Furcy Pin" <pin.fu...@gmail.com>
To: user@hive.apache.org
Sent: Thursday, April 5, 2018 9:57:39 AM
Subject: Re: ALTER TABLE DROP PARTITION not working on S3


Indeed. 


If I remember correctly, S3 does not really have the concept of a "folder" the way 
HDFS does, and Hive sort of makes up for it by creating a descriptor file 
where the partition "folder" is supposed to be. Maybe this is what is missing 
here. 


Perhaps you could try running "MSCK REPAIR TABLE tablename" to make sure the 
partitions are correctly loaded, and then try dropping that particular 
partition again? 
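
For example, a rough sketch using the table and partition names from the original 
message below (SHOW PARTITIONS is only there to confirm the metastore actually 
knows about the partition before you try to drop it): 

hive> MSCK REPAIR TABLE api_measurements; 
hive> SHOW PARTITIONS api_measurements; 
hive> ALTER TABLE api_measurements DROP IF EXISTS PARTITION (daydate='2018-04-04', epoch=1522876500); 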


Or look at your S3 folder to see whether any such "partition folder" marker exists, 
and check if it is missing for this particular partition? 
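
Something like this from the Hive CLI would show what is actually there (the bucket 
comes from your DESCRIBE output below; the exact partition prefix is my guess based 
on the usual daydate=/epoch= layout): 

hive> dfs -ls s3://zoomi-proto-warehouse-measurements/daydate=2018-04-04/epoch=1522876500/; 

Depending on the S3 connector, empty "directories" sometimes show up as zero-byte 
marker keys (for example ones ending in _$folder$ on EMR), so anything unexpected 
in that listing is worth noting. 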




On 5 April 2018 at 15:40, Richard A. Bross < r...@oaktreepeak.com > wrote: 


Leaving the column list out, here you go: 

# Detailed Table Information 
Database: default 
Owner: hadoop 
CreateTime: Thu Apr 05 13:24:33 UTC 2018 
LastAccessTime: UNKNOWN 
Retention: 0 
Location: s3://zoomi-proto-warehouse-measurements/ 
Table Type: MANAGED_TABLE 
Table Parameters: 
COLUMN_STATS_ACCURATE {\"BASIC_STATS\":\"true\"} 
numFiles 10 
numPartitions 7 
numRows 153 
orc.compress ZLIB 
orc.create.index true 
rawDataSize 113563 
totalSize 37801 
transient_lastDdlTime 1522934673 

# Storage Information 
SerDe Library: org.apache.hadoop.hive.ql.io.orc.OrcSerde 
InputFormat: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat 
OutputFormat: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat 
Compressed: No 
Num Buckets: 61 
Bucket Columns: [crs_id] 
Sort Columns: [] 
Storage Desc Params: 
serialization.format 1 
Time taken: 0.467 seconds, Fetched: 98 row(s) 


----- Original Message ----- 
From: "Furcy Pin" < pin.fu...@gmail.com > 
To: user@hive.apache.org 
Sent: Thursday, April 5, 2018 9:21:06 AM 
Subject: Re: ALTER TABLE DROP PARTITION not working on S3 


Hi Richard, 


Could you please check whether your table is EXTERNAL? 
You can see it with "DESCRIBE FORMATTED table_name;" 


That's what external tables are for: they don't delete the underlying data when you 
drop them. 
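
Roughly, and with made-up names, the behavior looks like this: 

-- sketch only: an EXTERNAL table keeps its files when its metadata is dropped 
CREATE EXTERNAL TABLE demo_ext (id INT) 
PARTITIONED BY (daydate STRING) 
STORED AS ORC 
LOCATION 's3://some-bucket/demo_ext/'; 

-- both statements only touch the metastore; the objects under LOCATION stay on S3 
ALTER TABLE demo_ext DROP PARTITION (daydate='2018-04-04'); 
DROP TABLE demo_ext; 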




On 5 April 2018 at 15:18, Richard A. Bross < r...@oaktreepeak.com > wrote: 


I think that someone put a file in there manually. Would that prevent Hive from 
dropping the partition? I also did a "drop table" and the S3 object keys 
persisted. 



----- Original Message ----- 
From: "Richard A. Bross" < r...@oaktreepeak.com > 
To: user@hive.apache.org 
Sent: Thursday, April 5, 2018 9:14:52 AM 
Subject: ALTER TABLE DROP PARTITION not working on S3 

Hi, 

I have a Hive managed table on S3, "api_measurements". I've tried dropping a 
partition like so: 

hive> alter table api_measurements drop if exists 
partition(daydate='2018-04-04', epoch=1522876500); 
Dropped the partition daydate=2018-04-04/epoch=1522876500 
OK 
Time taken: 2.109 seconds 

Yet the data is still on S3. Because object keys on S3 are always strings, I 
also tried this: 

hive> alter table api_measurements drop partition(daydate='2018-04-04', 
epoch='1522876500'); 
OK 
Time taken: 0.135 seconds 

Yet the object keys and data are still there. I assume that I'm missing 
something really simple. Can anyone shed some light on this? 

Thanks 


