[ https://issues.apache.org/jira/browse/HIVE-25669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17438138#comment-17438138 ]
Gopal Vijayaraghavan commented on HIVE-25669:
---------------------------------------------

bq. why aren't my old folders(base_0000009, base_00000010) deleted?

https://cwiki.apache.org/confluence/display/hive/hive+transactions#HiveTransactions-BaseandDeltaDirectories

> After Insert overwrite (managed table), the previous data of the table is not
> deleted
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-25669
>                 URL: https://issues.apache.org/jira/browse/HIVE-25669
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 3.1.0
>         Environment: 1. hadoop eco versions
>  - hive : 3.1.0
>  - Tez : 0.9.1
>  - hdfs : 3.1.1
> 2. Table info
>  - table name : test_t1 (*sample name)
>  - table : Managed table
>  - partitioning : X (non partition)
> 3. Table properties
>  - transactional = true
>  - transactional_properties = insert_only
>  - bucketing_version = 2
>  - auto.purge = true / false (*apply both)
>
>            Reporter: Jihoon Lee
>            Priority: Minor
>
> When insert overwrite table, 'auto.purge' does not seem to work well.
> h2. Step1. Create table
> create table test_t1 (
> col1 string,
> col2 string,
> col3 string,
> col4 string
> )
>
> ROW FORMAT SERDE
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
> 'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
> 'hdfs://nameservice1/user/hive/warehouse/st.db/test_ljh5'
> TBLPROPERTIES (
> 'auto.purge'='{color:#de350b}*false*{color}',
> 'bucketing_version'='2',
> 'transactional'='true',
> 'transactional_properties'='insert_only')
>
> h2. 2. Insert overwrite
> 2-1)
> insert overwrite table test_t1
> select * from origin_t1 limit 10000;
> 2-2)
> insert overwrite table test_t1
> select * from origin_t1 limit 20000;
> 2-3)
> insert overwrite table test_t1
> select * from origin_t1 limit 30000;
>
> h2. 3. Check HDFS files
> - Hue file browser
>
> !https://mail.google.com/mail/u/0?ui=2&ik=10577dc09a&attid=0.1&permmsgid=msg-f:1715412595915827826&th=17ce5eaad6cb2a72&view=fimg&fur=ip&sz=s0-l75-ft&attbid=ANGjdJ9ygBFCoYIqI3etBmYvvRfg1l7ea2lSBC5QLxHMFhuWOh8f5u_JbzO2d65-t5I6v4Xxn9zF-ZKVya4uwIL_nDsELRTYiZ321XsPwqXzHZmG_HYA0wL3tAGLAN8&disp=emb!
> why aren't my old folders(base_0000009, base_00000010) deleted?
> It's the same even if i set the setting to '*auto.purge=true*' and to
> '*auto.purge=false*'.
>
> And I have referenced here.
> [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML]
> * INSERT OVERWRITE will overwrite any existing data in the table or partition
> ** unless {{IF NOT EXISTS}} is provided for a partition (as of Hive 0.9.0).
> ** As of Hive 2.3.0 (HIVE-15880), if the table has
> [TBLPROPERTIES|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-listTableProperties]
> ("auto.purge"="true") the previous data of the table is not moved to Trash
> when INSERT OVERWRITE query is run against the table. This functionality is
> applicable only for managed tables (see [managed
> tables|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ManagedandExternalTables])
> and is turned off when "auto.purge" property is unset or set to false.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)