Further update:

I see this:

+-----------+----------+------------+--------+---------------------+-----------+----------------+--+
|  dbname   | tabname  |  partname  |  type  |        state        | workerid  |   starttime    |
+-----------+----------+------------+--------+---------------------+-----------+----------------+--+
| Database  | Table    | Partition  | Type   | State               | Worker    | Start Time     |
| txntest   | txntab3  | NULL       | MAJOR  | ready for cleaning  | NULL      | 1479346924000  |
+-----------+----------+------------+--------+---------------------+-----------+----------------+--+

However, I don't see the Cleaner actually running in the Hive logs (I am
searching for a string like "compactor.Cleaner").
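
As a rough illustration of what the Cleaner would be expected to remove once
it does run, here is a minimal Python sketch. It assumes the directory naming
visible in the quoted HDFS listing (base_<txn> and delta_<min>_<max>) and the
rule that a delta becomes obsolete once its maximum transaction id is covered
by the newest base. The function name obsolete_deltas is made up for this
example; it is not part of Hive, and the real Cleaner also waits for open
readers before deleting anything.

```python
import re

# Directory name patterns as they appear in the quoted HDFS listing.
BASE_RE = re.compile(r"^base_(\d+)$")
DELTA_RE = re.compile(r"^delta_(\d+)_(\d+)$")


def obsolete_deltas(dir_names):
    """Return delta directories fully covered by the newest base directory.

    Assumption: a delta is obsolete when its max transaction id is less than
    or equal to the transaction id of the newest base.
    """
    bases = [int(m.group(1)) for m in map(BASE_RE.match, dir_names) if m]
    if not bases:
        return []  # no major compaction output yet, nothing is obsolete
    newest_base = max(bases)
    result = []
    for name in dir_names:
        m = DELTA_RE.match(name)
        if m and int(m.group(2)) <= newest_base:
            result.append(name)
    return result


if __name__ == "__main__":
    dirs = [
        "base_0000021",
        "delta_0000017_0000019",
        "delta_0000020_0000020",
        "delta_0000021_0000021",
    ]
    # All three deltas end at or before transaction 21, so all are listed.
    print(obsolete_deltas(dirs))
```

Applied to the listing in the original message, every delta directory ends at
or before transaction 21, so under this assumption all of them would be
candidates for removal once the table leaves the "ready for cleaning" state.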


On Wed, Nov 16, 2016 at 5:30 PM, Manoj Murumkar <manoj.murum...@gmail.com>
wrote:

> Quick update:
>
> After each compaction, the files under the base directory (for the buckets)
> have the latest data. However, I am expecting to see all delta files (and
> directories) gone, as they should be merged into the base directory.
> Otherwise, we'll start seeing too many small files on HDFS, which is a
> problem. Am I understanding this feature correctly in assuming so?
>
> On Wed, Nov 16, 2016 at 5:24 PM, Manoj Murumkar <manoj.murum...@gmail.com>
> wrote:
>
>> Hi,
>>
>> We are trying to implement the transaction feature in Hive. I created the
>> following table:
>>
>> +----------------------------------------------------------------------------------------+--+
>> |                                     createtab_stmt                                     |
>> +----------------------------------------------------------------------------------------+--+
>> | CREATE TABLE `txntest.txntab3`(                                                        |
>> |   `id` int,                                                                            |
>> |   `name` string)                                                                       |
>> | CLUSTERED BY (                                                                         |
>> |   id)                                                                                  |
>> | INTO 2 BUCKETS                                                                         |
>> | ROW FORMAT SERDE                                                                       |
>> |   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'                                          |
>> | STORED AS INPUTFORMAT                                                                  |
>> |   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'                                    |
>> | OUTPUTFORMAT                                                                           |
>> |   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'                                   |
>> | LOCATION                                                                               |
>> |   'hdfs://or1010051016175.corp.adobe.com:8020/user/hive/warehouse/txntest.db/txntab3'  |
>> | TBLPROPERTIES (                                                                        |
>> |   'COLUMN_STATS_ACCURATE'='true',                                                      |
>> |   'numFiles'='22',                                                                     |
>> |   'numRows'='90000',                                                                   |
>> |   'rawDataSize'='0',                                                                   |
>> |   'totalSize'='3564019',                                                               |
>> |   'transactional'='true',                                                              |
>> |   'transient_lastDdlTime'='1479329198')                                                |
>> +----------------------------------------------------------------------------------------+--+
>>
>> I inserted 90000 rows into it over multiple iterations, so it created 22
>> files (as is visible above). I have run multiple compactions (major and
>> minor), but nothing seems to happen on HDFS. What am I missing?
>>
>> I have the following configuration:
>>
>> Metastore:
>>
>> hive.compactor.initiator.on = true;
>> hive.compactor.worker.threads = 2;
>>
>> Client:
>>
>> hive.support.concurrency = true;
>> hive.exec.dynamic.partition.mode = nonstrict;
>> hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
>>
>> Can someone point me in the right direction? The compaction process did run
>> (verified via SHOW COMPACTIONS), and I also see that a base directory has
>> been created on HDFS.
>> I was expecting all the delta directories to be gone after a major
>> compaction runs.
>>
>> drwxrwxrwt   - admin    hive          0 2016-11-16 20:47 /user/hive/warehouse/txntest.db/txntab3/base_0000021
>> -rw-r--r--   3 admin    hive     227916 2016-11-16 20:47 /user/hive/warehouse/txntest.db/txntab3/base_0000021/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:33 /user/hive/warehouse/txntest.db/txntab3/delta_0000003_0000003
>> -rw-r--r--   3 nex37045 hive        640 2016-11-16 01:33 /user/hive/warehouse/txntest.db/txntab3/delta_0000003_0000003/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:33 /user/hive/warehouse/txntest.db/txntab3/delta_0000004_0000004
>> -rw-r--r--   3 nex37045 hive        640 2016-11-16 01:33 /user/hive/warehouse/txntest.db/txntab3/delta_0000004_0000004/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:34 /user/hive/warehouse/txntest.db/txntab3/delta_0000005_0000005
>> -rw-r--r--   3 nex37045 hive        640 2016-11-16 01:33 /user/hive/warehouse/txntest.db/txntab3/delta_0000005_0000005/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:34 /user/hive/warehouse/txntest.db/txntab3/delta_0000006_0000006
>> -rw-r--r--   3 nex37045 hive        640 2016-11-16 01:34 /user/hive/warehouse/txntest.db/txntab3/delta_0000006_0000006/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:34 /user/hive/warehouse/txntest.db/txntab3/delta_0000007_0000007
>> -rw-r--r--   3 nex37045 hive        636 2016-11-16 01:34 /user/hive/warehouse/txntest.db/txntab3/delta_0000007_0000007/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:36 /user/hive/warehouse/txntest.db/txntab3/delta_0000008_0000008
>> -rw-r--r--   3 nex37045 hive        640 2016-11-16 01:36 /user/hive/warehouse/txntest.db/txntab3/delta_0000008_0000008/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:36 /user/hive/warehouse/txntest.db/txntab3/delta_0000009_0000009
>> -rw-r--r--   3 nex37045 hive        640 2016-11-16 01:36 /user/hive/warehouse/txntest.db/txntab3/delta_0000009_0000009/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:37 /user/hive/warehouse/txntest.db/txntab3/delta_0000010_0000010
>> -rw-r--r--   3 nex37045 hive        640 2016-11-16 01:37 /user/hive/warehouse/txntest.db/txntab3/delta_0000010_0000010/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:37 /user/hive/warehouse/txntest.db/txntab3/delta_0000011_0000011
>> -rw-r--r--   3 nex37045 hive        644 2016-11-16 01:37 /user/hive/warehouse/txntest.db/txntab3/delta_0000011_0000011/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:37 /user/hive/warehouse/txntest.db/txntab3/delta_0000012_0000012
>> -rw-r--r--   3 nex37045 hive        644 2016-11-16 01:37 /user/hive/warehouse/txntest.db/txntab3/delta_0000012_0000012/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:44 /user/hive/warehouse/txntest.db/txntab3/delta_0000013_0000013
>> -rw-r--r--   3 nex37045 hive        644 2016-11-16 01:44 /user/hive/warehouse/txntest.db/txntab3/delta_0000013_0000013/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:45 /user/hive/warehouse/txntest.db/txntab3/delta_0000014_0000014
>> -rw-r--r--   3 nex37045 hive        644 2016-11-16 01:45 /user/hive/warehouse/txntest.db/txntab3/delta_0000014_0000014/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 02:02 /user/hive/warehouse/txntest.db/txntab3/delta_0000015_0000015
>> -rw-r--r--   3 nex37045 hive        644 2016-11-16 02:02 /user/hive/warehouse/txntest.db/txntab3/delta_0000015_0000015/bucket_00000
>> drwxrwxrwt   - admin    hive          0 2016-11-16 02:03 /user/hive/warehouse/txntest.db/txntab3/delta_0000015_0000016
>> -rw-r--r--   3 admin    hive        531 2016-11-16 02:03 /user/hive/warehouse/txntest.db/txntab3/delta_0000015_0000016/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 02:03 /user/hive/warehouse/txntest.db/txntab3/delta_0000016_0000016
>> -rw-r--r--   3 nex37045 hive        644 2016-11-16 02:03 /user/hive/warehouse/txntest.db/txntab3/delta_0000016_0000016/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 20:37 /user/hive/warehouse/txntest.db/txntab3/delta_0000017_0000017
>> -rw-r--r--   3 nex37045 hive     156395 2016-11-16 20:37 /user/hive/warehouse/txntest.db/txntab3/delta_0000017_0000017/bucket_00000
>> drwxrwxrwt   - admin    hive          0 2016-11-16 20:40 /user/hive/warehouse/txntest.db/txntab3/delta_0000017_0000019
>> -rw-r--r--   3 admin    hive    2598250 2016-11-16 20:40 /user/hive/warehouse/txntest.db/txntab3/delta_0000017_0000019/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 20:39 /user/hive/warehouse/txntest.db/txntab3/delta_0000018_0000018
>> -rw-r--r--   3 nex37045 hive       4737 2016-11-16 20:39 /user/hive/warehouse/txntest.db/txntab3/delta_0000018_0000018/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 20:39 /user/hive/warehouse/txntest.db/txntab3/delta_0000019_0000019
>> -rw-r--r--   3 nex37045 hive     192658 2016-11-16 20:39 /user/hive/warehouse/txntest.db/txntab3/delta_0000019_0000019/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 20:45 /user/hive/warehouse/txntest.db/txntab3/delta_0000020_0000020
>> -rw-r--r--   3 nex37045 hive     192835 2016-11-16 20:45 /user/hive/warehouse/txntest.db/txntab3/delta_0000020_0000020/bucket_00000
>> drwxr-xr-x   - nex37045 hive          0 2016-11-16 20:46 /user/hive/warehouse/txntest.db/txntab3/delta_0000021_0000021
>> -rw-r--r--   3 nex37045 hive     201206 2016-11-16 20:46 /user/hive/warehouse/txntest.db/txntab3/delta_0000021_0000021/bucket_00000
>>
>>
>>
>>
>> Thanks,
>>
>> Manoj
>>
>>
>
